11 January 1998

Activity theory and Cognitive Sciences

By Henrry Rodriguez, henrry@nada.kth.se

Introduction

In this paper I will compare two approaches in the field of Human-Computer Interaction: Activity theory and Cognitive science. I will not give a very detailed description of either approach, because this is the term paper for the course 'Activity theory in HCI' given on Nov. 13 and Nov. 20 by Victor Kaptelinin.

If we want to understand what a person does, we first have to know the context that person is in. This brings to mind a technique often used by film makers: they open the film with a scene that has no meaning for the audience yet, and that only fills them with questions and curiosity. The film makers are simply capturing the audience's attention. The rest of the film then explains the global context in which that first action took place. This simple example illustrates how important it is to understand the context in which any action takes place. If we understand the context, we can form an idea of which actions can be performed and in which sequence.

In the field of HCI, knowing this context is very important, but a good description of it is always hard to obtain. In attempting to describe the context we can include a lot of information that is not relevant for the design of an interface, and we can just as easily fail to recognize a situation that is very important and that could change the context, because we do not give it the weight it deserves.

The combination of psychology, ergonomics, and computer technology has generated an area of interdisciplinary knowledge known as 'Human-Computer Interaction' (HCI). There are guidelines to assist designers, who no longer have to rely on guesswork or personal experience and expertise to decide between possibilities.

Activity Theory

Activity theory originated in the former Soviet Union as part of the cultural-historical school of psychology founded by Vygotskij, Leontjev and Lurija. The theory is a philosophical framework for studying different forms of human praxis as developmental processes, with the individual and the social level interlinked.

In Activity theory the unit of analysis is an activity, which is composed of subject, object, actions, and operations. A subject is a person or a group engaged in an activity. An object is held by the subject and motivates activity. 'Behind the object there always stands a need or a desire, to which [the activity] always answers.'

Activity theory has five principles:

  1. Object-orientedness
    Actions are goal-directed processes that must be undertaken to fulfil the object. They are conscious because one holds a goal in mind, and different actions may be undertaken to meet the same goal. Every activity is directed toward something that objectively exists in the world, which is an object. The notion of object is not limited in Activity theory to the physical, chemical, and biological properties of entities. Socially and culturally determined properties are also objective properties that can be studied with objective methods. Objects can be transformed in the course of an activity, but they do not change on a moment-by-moment basis. There is some stability over time, and changes in objects are not trivial: they can change the nature of an activity. Human activity is guided by anticipation. This anticipation is the motive of the activity, the goal of the action, and the orienting basis of the operation, respectively. The anticipation of future events is the fundamental principle of anticipatory reflection as developed by Anokhin. The classical example of anticipatory reflection is Anokhin's rethinking of Pavlov's discovery of the conditioned reflex: when a dog salivates in response to the ringing of a bell, it is not because saliva is needed to digest the bell but because the dog anticipates that food, which has to be digested, will appear in the future. When the activity is performed there is a feedback mechanism which compares the result of the activity with the prediction, and any incongruence (i.e. a breakdown) gives rise to a learning situation (i.e. the experience of the person is expanded).

  2. Hierarchical structure of activity
    Actions are similar to what is often referred to in the HCI literature as tasks. According to Leontjev, interaction between human beings and the world is organized into functionally subordinated hierarchical levels, of which there are three: activities, actions, and operations. Each action performed by a human being has not only intentional aspects but also operational aspects, and the most fundamental level of operation is an adaptation to the physical aspects of the user interface. Operations become routinized and unconscious with practice, and they depend on the conditions under which the action is being carried out. That means that operations are oriented in the world by a non-conscious orienting basis. This orienting basis is established through experience with the concrete material conditions for the operation, and is a system of expectations about the execution of each operation that controls the operation in the process of the activity. Activity theory holds that the constituents of activity are not fixed but can change dynamically as conditions change. All levels can move up and down. An operation can become an action when 'conditions impede an action's execution through previously formed operations'. For Activity theory objects remain fixed, but goals, actions, and operations change as conditions change, which gives Activity theory its flexibility.

  3. Internalization/Externalization
    B.F. Lomov points out that any activity has an internal and an external side, and that the two are related to each other without any gap. The division of activity into internal and external is really artificial. Any external activity is supported by processes that originate inside the subject, and internal processes appear in one way or another in the external world. The most important thing for Activity theory should be not to make this artificial division and then find out how the two sides are related, but to discover the 'internal side' while studying the 'external side' of the activity. According to Vygotskij, internalization is social by its very nature. The range of actions that a person can perform in cooperation with others comprises the so-called 'zone of proximal development.' Externalization is the opposite of internalization. Mental processes manifest themselves in external actions performed by a person, so they can be verified and corrected if necessary. Activity theory emphasizes that it is not just mental representation that gets placed in someone's head; it is holistic activity, including motor activity and the use of artifacts.

  4. Mediation
    Human activity is mediated by a number of tools, both external and internal. Artifacts, broadly defined to include instruments, signs, language, and machines, mediate activity and are created by people to control their own behavior. Artifacts carry with them a particular culture and history and are persistent structures that stretch across activities through time and space.

  5. Development
    Activity theory requires that human interaction with reality should be analysed in the context of development. The activity itself is the context. Context is constituted through the enactment of an activity involving people and artifacts. Context is both internal to people and at the same time external. The crucial point is that in Activity theory external and internal are fused, unified.
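
The movement between levels described under principle 2 can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the names `Level`, `Unit`, `conditions_impede`, and `routinize` are hypothetical and do not come from the Activity theory literature, and the typing example is invented for this purpose.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    OPERATION = 1  # routinized and unconscious, shaped by conditions
    ACTION = 2     # conscious, directed at a goal
    ACTIVITY = 3   # directed at an object that motivates it


@dataclass
class Unit:
    """One element of an activity, placed at a level of the hierarchy (hypothetical model)."""
    name: str
    level: Level

    def conditions_impede(self) -> None:
        # When conditions block a routinized operation, it moves up
        # and is carried out as a conscious action again.
        if self.level is Level.OPERATION:
            self.level = Level.ACTION

    def routinize(self) -> None:
        # With practice, an action moves down and becomes an operation.
        if self.level is Level.ACTION:
            self.level = Level.OPERATION


# Invented example: typing is an operation for a practised user.
typing = Unit("typing a word", Level.OPERATION)
typing.conditions_impede()  # e.g. the user meets an unfamiliar keyboard layout
```

After the call to `conditions_impede`, the unit sits at the action level; calling `routinize` would move it back down, which mirrors the statement that levels can move in both directions.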

Cognitive Science

What is Cognitive Science?

Cognitive Science is a rapidly expanding field of study aimed at understanding the mental processes that underlie cognitive abilities. The questions asked by Cognitive Science are not new. Philosophers, Psychologists, Linguists, Neuroscientists and Computer Scientists have all approached the basic questions posed by the nature of mental processes in their own ways as part of the broader endeavours of their respective fields. Cognitive Science is distinguished from these traditional disciplines by its highly interdisciplinary approach. Its defining technique is to bring expertise gained from the related disciplines to bear on a set of common questions: What are the basic components of cognitive processes? Are they subsumed by a common mental mechanism? What is the relationship between the physical apparatus and cognition? To answer these questions Cognitive Scientists engage in empirical studies aimed at assessing their formal and computational models of various aspects of cognition. The sorts of areas investigated include the information-acquisition and information-processing mechanisms underlying cognitive abilities like perception, recognition, information storage and information retrieval, language acquisition, comprehension and production, concept acquisition, problem solving, and reasoning.

Since the Seventeenth Century, the development of a unified science of the mind has been frustrated by the fact that questions about perception, thought, memory, imagination, language comprehension and learning, and other mental phenomena fall under the purview of several distinct sciences, each with its own methodology, conception of explanation, and preferred set of explanatory models. Until recently, most psychologists, philosophers, computer scientists, linguists, and neurobiologists have been content to pursue these questions in relative isolation, awaiting, it seems, the arrival of some modern-day Newton of the mind. In the last two decades, however, the gradual emergence in each of these disciplines of some version of the view that mental phenomena can be fruitfully understood as operations on symbolic representations and that the mind is thus, in some sense or other, an information processor, has made possible a truly interdisciplinary approach, cognitive science, that holds the promise of being the long sought unified science of the mind.

Cognitive science is not really new: the phenomena of thought and language have been points of interest for philosophers and scientists for a very long time. Cognitive science needs to be distinguished from cognitive psychology, which is the branch of traditional psychology dealing with cognition. Although cognitive psychology constitutes a substantial part of what is seen as Cognitive science, it follows specific methodological principles that limit its scope.

In the field of Cognitive science there are several approaches to modelling the interaction between a user and a device. GOMS (Goals, Operators, Methods, and Selection rules) and the production system model are known as 'process' models because they attempt to supply a simple model of the mental processes involved in using an interface, including remembering items, starting a new subgoal, etc. They yield predictions which are less quantitative in nature, based perhaps on how many items must be simultaneously retained in working memory or on other measures. Both models characterize the knowledge necessary to perform routine tasks like text editing.

The GOMS Model

The GOMS model represents a user's knowledge of how to carry out routine skills in terms of goals, operations, methods, and selection rules. GOMS describes the operation of the interface in terms of a 'state space.' The user's goal is to achieve a particular state; each available operator takes the user to the same state or a new state, in which different operators will be available.

Goals represent a user's intention to perform a task, a subtask, or a single cognitive or a physical operation. Goals are organized into structures of interrelated goals that sequence cognitive operations and user actions.

Operations characterize elementary physical actions (e.g., pressing a function key or typing a string of characters), and cognitive operations not analysed by the theory (e.g., perceptual operations, retrieving an item from memory, or reading a parameter and storing it in working memory).

A user's knowledge is organized into methods which are subroutines. Methods generate sequences of operations that accomplish specific goals or subgoals. The goal structure of a method characterizes its internal organization and control structure.

Selection rules specify the conditions under which it is appropriate to execute a method to effectively accomplish a specific goal in a given context. They are compiled pieces of problem solving knowledge. They function by asserting the goal to execute a given method in the appropriate context.
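
The four components described above can be sketched as a small data model. This is a minimal illustration in Python, not a definitive GOMS implementation: the names `Method`, `SelectionRule`, and `select_method`, the 'delete-word' goal, and the operator strings are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An operator is an elementary action; here it is simply named by a string.
Operator = str
Context = Dict[str, object]


@dataclass
class Method:
    """A subroutine: a sequence of operators that accomplishes one goal."""
    name: str
    goal: str
    operators: List[Operator]


@dataclass
class SelectionRule:
    """Proposes a method for a goal when its condition holds in the context."""
    goal: str
    condition: Callable[[Context], bool]
    method: Method


def select_method(goal: str, context: Context, rules: List[SelectionRule]) -> Method:
    """Return the method of the first selection rule that applies."""
    for rule in rules:
        if rule.goal == goal and rule.condition(context):
            return rule.method
    raise LookupError(f"no method applies to goal {goal!r}")


# Invented text-editing example with two methods for the same goal.
with_keys = Method("delete-with-keys", "delete-word",
                   ["move-cursor-to-word", "press-delete-word-key"])
with_mouse = Method("delete-with-mouse", "delete-word",
                    ["move-hand-to-mouse", "double-click-word", "press-delete"])

rules = [
    SelectionRule("delete-word", lambda ctx: bool(ctx.get("hands_on_keyboard")), with_keys),
    SelectionRule("delete-word", lambda ctx: not ctx.get("hands_on_keyboard"), with_mouse),
]

chosen = select_method("delete-word", {"hands_on_keyboard": True}, rules)
```

In this sketch the selection rules carry the context-dependent knowledge, while each method stays a fixed sequence of operators, which is one simple way to read the division of labour described in the text.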

Content and Structure of a User's Knowledge

The GOMS model assumes that execution of a task involves decomposition of the task into a series of subtasks. A skilled user has effective methods for each type of subtask. Accomplishing a task involves executing the series of specialized methods that perform each subtask. There are several kinds of methods. High-level methods decompose the initial task into a sequence of subtasks. Intermediate-level methods describe the sequence of functions necessary to complete a subtask. Low-level methods generate the actual sequence of user actions necessary to perform a function.

A user's knowledge is a mixture of task-specific information, the high-level methods, and system-specific knowledge, the low-level methods. The knowledge captured in the GOMS representation describes both general knowledge of how the task is to be decomposed as well as specific information on how to execute functions required to complete the task on a given system.

Cognitive Complexity Theory

Kieras and Polson (1985) propose that the knowledge represented in a GOMS model be formalized as a production system. Selection of production systems as a vehicle for formalizing this knowledge was theoretically motivated. Newell and Simon (1972) argue that the architecture of the human information processing system can be characterized as a production system. Since then, production system models have been developed for various cognitive processes (problem solving: Simon, 1975; Karat, 1983; text comprehension: Kieras, 1982; cognitive skills: Anderson, 1982).

An Overview of Production System Models

A production system represents the knowledge necessary to perform a task as a collection of rules. A rule is a condition-action pair of the form
IF (condition) THEN (action)
where the condition and action are both complex. The condition represents a pattern of information in working memory that specifies when a physical action or cognitive operation represented in the action should be executed. The condition includes a description of an explicit pattern of goals and subgoals, the state of the environment (e.g., prompts and other information on a CRT display), and other needed information in working memory.
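
The condition-action form above can be illustrated with a very small interpreter. The sketch below is an assumption-laden simplification in Python: working memory is modelled as a plain set of strings, a condition is a set of items that must all be present, and the names `Rule`, `swap`, and `run` as well as the two example rules are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set


@dataclass
class Rule:
    name: str
    condition: FrozenSet[str]            # items that must all be in working memory
    action: Callable[[Set[str]], None]   # changes working memory in place


def swap(old: str, new: str) -> Callable[[Set[str]], None]:
    """Build an action that replaces one working-memory item with another."""
    def action(memory: Set[str]) -> None:
        memory.discard(old)
        memory.add(new)
    return action


def run(rules: List[Rule], memory: Set[str], max_cycles: int = 100) -> List[str]:
    """Repeatedly fire the first rule whose condition matches; return the names fired."""
    fired: List[str] = []
    for _ in range(max_cycles):
        match = next((r for r in rules if r.condition <= memory), None)
        if match is None:
            break  # no condition matches, so the system halts
        match.action(memory)
        fired.append(match.name)
    return fired


# Invented example: deleting a character in a text editor.
rules = [
    Rule("move-cursor",
         frozenset({"goal: delete character", "cursor elsewhere"}),
         swap("cursor elsewhere", "cursor at target")),
    Rule("press-delete",
         frozenset({"goal: delete character", "cursor at target"}),
         swap("goal: delete character", "character deleted")),
]

memory = {"goal: delete character", "cursor elsewhere"}
fired = run(rules, memory)
```

Each action here changes working memory so that its own condition no longer holds, which is what lets the cycle terminate; the `max_cycles` bound is only a safety net for rule sets that lack this property.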

Production Rules and the GOMS Model

A production system model is derived by first performing a GOMS analysis and then writing a program implementing the methods and control structures described in the GOMS model. Although GOMS models are better structural and qualitative descriptions of the knowledge necessary to perform tasks, expressing the knowledge and processes in the production system formalism permits the derivation of well-motivated, quantitative predictions for training time, transfer, and execution time for various tasks.

Kieras, Bovair, and Polson, among others, have successfully tested assumptions underlying these predictions. These authors have shown that the amount of time required to learn a task is a linear function of the number of new rules that must be acquired in order to successfully execute the task, and that execution time is the sum of the execution times for the rules that fire in order to complete the task. They have also shown that transfer of training can be characterized in terms of shared rules.
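
The three claims above (learning time linear in the number of new rules, execution time as a sum over fired rules, transfer as shared rules) can be written down directly. The following Python sketch only restates them under assumptions: the function names, the rule names, and every number (such as 30 seconds per rule) are hypothetical placeholders, not values reported by the cited authors.

```python
from typing import Dict, List, Set


def rules_to_learn(task_rules: Set[str], known_rules: Set[str]) -> Set[str]:
    """Transfer: rules shared with earlier tasks do not have to be learned again."""
    return task_rules - known_rules


def learning_time(new_rules: Set[str], seconds_per_rule: float,
                  base_seconds: float = 0.0) -> float:
    """Learning time as a linear function of the number of new rules."""
    return base_seconds + seconds_per_rule * len(new_rules)


def execution_time(fired_rules: List[str], rule_seconds: Dict[str, float]) -> float:
    """Execution time as the sum of the times of the rules that fire."""
    return sum(rule_seconds[name] for name in fired_rules)


# Invented rule sets for two hypothetical editors that share three rules.
editor_a = {"select-text", "cut", "paste", "save"}
editor_b = {"select-text", "cut", "paste", "print"}

new_for_b = rules_to_learn(editor_b, known_rules=editor_a)
time_from_scratch = learning_time(editor_b, seconds_per_rule=30.0)
time_after_a = learning_time(new_for_b, seconds_per_rule=30.0)
```

With these placeholder numbers, a user who already knows the first editor has only one new rule to learn for the second, so the predicted training time falls in proportion to the number of shared rules.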

Transfer of user skill

Research on transfer of user skills in human-computer interaction shows that it is possible to give a very precise theoretical characterization of large transfer effects, that is, reductions in training time on the order of three or four to one. These results strongly support the hypothesis that large transfer effects are due to explicit relationships between different tasks performed on the same system or related tasks performed on different systems. Existing models of the acquisition and transfer of cognitive skills enable us to provide precise theoretical descriptions of these transfer processes. These same models can in turn be used to design consistent user interfaces for a wide range of tasks and systems that will promote similar large reductions in training time and savings in training costs.

Discussion

People who work extensively with computers build up a repertoire of efficient, smooth, learned behaviours for carrying out their routine communicative activities. Yet the interaction is intensely cognitive. The skills are wielded within a problem-solving context, and the skills themselves involve the processing of symbolic information: the interaction always requires the interpretation of instructions, the formulation of sequences of commands, and the communication of these commands to the computer.

Susanne Bødker points out that the conditions that trigger a certain operation from the repertoire of operations are what we need to investigate in user interface design.

Terry Winograd points out that 'many difficult issues are raised by the attempt to relate programs to theory and to cognitive mechanism. Within the Cognitive science community, there is much debate about just what role computer programs have in developing and testing theories'. He says that Cognitive science will have important limitations in its scope and in its power to explain what we are and what we do.

Maturana (1970) says that 'Learning is not a process of accumulation of representations of the environment; it is a continuous process of transformation of behavior through continuous change in the capacity of the nervous system to synthesize it. Recall does not depend on the indefinite retention of a structural invariant that represents an entity (an idea, image, or symbol), but on the functional ability of the system to create, when certain recurrent conditions are given, a behavior that satisfies the recurrent conditions or that the observer would class as a reenacting of a previous one.'

In his book, Winograd uses a very simple example to show that it is impossible to establish a context-independent basis for circumscribing the literal use of a term, even one as seemingly simple as 'water':
A: Is there any water in the refrigerator?
B: Yes.
A: Where? I don't see it.
B: In the cells of the eggplant.

As we can see, both approaches (Activity theory on one side, GOMS and CTA on the other) try to provide a framework for the design of interfaces in the field of HCI. I will now try to explain some of their limitations.

For error-free behavior, a GOMS model provides a complete dynamic description of behavior, measured at the level of goals, methods, and operators. Given a specific task, this description can be instantiated into a sequence of operators. By associating times with each operator, such a model will make total time predictions. If these times are given as distributions, it will make statistical predictions. But, without augmentation, the model is not appropriate if errors occur. Yet errors exist in routine cognitive skilled behavior. Indeed, error rates may not even be small, in the sense of having negligible frequency, taking negligible time, or having negligible consequences. For skilled behavior the detection and correction of errors is mostly routine. It cannot be entirely routine, since the occurrence of rare types of errors for which the user is unprepared is always possible. But in the main, errors are quickly detected and result in additional time to correct the error. The final effect of the behavior remains relatively error-free, and the behavior can be characterized solely by the time to completion. Thus, errors can be converted to variance in operator times, so that GOMS theory can be applied to actual behavior at the price of degraded accuracy.

For a general treatment of errors and interruptions of the user, the hierarchical control structure of a GOMS model is inadequate; a more general control structure is required. The use of a stack-discipline GOMS model instead of a more general control structure, such as a production system (Newell and Simon, 1972), should be taken as an approximation especially appropriate for skilled cognitive behavior and preferred here because of its greater simplicity.

The very limited degree to which this analysis involves any psychological process model can be assessed from the amount of reasoning behind it. The analysis is based on two basic principles of psychology: firstly, that people act so as to attain their goals through rational action given the structure of the task, and secondly, that problem solving activity can be described in terms of a set of knowledge states, operators for changing states, and control knowledge for applying knowledge. Since 'Operators are elementary . . . Elementary processing acts, whose execution is necessary to change any aspects of the user's memory . . .' (Card, 1978, p 58), the model has the potential for processing operations.

On the other hand, GOMS uses only the knowledge in the design and produces absolute estimates of performance time: 'it will take 3.45 seconds for a skilled user to perform this task using this system'.
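
The kind of time prediction discussed in this section, including the idea that errors can be absorbed as variance in operator times, can be sketched as follows. This Python example is an illustration only: the operator names and all times are invented placeholders rather than measured values, and adding variances assumes that operator times are independent, which is an assumption of this sketch and not a claim from the sources.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class OperatorTime:
    mean: float      # seconds
    variance: float  # seconds squared


def predict(sequence: List[str], times: Dict[str, OperatorTime]) -> Tuple[float, float]:
    """Return (expected total time, standard deviation) for an operator sequence.

    Means always add; variances add only under the independence assumption.
    """
    total_mean = sum(times[op].mean for op in sequence)
    total_variance = sum(times[op].variance for op in sequence)
    return total_mean, sqrt(total_variance)


# Hypothetical operator times, chosen only to keep the arithmetic simple.
times = {
    "mental-prepare": OperatorTime(mean=1.5, variance=0.1875),
    "point": OperatorTime(mean=1.0, variance=0.0625),
    "keystroke": OperatorTime(mean=0.25, variance=0.0),
}

sequence = ["mental-prepare", "point", "keystroke", "keystroke"]
expected_seconds, spread_seconds = predict(sequence, times)
```

With zero variances the function reduces to the single absolute estimate quoted above; non-zero variances give the statistical prediction, at the cost of the degraded accuracy the text mentions.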

CTA is first and foremost a means of making relationships explicit between approximate knowledge representations and cognitive limitations on their mental processing. The characteristics and limitations are specified in terms of the properties of mental codes; restricted capabilities for coordinating and controlling processes which handle those codes; and more specific limitations such as recency and description effects in memory retrieval. The approach essentially provides a language in which such constraints can be specified. The language refers to processes and coded mental representations which can be described in terms of their attributes. In its present form, only a limited range of attributes and constraints are actually utilised. They can, however, be added to as further analyses of user performance provide additional empirical justification for extending that range.

In my view, Activity theory still has a lot to give to HCI. The most important point is that the elementary unit of study in Activity theory is the action. When we interact with a computer we perform a great many actions, so a framework in which action is the main object of study, as in Activity theory, should make this field much easier to study. Another very important aspect is that Activity theory is not rigid but flexible: it makes it possible to move from one level to another in both directions. When we work with different users (and as far as I know no system is built to be run by one specific person) we have to be very careful, because it is not easy to fit all these users into one frame and start to develop. That is why it is very important to have a flexible approach.

Nardi points out that the use of the Activity theory framework implies:

  1. A research time frame long enough to understand users' objects
  2. Attention to broad patterns of activity
  3. The use of a varied set of data collection techniques
  4. A commitment to understanding things from users' points of view.

If we use a model under the frame of Activity theory then we will have a model that:

In my opinion, what AT can offer the field of HCI is still being worked out, and only with practice will we find out whether working in this framework gives the field a common and general approach to its research. The answer will only come with time and with use of the framework. It offers a chance and a very wide set of possibilities, and I do not see any reason why we should not try new ways of doing research; in fact, that is what keeps science going.

References

Maturana, Humberto R. Biology of Cognition. 1970, p. 45.

Winograd, Terry. Understanding Computers and Cognition. 1987.

Card, Stuart K. The Psychology of Human-Computer Interaction. 1983.

Anderson, John R. Cognitive Psychology and Its Implications (4th edition). 1995.

Nardi, Bonnie A. Context and Consciousness. 1996.

Guindon, Raymonde. Cognitive Science and Its Applications for Human-Computer Interaction. 1988.

Van der Veer, Gerrit. Working with Computers: Theory versus Outcome. 1980.

Bardram, Jakob. 'Plans as Situated Action: An Activity Theory Approach to Workflow Systems'. In Hughes, J. et al. (Eds.), Proceedings of the Fifth European Conference on Computer Supported Cooperative Work, p. 17-32. 1997.

Bødker, Susanne. Through the Interface, 'Human Activity and Human-Computer Interaction', p. 49-51. 1991.

Shadrikov, V.D. Activity Psychology and Capacity of the Man. 1996.