Cornell Language and Technology

exploring how technologies affect the way we talk, think and understand each other

Tuesday, April 04, 2006

Assignment #8 - Option 2

According to Kraut et al., people use visual information to maintain situational awareness and to support conversational grounding. Situational awareness means seeing others' bodies and their environment, and using that information to keep up-to-date knowledge of the task's status. Conversational grounding refers to the common ground of knowledge shared by the speakers, which visual information also helps establish.

Kraut et al. first acknowledge that it is too difficult for video systems to relay all of the visual information available in face-to-face settings. Instead, they aim to pinpoint the specific visual cues required to perform group tasks, acting on the assumption that if those important cues are presented through video, the group task will be more likely to succeed. They chose bicycle repair for their experiments, an activity that falls into the category of a mentored task: one person performs the physical actions while guided by the speech of another. Kraut et al. divided the visual information in the experiments into the categories of heads/faces, bodies/actions, task objects, and environment. They observed how subjects used these categories to keep track of task status and others' actions, identify the focus of others' attention, communicate successfully and quickly, and monitor others' understanding.

Kraut et al. address the problems with current "talking heads" video systems - videoconferencing that provides only the heads/faces cue category. They suggest that this limited visual information forces people in videoconferences to use the same conversational strategies they would use on the telephone. With this in mind, their experiment compares task performance across three media: face-to-face, audio only, and videoconferencing with a view of hands, actions, objects, and environment, but not heads.

The conclusion from both experiments was that while the video system did not affect final task performance, it did facilitate understanding between the subjects. My questions about the experiments are the following: Why did the experimenters initially decide that helpers could only view objects within the worker's field of vision, rather than providing a fixed camera in the room? Why might factors such as eye gaze be unimportant for conversational grounding? Do you believe further research could test different camera views in videoconferencing to determine which works best?

1 Comment:

At 12:05 AM, Blogger H said...

I think another interesting thing the experimenters could have tried is using visual cues alone. That is, let the helper see (via video) what the worker is doing, but mute the sound, while still allowing the helper to assist the worker by talking to him/her on a phone. This way, instead of measuring what benefits visual cues add on top of audio cues, they'd be studying the advantages of visual cues versus the advantages of audio cues. I think the results could be quite interesting.

 
