
 

Editor's Notes

In our eighth issue of Usability News: 

If you have a research study that you would like to do, but don't have the time or resources to do it, please contact us. We will do the research for you.  As always, we thank you for your feedback and comments.

Usability News is distributed to over 2500 usability professionals, developers, managers, and researchers in over 59 countries. Contributions and suggestions for future issues should be directed to barbara.chaparro@wichita.edu


SURL Home Page:  www.surl.org
Usability News:  www.usabilitynews.org 
Designing for usability:  www.optimalweb.org 


Reading Online News: A Comparison of Three Presentation Formats

 By Ryan Baker, Michael Bernard, & Shannon Riley

With the ever-increasing movement toward online newsletters as a principal source of information, the Web offers opportunities as well as challenges that are unique to this environment. For instance, the traditional newspaper presents information within the confines of evenly spaced, gridded columns. This has worked quite well in the past, and readers have become accustomed to this style of information presentation. With the advent of the Web, however, it is now possible to distribute information across multiple pages connected by link titles, permitting online newsletters to initially present only a small amount of pertinent information through these links. That is, an online newsletter may need to present only links that give the reader a general idea of each article. This can reduce the amount of information clutter that the reader must initially wade through. Unfortunately, little is known about the most efficient, or the most preferred, way to present information in this medium. Accordingly, this study addressed the question of how information should be presented within a news-style web page. For example, should all the information related to an article be presented on one page, or should the newsletter present a page listing only link titles, with each full article on its own page? Moreover, if a newsletter presents initial information in the form of link titles, should it also provide supplementary information giving a general overview of the article along with each link title?

Method

A Pentium II-based personal computer with a 60 Hz, 96 dpi, 17-inch monitor set to a resolution of 1024 x 768 pixels was used. Content for the articles came from the New York Times website (www.nytimes.com). Participants’ performance was tracked using Ergobrowser™ software.

Participants

Twenty-one participants (5 males, 16 females) volunteered for this study. They ranged in age from 18 to 47, with a mean age of 26 (S.D. = 9 years). The median Web use for the participants was 7-14 hours per week (94% used the Web a few times per week or more).

Procedure  

Users were asked to locate specific information within news articles on three different layouts: full text (Full), link titles plus abstracts (Summary), or link titles only (Links).  Each of the layouts contained information on different domains (sports, health, and science). The Full condition presented twelve full articles on one page (see Figure 1). The Summary provided a short summary of approximately two to three sentences for each of the twelve articles on one page plus a linked title to the full article (see Figure 2). The Links condition provided just the linked titles for the twelve articles on one page (see Figure 3).

Participants searched for information within all three conditions. For each layout, they were presented with ten search tasks asking them to find specific information (for example, "What type of device is a Large Hadron Collider?") within one of the articles. After finding the information, participants highlighted the text they believed was correct with their cursor. If the answer was verified as correct, participants proceeded to the next search. Information had to be found within the five-minute time limit to be counted as correct. Participants were allowed to search using the links, as well as the “Forward” and “Back” buttons, until the time expired. The layouts, domains, and search terms were all counterbalanced using a Latin square design. The layouts were stored on a local server, allowing instant access to the pages in all conditions.
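As an illustration of the counterbalancing described above, a simple cyclic Latin square can be generated so that each layout appears once in every presentation position. This is only a sketch of the general technique, not the authors' actual procedure; the condition labels are taken from the text.

```python
def latin_square(conditions):
    """Cyclic Latin square: row i is the condition list rotated by i positions."""
    k = len(conditions)
    return [[conditions[(i + j) % k] for j in range(k)] for i in range(k)]

# Presentation orders for the three layout conditions
orders = latin_square(["Full", "Summary", "Links"])
for row in orders:
    print(row)
```

Participants are then assigned to rows in rotation, so that across every three participants each layout occupies each ordinal position exactly once.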


Figure 1. "Full" condition
 


Figure 2. "Summary" condition
 


Figure 3. "Links" condition 

After finishing all the questions for each condition, participants answered a satisfaction questionnaire. The questionnaire used a 6-point Likert scale, anchored with 1 = “Disagree” and 6 = “Agree.” The questionnaire items were: "The layout made it easy to find information," "This site was visually pleasing," "The arrangement of this site promotes comprehension," "I am satisfied with this site," and "The layout looks professional." After completing the questionnaire for all conditions, participants ranked the three layouts for general preference.

Results

A within-subjects ANOVA design was used to investigate participant performance (mean task completion time and search accuracy) and perceived ease of use for the three conditions. Preference among the three conditions was analyzed using a Friedman χ² test.
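The Friedman test can be computed directly from each participant's within-subject rankings of the layouts. The sketch below implements the standard (uncorrected, no-ties) Friedman χ² statistic on hypothetical rankings, not the study's actual data:

```python
def friedman_chi_square(rankings):
    """Friedman chi-square for a table of within-subject rankings.

    rankings: one row per participant; each row ranks the k conditions
    from 1 (most preferred) to k (least preferred), with no ties.
    """
    n = len(rankings)            # number of participants
    k = len(rankings[0])         # number of conditions
    rank_sums = [sum(row[j] for row in rankings) for j in range(k)]
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

# Hypothetical rankings of (Full, Summary, Links) for four participants
data = [[2, 1, 3], [2, 1, 3], [3, 1, 2], [2, 1, 3]]
print(friedman_chi_square(data))
```

When all participants rank the conditions identically, the statistic reaches its maximum of 2n for k = 3; when rankings cancel out, it falls to zero.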

Task Completion Time

Evaluation of the time (in seconds) taken to complete each of the tasks revealed no significant differences between the three conditions [F (2,40) = 1.007, p = .37] (S.D. Full = 326.07, S.D. Summary = 239.86, S.D. Links = 354.23; see Figure 4).


Figure 4. Mean Task Completion Time (in seconds)

Perceptions of Site Efficiency

Easy to Find Information

Significant differences were found in the perception that a particular condition made it easier to find information [F (2,40) = 4.966, p < .01]. Post hoc analysis indicated that participants perceived the Summary condition as making it easier to find information than the Full condition (see Figure 5).


Figure 5. Easy to Find Information (1 = Disagree and 6 = Agree)
 

Arrangement Promotes Comprehension  

Significant differences were also found for the perception that a particular layout promoted comprehension [F (2,40) = 6.321, p < .01], in that participants perceived the Summary condition as more conducive to comprehension than the Full condition (see Figure 6).


Figure 6. Arrangement Promotes Comprehension (1 = Disagree and 6 = Agree)
 

Satisfied with Site

Moreover, significant differences were found for participant satisfaction between the conditions [F (2,40) = 3.309, p < .05], in that users indicated they were more satisfied with the Summary site than the Full site (see Figure 7).


Figure 7. Satisfied with Site (1 = Disagree and 6 = Agree)

Looks Professional

Significant differences were also found for the perception that a particular condition looked professional [F (2,40) = 5.621, p < .05], in that the Summary condition was perceived as more professional-looking than the Full condition (see Figure 8).


Figure 8. Site Looks Professional (1 = Disagree and 6 = Agree)

Layout Preference  

Four participants chose the Full condition as their first preference, fifteen selected the Summary condition, and two selected the Links condition (see Figure 9).


Figure 9. Site Preference (participants ranking site as their first choice)

Discussion  

Overall, there were no statistically significant differences in search time across the three presentation types. However, the Summary condition was perceived most positively in terms of ease of finding information, visual appeal, promoting comprehension, overall satisfaction, and professional appearance. The Summary condition was also the most preferred. The Full condition was the least preferred and drew the most negative perceptions: it was seen as the most difficult for finding information, the least conducive to comprehension, the least visually pleasing, and the least satisfying.

Participants reported that they preferred the Summary condition over the Links only condition because the brief summaries accompanying the headline links often guided them to the information they were searching for. Participants commented that, in the Links condition, they sometimes felt as if they were "jumping blindly" into the article. Several participants also reported that they did not like having to scroll through all of the articles in the Full condition. This study suggests that providing a small amount of information about an article on a page is superior to having long, scrolling pages filled with articles.

Reference

Ergobrowser™, Ergosoft Laboratories © 2001


Examining the Effects of Hypertext Shape on User Performance

By Michael L. Bernard1,2

Studies examining the depth and breadth of hypertext structures have consistently found that increasing depth correspondingly decreases search efficiency. This is typically reflected in increased user search time, disorientation, and error, along with reduced satisfaction (e.g., Jacko & Salvendy, 1996; Kiger, 1984; Larson & Czerwinski, 1998; Snowberry, Parkinson, & Sisson, 1983; Zaphiris, 2000). For example, Snowberry et al. (1983) examined four structures: 64 menu item choices on a single level (64); four menu items per level at a depth of three levels (4 x 4 x 4); eight menu items per level at a depth of two levels (8 x 8); and binary menu items at a depth of six levels (2 x 2 x 2 x 2 x 2 x 2). They found that as hypertext depth increased from one to six levels, the rate of user errors rose from 4.0% to 34%. Likewise, Kiger (1984) found that increasing depth from two to six levels increased the user error rate from 2.2% to 12.5%. Participants have also favored the shallowest structures over the deepest ones (Norman, 1990). Numerous researchers have therefore recommended that design emphasis be placed on reducing the overall depth of a hierarchy by correspondingly increasing the overall breadth of menu items.

However, because there is a practical limit to the degree of breadth (as well as depth) within a hypertext, compromises must take place within the design of a structure. To do this, depending on its contents, most hypertexts have several levels of depth with varying degrees of breadth for each level (i.e., having generally expanded breadths at some levels, while having constricted breadths at other levels). Consequently, it is more relevant to hypertext design to concurrently examine the overall shape of a structure by assessing its breadth at each hierarchical level.

Unfortunately, very few studies have examined the ‘shape’ of hypertext structures by varying the amount of breadth over several levels of depth. The most notable exception is a study conducted by Norman and Chin (1988), which examined hypertext shape by assessing five tree structures of varying breadths while keeping the depth invariant at four levels. The five structures were: a constant-breadth (4 x 4 x 4 x 4) structure, comprising four menu item choices at each level; a decreasing (8 x 8 x 2 x 2) structure, with eight menu items in the first and second levels and two in the third and fourth; an increasing (2 x 2 x 8 x 8) structure, with two menu items in the first and second levels and eight in the third and fourth; a concave (8 x 2 x 2 x 8) structure, with eight items in the first and fourth levels and two in the second and third; and a convex (2 x 8 x 8 x 2) structure, with two items in the first and fourth levels and eight in the second and third.

Participants were instructed to search a simulated electronic commerce hypertext for either explicitly named items, or items implied in a scenario situation in which they were to find the most appropriate answer—such as to search for a gift item “for a pilot always on the go with time tables to meet.” In searching for explicitly named items, search times were similar across all structures. However, for implicit targets the different shapes did have a significant effect on participants’ search time and search performance. For these targets, participants using the concave (8 x 2 x 2 x 8) structure took less time, searched fewer nodes to find the target items, and used fewer ‘Back’ commands, suggesting a lower degree of disorientation than the other menu tree shapes. For explicit targets, the increasing (2 x 2 x 8 x 8) structure facilitated slightly less navigational disorientation than the other structures.

The convex (2 x 8 x 8 x 2) structure produced the poorest performance for implicit targets. Participants using this structure had slower search time, searched more nodes (web pages) than the other structures, and used the Back command significantly more often than the concave structure.

Yet, as interesting as the Norman and Chin (1988) study is, it did not address the important interplay between hypertext shape and depth. To address this issue, the present study examined six different hypertext shapes at different levels of depth. Moreover, unlike the method used by Norman and Chin, where the number of terminal-level nodes remained invariant (256 nodes) across conditions while the total number varied from 294 to 456 nodes, this study sought to have an approximately equal number of nodes across hypertext conditions, since smaller structures are generally easier to search than large ones. The general questions asked were: 1) which factor, hypertext breadth or depth, has the greatest effect on user performance, and 2) which type of shape promotes the greatest user performance when general hypertext size remains relatively constant?

Method

Participants

One hundred and twenty undergraduate students between 18 and 52 (mean = 22) years of age volunteered to participate in this study. Participants reported using the Web at least once per month (96.7 percent reported using the Web a few times per month or more and 79.8 percent reported using the Web two or more hours per week). No significant differences in reported computer usage and anxiety, as well as Web use, were found among participants within the different conditions.

Experimental Task

Participants were assigned to one of six hypertext conditions. Both the search tasks and the hypertext conditions were ordered by means of a Latin square design. Each task required participants to search the presented hypertext for the specific merchandise item that would most appropriately satisfy the context of the search task. The total number of nodes was approximately equal across conditions (condition totals ranged from 330 to 344 nodes).

All participants were given the same search tasks (24 total), which resembled typical directed browsing tasks that reflected ‘real-world’ hypertext searches. Each task scenario had only one intended target, which was a terminal node. Similar to Norman and Chin (1988), search tasks were both explicit and implicit in nature (12 explicit and 12 implicit tasks).

The hypertext structures consisted of constant, decreasing, increasing, concave, and variable shapes. The depth and breadth of the hypertext conditions varied from a depth of two to six levels and a breadth of two to 27 menu items per node. The hypertext conditions are as follows: (12 x 27), (11 x 5 x 5), (4 x 4 x 4 x 4), (6 x 2 x 2 x 12), (3 x 2 x 2 x 2 x 12), and (2 x 3 x 2 x 3 x 2 x 3). The structural layout of each hypertext condition is presented in Figure 1.

 
The (12 x 27) structure. Each terminal node contained 27 menu items. (336 total items).

The (11 x 5 x 5) structure. Each terminal node contained 5 menu items (341 total items).

The (4 x 4 x 4 x 4) structure. Each terminal node contained 4 menu items (340 total items).

The (6 x 2 x 2 x 12) structure. Each terminal node contained 12 menu items (330 total items).

The (3 x 2 x 2 x 2 x 12) structure. Each terminal node contained 12 menu items (333 total items).

The (2 x 3 x 2 x 3 x 2 x 3) structure. Each terminal node contained 3 menu items (344 total items).

 

Figure 1. The structural representations of each hypertext condition.
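The total item counts reported for each structure follow directly from the shape notation: the number of items at each level is the product of the breadths down to that level, and the total is the sum over levels. A small sketch reproducing those totals:

```python
def total_menu_items(breadths):
    """Total menu items in a hierarchy given its per-level breadths.

    Level 1 holds breadths[0] items; each deeper level multiplies the
    item count of its parent level by its own breadth.
    """
    total, nodes = 0, 1
    for b in breadths:
        nodes *= b          # items at this level
        total += nodes
    return total

shapes = {
    "12 x 27": (12, 27),
    "11 x 5 x 5": (11, 5, 5),
    "4 x 4 x 4 x 4": (4, 4, 4, 4),
    "6 x 2 x 2 x 12": (6, 2, 2, 12),
    "3 x 2 x 2 x 2 x 12": (3, 2, 2, 2, 12),
    "2 x 3 x 2 x 3 x 2 x 3": (2, 3, 2, 3, 2, 3),
}
for name, breadths in shapes.items():
    print(name, total_menu_items(breadths))
```

Running this yields 336, 341, 340, 330, 333, and 344 items, matching the condition totals above and confirming that the six structures are of nearly equal size.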

Dependent Variables

The dependent variables were search efficiency and search time (the time taken to find the correct information). Search efficiency was measured by examining the number of deviations from the optimal path and the total number of back-page presses, or ‘commands,’ taken to reach the targeted node. The optimal path is the pre-established shortest route to the targeted node that satisfies a search task. Deviations from this path consist of unintended detours, which indicate navigational disorientation. This was measured per search task as the total number of pages accessed in reaching the target node minus the minimum or ‘ideal’ amount of paging needed to reach that node in the respective hypertext condition (total pages accessed minus total pages needed to acquire the relevant nodes).
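As a sketch of that scoring (the log format and page names here are hypothetical, not the authors' instrumentation), the deviation score for one task is simply the number of pages actually visited minus the length of the optimal path, with back-page presses tallied separately:

```python
def score_search(visited_pages, optimal_path, back_presses):
    """Score one search task for the two efficiency measures.

    visited_pages: ordered page ids accessed before reaching the target
    optimal_path:  shortest page sequence from the home page to the target
    back_presses:  number of 'Back' commands used during the search
    """
    deviations = len(visited_pages) - len(optimal_path)
    return {"deviations": deviations, "back_presses": back_presses}

# A participant takes a two-page detour on a three-page optimal route
result = score_search(
    ["home", "sports", "home", "gifts", "watches"],  # hypothetical log
    ["home", "gifts", "watches"],
    back_presses=1,
)
```

A score of zero deviations means the participant followed the ideal route exactly; larger scores indicate increasing navigational disorientation.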

Materials

A Pentium II-based PC with a 60 Hz, 96 dpi, 17-inch high-resolution RGB monitor set to a resolution of 1024 x 768 pixels was used. The operating system was Microsoft Windows XP. The website was stored locally in order to ensure equal download time for all searches.

Procedure

Participants were assigned to one of six hypertext conditions. Each participant was then given, one at a time, randomly assigned explicit and implicit search tasks. They searched the hypertext until they found the most appropriate answer to the task statement. To search, participants could use the menu item links located on each page, or the ‘Forward’ and ‘Back’ buttons located on the browser’s menu bar. They could also select a ‘homepage’ link located at the top left of the screen to return to the parent page at any time during the search. If a participant selected an item that was not the designated target, he or she was informed that the item was ‘incorrect’ and instructed to search again for the item that best satisfied the search task. For each task, participants searched until they found the correct information or until the allotted time (5 minutes) expired.

Results

Comparison of Explicit and Implicit Task Types

A 2 x 3 MANOVA was used to examine the implicit and explicit task scores for search time and search efficiency (deviations from the optimal path and back-page commands). The results revealed significant differences in performance between the two task types, in that participants had faster search times and higher search performance on explicit tasks than on implicit tasks [F (1, 113) = 47.05, p < .001], which is consistent with the results of Norman and Chin (1988). Task type did not, however, significantly interact with condition (p = .35). Therefore, the hypertext conditions were examined across both implicit and explicit tasks.

Comparing Hypertext Conditions

Analysis of participants’ search efficiency indicated a significant difference between hypertext conditions for deviations from the optimal navigational path and for the total number of back-page commands, respectively [F (5, 114) = 30.08, p < .001; F (5, 114) = 48.61, p < .001], as well as for search time [F (5, 114) = 30.08, p < .001]. Post hoc analysis revealed that the (12 x 27) and (11 x 5 x 5) conditions had significantly fewer deviations from the optimal navigational path and less back-page usage than all but the (6 x 2 x 2 x 12) structure. The (2 x 3 x 2 x 3 x 2 x 3) condition had significantly greater navigational deviations than all other conditions (see Figures 2 and 3).

Analysis of search time also revealed that both the (12 x 27) and the (11 x 5 x 5) conditions had significantly faster search times than all but the (6 x 2 x 2 x 12) structure. The (2 x 3 x 2 x 3 x 2 x 3) condition had significantly longer search times than all other conditions (see Figure 4). Together, the measurements of search efficiency and search time suggest that the (12 x 27) and (11 x 5 x 5) conditions were more informationally accessible than all but the (6 x 2 x 2 x 12) structure.


Figure 2. Deviations from optimal path (per search task)
 


Figure 3.  Number of back-page commands (per search task)
 


Figure 4.  Search time (per search task)

Discussion

Predicting search efficiency 

The results of this study paint a more complex picture of hypertext performance than has previously been observed. That is to say, with regard to hypertext structure, depth alone may not be the sole, or even the greatest, determinant in predicting search performance. In fact, as this study has shown, the shape of a hypertext structure had at least as much to do with search efficiency as its depth. Indeed, the (4 x 4 x 4 x 4) structure was found to be less efficient not only than a hypertext shape of the same depth (the (6 x 2 x 2 x 12) structure), but also than structures that were deeper, such as the (3 x 2 x 2 x 2 x 12) structure. As discussed, much has been said about hypertext depth, in that the greater the depth, the less informationally efficient the structure should be (e.g., Jacko & Salvendy, 1996; Snowberry et al., 1983). However, what seems to be occurring in this study is that participants’ search efficiency was, at least in part, determined by properties related to the overall shape of the hypertext structure. These properties act either to facilitate or to impede hypertext efficiency by altering the general complexity of the structure. Accordingly, an inefficient shape will decrease a hypertext’s search efficiency.

The results of this study also support the findings of Jacko and Salvendy (1996), Snowberry et al. (1983), and others, which indicate that broader, shallower hypertext structures are, on the whole, superior in search efficiency to narrower, deeper structures. Namely, broad categorical groupings, as seen in the (12 x 27) condition and, to a lesser extent, the (11 x 5 x 5) condition, facilitated navigation with fewer deviations from the optimal path and fewer back-page commands, as well as faster search times, than the narrower and deeper structures.

Hypertext Shape Efficiency

Moreover, together with the findings of Norman and Chin (1988), it is asserted that concave shapes (i.e., (6 x 2 x 2 x 12)) are more navigationally efficient than relatively constant shapes (i.e., (4 x 4 x 4 x 4)) of the same size and depth. Norman and Chin argue that the concave structure is an optimal design because having a larger percentage of more clearly defined descriptor items at the beginning of a search helps the user form a more exact match between the concept related to the target item and the actual target item itself. At the terminal level, broad menus reduce overall information uncertainty, since at this level the target items are more explicitly defined. This notion is also associated with the concept of information scent: the more explicit the association between the initial descriptor items and the targeted terminal items, the greater the scent. For this reason, a search at the terminal level should benefit from a maximum amount of scent.

Conversely, nodes in the middle level of the tree structure are, in part, designed to ultimately direct the user to the terminal level to access the targeted item. It is argued by Norman and Chin (1988) that the breadth of items at this level should be constricted, and limited to only directing the user to the appropriate target item. Thus, it is maintained that broad menus at the middle level will only increase the likelihood of users’ choosing the wrong path to the target item. This was found to be also true in the present study.

Implications for Website Design

Typically, when individuals evaluate the information accessibility of a website, the evaluation mostly involves an appraisal of the site’s physical interface. However, as discussed above, a site’s general shape can play just as important a role, or an even more important one, in its overall information accessibility, especially for large structures. Furthermore, assessing depth alone may not provide a truly accurate prediction of accessibility. That is to say, an inefficient shape (i.e., a structure with a large percentage of menu choices within the center of the structure) with a relatively shallow depth might be less informationally accessible than a more efficient shape at greater depths. From these results, as well as Norman and Chin’s (1988) study, it is further suggested that websites with several or more levels of depth give the user the greatest number of choices at both the top and terminal levels of the site, while constricting the choices between these levels.

1Note: This research is part of a larger dissertation study that examined a metric for predicting the accessibility of information within hypertext structures.

References

Jacko, J. A., & Salvendy, G. (1996). Hierarchical menu design: Breadth, depth, and task complexity. Perceptual and Motor Skills, 82, 1187-1201.

Kiger, J. I. (1984). The depth/breadth tradeoff in the design of menu-driven interfaces. International Journal of Man-Machine Studies, 20, 201-213.

Larson, K. & Czerwinski, M. (1998). Web page design: Implications of memory, structure and scent from information retrieval. Proceedings of the Association for Computing Machinery’s CHI ’98, 18-23.

Norman, K. L. (1990). The psychology of menu selection: Designing cognitive control of the human/computer interface. Norwood, NJ: Ablex Publishing Co.

Norman, K. L., & Chin, J. P. (1988). The effect of tree structure on search performance in a hierarchical menu selection system. Behaviour and Information Technology, 7, 51-65.

Snowberry, K., Parkinson, S., & Sisson, N. (1983). Computer display menus. Ergonomics, 26, 699-712.

Zaphiris, P. (2000). Depth vs. breadth in the arrangement of Web links. Proceedings of the 44th Annual Meeting of the Human Factors and Ergonomics Society, 139-144.

2Note: Michael Bernard is currently a post-doctorate fellow at Sandia National Laboratories


Determining Cognitive Predictors of User Performance within
Complex User Interfaces


By Michael L. Bernard1, Chris Hamblin, & Brett Scofield

It is quite apparent that the computer is now a ubiquitous tool for both home and work. It is also evident that computer interfaces have, in general, acquired more functions and present more information than at any time in the past. For example, it is now common for users to have layouts with multiple, distinct information sources concurrently located on an interface. In light of this, the need to study the interaction between complex interfaces and the cognitive factors that affect user performance has become paramount. Unfortunately, most studies that have measured human performance with interfaces were conducted prior to the advent of modern graphical user interfaces.

For instance, Vicente, Hayes, and Williges (1987) examined participants’ performance while they searched for information within a hierarchical file system. The file system consisted of three levels, with a total of 15 files. Vicente et al. found that performance was affected by several cognitive variables that were independent of computer experience: spatial ability, which generally predicted search performance, and verbal ability, which predicted performance when reading was required. In fact, participants with low spatial ability took twice as long to find information as those with high spatial ability. However, the interface in the Vicente et al. study presented only a small number of functions and, thus, it is argued here that it does not reflect the type of complex user interface seen today.

The term ‘complex’ in this instance refers to the extent to which an interface adheres to a predictable visual scheme, such that the less predictable the interface layout, the more complex it should be (Tullis, 1983). Another factor that often adds to layout complexity is the overall density of the information displayed. That is to say, the more bits of information that are presented beyond a moderate amount, the more complex the interface is thought to be (Vitz, 1966). Since complex interfaces require users to exert higher amounts of cognitive effort, it is suggested that understanding which cognitive factors most affect users’ actual and perceived performance should help designers create interfaces that conform more closely to the cognitive aptitude of individuals.

This study sought to assess the full extent of intellectual functioning across participants by administering the Wechsler Abbreviated Scale of Intelligence (WASI), along with administering a mirror-tracing task to assess their perceptual-motor skills. The participants’ scores on the individual factors of intelligence and perceptual-motor skills were then examined in relation to their search performance on a complex website interface.

Method

Participants

Twenty-two undergraduate students between 18 and 34 (mean = 22) years of age volunteered to participate in this study. All participants reported using the Web at least once per month (87 percent reported using the Web a few times per week or more). Two of the participants reported visiting a financial/brokerage website a few times per month and two reported visiting a financial/brokerage website less than once per month.

Experimental Task

Participants were presented with four interfaces that had approximately the same amount of layout complexity. This was accomplished by creating layouts with multiple information sources that were densely displayed (see Figure 1 for an example of one of the interfaces). Four interfaces were chosen rather than one in order to present as much information as possible without requiring participants to scroll to view the entire layout. All of the interfaces were presented as an online brokerage portal.

Participants were given 20 search tasks and were required to find specific brokerage information that would most appropriately satisfy the context of the search task. For example, one question asked, “What is the pre-market percent change for Adobe Systems Inc.?” The tasks were designed to be moderately difficult. Yet, since the tasks only required participants to find, not interpret, this information, they presented the same degree of difficulty for those with or without brokerage website experience. Also, all task information was accessible without searching lower than one level of depth. The search tasks and the searching order within the four layouts were randomized by means of a Latin square design.


Figure 1.  An example of the layout for one of the presented interfaces
 

Materials

A Pentium II-based PC with a 60 Hz, 96 dpi, 17-inch high-resolution RGB monitor set to a resolution of 1024 x 768 pixels was used. The operating system was Microsoft Windows XP. The interfaces were presented as HTML web pages. Search time was recorded by means of the software tool Ergobrowser™, which served as the Web browser.

Dependent Measures

The WASI Instrument

The instrument used to measure the participants’ intellectual abilities was the Wechsler Abbreviated Scale of Intelligence (WASI) instrument (Wechsler, 1999). The WASI is similar in format and highly correlated with the Wechsler Adult Intelligence Scales (Goebel & Satz, 1975; Tulsky & Zhu, 2000). The WASI instrument was used because it is a standardized, normed, and validated short form of the Wechsler Adult Intelligence Scale. It also provided a reliable and valid estimate of verbal, performance, and general intellectual functioning (Kaufman & Kaufman, 2001).

The WASI instrument measures several facets of intelligence, such as verbal knowledge, visual information processing, spatial and nonverbal reasoning, and crystallized and fluid intelligence. The WASI instrument consists of four subtests - Vocabulary, Block Design, Similarities, and Matrix Reasoning (Wechsler, 1999):

  1. Vocabulary measures expressive vocabulary, verbal knowledge, and fund of information. It is also a good measure of crystallized intelligence, as well as general intelligence. For this subtest, participants were required to name pictures and define words that are orally and visually presented.
     

  2. Block Design measures spatial visualization, visual-motor coordination, abstract conceptualization, and perceptual organization by requiring participants to replicate modeled or printed two-dimensional geometric patterns within a specified time by using two-color cube patterns.
     

  3. Similarities measures verbal concept formation, abstract reasoning ability, and general intellectual ability. For this measure, participants were presented four picture items and 22 verbal items. For the picture items, participants were shown a picture of three objects on the top row and four response options on the bottom row. Participants responded by indicating which response item was similar to the three target objects. For the verbal items, a pair of words was presented orally and participants explain the similarity between the object or concept that the two words represented.
     

  4. Matrix Reasoning measures nonverbal fluid reasoning by requiring participants to complete a missing portion of an abstract, gridded pattern by indicating the correct completed pattern from five possible choices.   

These four subtests combine to form the following general scores:

a)   Performance IQ = block design + matrix reasoning

b)   Verbal IQ = vocabulary + similarities

c)   Full Scale IQ = block design + matrix reasoning + vocabulary + similarities
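The composite mapping above can be expressed as a small sketch. The subtest scores here are hypothetical values, and the published instrument converts these sums to IQ scores through its norm tables rather than reporting raw sums:

```python
# Illustrative WASI-style composite mapping.
# Subtest scores are hypothetical, not WASI norms; the real instrument
# maps summed scores onto IQ scores via normed lookup tables.
subtests = {
    "vocabulary": 55,
    "block_design": 48,
    "similarities": 52,
    "matrix_reasoning": 50,
}

def composite_sums(s):
    """Sum subtest scores into the three composite groupings."""
    return {
        "performance": s["block_design"] + s["matrix_reasoning"],
        "verbal": s["vocabulary"] + s["similarities"],
        "full_scale": (s["block_design"] + s["matrix_reasoning"]
                       + s["vocabulary"] + s["similarities"]),
    }

print(composite_sums(subtests))
```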

 
The Mirror-tracing Task

In addition to the WASI assessment, a mirror-tracing task was used to measure participants’ perceptual-motor skills. For this measure, participants were required to draw a line within, and parallel to, two printed circles that were 4 mm apart. The diameter of the larger printed circle was approximately 180 mm. Participants could not directly view the circles or their hand when drawing the line. Instead, they viewed them through a mirror that reversed the image. Line completion time, which measured the time taken to draw the circle, and drawing accuracy, which consisted of the number of times participants drew outside the printed circles, were measured.

Procedure

Participants first performed the mirror-tracing task. They then answered a computer/web experience and computer comfort questionnaire. After completing the questionnaire, participants were instructed to examine the contents of the four layout interfaces until they were familiar with their general layouts. This lasted approximately one minute per interface. Participants were then given a practice question for each layout. When the practice session was completed, participants were given 20 search questions, five per layout. If a participant selected an item that was not designated as the target, he or she was informed that the item was ‘incorrect’ and instructed to search again for an item that best satisfied the search task. For each task, participants searched until they found the correct task information or until the allotted time (5 minutes) expired. After participants finished the search tasks for all of the interfaces, they answered a perception of disorientation questionnaire. The question items were as follows: "It was difficult to find information on the computer screen," "The amount of information presented was about right," and "The placement of information was disorientating." The questionnaire used a 7-point Likert scale, with 1 = “Not at all” and 7 = “Completely” as anchors. Since the question items were significantly correlated with each other (p < .01), a mean score was used. Participants were then given the WASI intelligence instrument, which took approximately 45-50 minutes to administer, on a subsequent day.

Results

In order to compare search time and perception of disorientation scores with the WASI and mirror-tracing scores, both the search time and perception of disorientation scores were divided equally into three categories: fast, medium, and slow for search time (see Table 1), and low, medium, and high for perception of disorientation (see Table 2).
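A three-way split of this kind can be sketched as a simple rank-based grouping. The scores below are hypothetical, and the actual cut-offs shown in Tables 1 and 2 were derived from the observed data:

```python
def tertile_categories(scores, labels=("low", "medium", "high")):
    """Assign each score to one of three equal-sized, rank-ordered groups."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    size = len(scores) // 3
    cats = [None] * len(scores)
    for rank, i in enumerate(ranked):
        cats[i] = labels[min(rank // size, 2)]
    return cats

# Hypothetical search times in seconds (not the study's data).
times = [420, 510, 700, 480, 650, 600]
print(tertile_categories(times, ("fast", "medium", "slow")))
```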

Table 1. Search time categories

  Fastest:  < 490.4 sec
  Medium:   490.4 to 646.4 sec
  Slowest:  > 646.4 sec

 
Table 2. Perception of disorientation categories

  Low:     < 3.67
  Medium:  3.67 to 5.00
  High:    > 5.00

 
The three categories served as the independent variables, whereas the WASI and mirror-tracing scores served as the dependent variables. A 3 x 9 MANOVA was used to compare the three levels of search time and perceived disorientation with the dependent scores.

Analysis of the cognitive aptitudes for the three search time categories revealed significant differences between the categories on the Block Design, Performance IQ, Full Scale IQ, and Mirror-tracing Accuracy scores [F (7,14) = 3.50, p < .05; see Table 3 below]. Post hoc analysis using the Tukey HSD method indicated that participants in the slowest search time category had significantly lower cognitive aptitude scores than those in either the fastest or medium categories. The same was essentially true for the Matrix Reasoning scores; however, those differences only approached significance. Correlations between search time across all three aptitude levels and the dependent measures are also presented below. As shown in Table 3, the Block Design subtest correlated most highly with search time.

Search Time

Table 3.  Search time and cognitive aptitude

  Task                       Correlation   Significance
  Vocabulary                   -0.25        p = .55
  Block Design                 -0.69        p < .01
  Similarities                 -0.16        p = .34
  Matrix Reasoning             -0.32        p = .07
  Verbal IQ                    -0.29        p = .28
  Performance IQ               -0.55        p < .01
  Full Scale IQ                -0.51        p < .05
  Mirror-tracing Accuracy       0.56        p < .01
  Mirror-tracing Time           0.46        p = .17

 
Perception of Disorientation

Analysis of participants’ perception of disorientation revealed significant differences between the three levels of perceived disorientation on the Vocabulary and Verbal IQ scores [F (7,15) = 3.12, p < .05; see Table 4 below]. Post hoc analysis indicated that participants with the lowest level of perceived disorientation had significantly higher Vocabulary scores than those with high levels of perceived disorientation. Similar results were found for Full Scale IQ, but these only approached significance. In addition, participants with the lowest level of perceived disorientation had significantly higher Verbal IQ scores than those with either the middle or highest levels of perceived disorientation. Interestingly, the perception of disorientation was not significantly correlated with search time (r = 0.12, p = .61). The correlations between perceived disorientation across all three levels and the dependent measures are also presented below. As shown in Table 4, Verbal IQ correlated most highly with perceived disorientation.

Table 4. Perceived disorientation and cognitive aptitude

  Task                       Correlation   Significance
  Vocabulary                    0.46        p < .01
  Block Design                  0.00        p = .36
  Similarities                  0.36        p = .29
  Matrix Reasoning             -0.12        p = .17
  Verbal IQ                     0.52        p < .05
  Performance IQ                0.00        p = .16
  Full Scale IQ                 0.25        p = .08
  Mirror-tracing Accuracy      -0.15        p = .32
  Mirror-tracing Time          -0.06        p = .74

  
Computer/web experience and computer comfort

Not surprisingly, participants’ computer and web experience, as well as their comfort with using computers, correlated significantly with search time and perceived disorientation. Specifically, participants’ level of comfort with the Internet was significantly correlated with perceived disorientation (r = -0.49, p < .05). Moreover, participants’ reported frequency of web visits significantly correlated with their search time, in that frequent web users generally had faster search times (r = -0.51, p < .05).

Discussion

This study has shown that psychometric tests of cognitive abilities can generally predict search performance for complex interfaces, in that certain cognitive factors do significantly correspond to search time performance and perceived disorientation when searching within a complex interface. When considering search time, the subtest factors that most determined participant performance were Block Design, Performance IQ, and Mirror-tracing Accuracy. All of these subtests generally tap into three cognitive functions: spatial visualization, visual-motor perception and coordination, and fluid reasoning. Thus, it is proposed that these cognitive/motor functions play a substantial role in determining search performance within complex interfaces.

The Vocabulary subtest did not significantly contribute to search performance. It is certainly possible that the Vocabulary subtest, which generally measures verbal knowledge, was not as important to performance because the task involved mostly searching for information. The Similarities subtest, which generally measures intellectual ability, also did not significantly contribute to search performance, possibly for the same reason as above.

When assessing perceived disorientation, only the Vocabulary and Verbal IQ subtest factors significantly contributed to participants’ perceived disorientation. Not surprisingly, both of these subtests have a common thread, in that they both measure intellectual ability. It is interesting that perceived disorientation was mostly affected by intellectual factors, rather than spatial factors. It is possible that complex interfaces burden the intellectual capacity of users, which is translated by means of higher correlations with the disorientation scores. Yet, apparently this burden is not great enough to affect the search time of the users.

Implications for Interface Design

It is very common these days to encounter user interfaces that contain multiple, densely displayed information sources, such as online travel and brokerage sites. When creating these types of interfaces, designers should take our cognitive and motor limitations into consideration. From these results, certain design recommendations can be made. Specifically, layouts should be designed to reduce the cognitive burden associated with spatial visualization and visual-motor coordination. To help do this, designers should focus their efforts on creating interfaces that appropriately group information by function (Dodson & Shields, 1978) and reduce overall information density to less than 50 percent of the screen area (see Horton, 1989 for a discussion of the related empirical studies).

References

Dodson, D. W., & Shields, N. L. (1978). Development of user guidelines for ECAS display design (Vol. 1) (Report No. NASA-CR-150877). Huntsville, AL: Essex Corp.

Goebel, R. A., & Satz, P. (1975). Profile analysis and the Abbreviated Wechsler Adult Intelligence Scale: A multivariate approach. Journal of Consulting and Clinical Psychology, 43, 780-785.

Horton, W. K. (1989). Designing and writing online documentation. New York : John Wiley & Sons.

Kaufman, J. C., & Kaufman, A. S. (2001). Time for the changing of the guard: A farewell to short forms of intelligence tests. Journal of Psychoeducational Assessment, 19, 245-267.

Tullis, T. S. (1983). The formatting of alphanumeric displays: A review and analysis. Human Factors, 25, 657-683.

Tulsky, D. S., & Zhu, J. (2000). Could test length or order affect scores on Letter Number Sequencing of the WAIS-III and WMS-III? Ruling out effects of fatigue. Clinical Neuropsychologist, 14, 474-478.

Vicente, K. J., Hayes, B. C., & Williges, R. C. (1987). Assaying and isolating individual differences in searching a hierarchical file system. Human Factors, 29, 349-359.

Vitz, P. C. (1966). Preference for different amounts of visual complexity. Behavioral Science, 11, 105-114.

Wechsler, D. (1999). Wechsler Abbreviated Scale of Intelligence (WASI). San Antonio, TX: The Psychological Corporation.

1Note: Michael Bernard is currently a post-doctorate fellow at Sandia National Laboratories


The Effects of Line Length on Children and Adults’
Online Reading Performance

 By Michael Bernard1, Marissa Fernandez, & Spring Hull

Adults, as well as children, now often read an extensive amount of information online. For example, in the 25-to-34-year-old age group, 25 percent reportedly read online newspapers, compared to only 19 percent who read printed newspapers (The Digital Edge, 2000). Even young children are spending progressively more time reading online documents, including being tested online in schools. Thus, the need to address the ergonomic issues associated with this medium has become even more important. As discussed in previous editions of Usability News, certain textual factors can affect user performance and preference when reading online text. The purpose of this study was to examine the effects of line length on the online reading performance of both adults and children. Unfortunately, little research has investigated line length and online reading with respect to both actual and perceived reading efficiency, as well as preference; and, to date, no research has included children.

Studies investigating line lengths have thus far produced mixed results. For example, Dyson and Kipping (1998) found that longer lines (approximately 75-100 characters per line or CPL) were read faster than very narrow ones (25 CPL), with no difference in perception of reading efficiency. Moreover, Duchnicky and Kolers (1983) found that full-screen (187 mm) line lengths resulted in 28 percent faster reading times over 1/3 screen (62 mm) line lengths. In addition, the full and 2/3 screen (125 mm) line lengths were read significantly faster than the 1/3 screen line lengths. Duchnicky and Kolers concluded that longer line lengths are read more efficiently from computer screens than narrower ones.

Yet, conclusions have mostly favored short to medium line lengths. For example, it has been recommended by researchers that shorter line lengths (about 60 CPL) should be used in place of longer, full-screen lengths, since longer line lengths require greater lateral eye movements, which makes it more likely to lose one's place within the text (Horton, 1989; Mills & Weldon, 1987). Horton (1989) points out that longer line lengths are more tiring to read and recommends limiting line lengths to around 40 to 60 CPL. Huey (1968) generally supports this recommendation by finding that narrower line lengths (approximately 4" or 10 cm) are more accurate on the return sweep than longer line lengths. Gregory and Poulton (1970) maintain that people with poor reading ability performed better when the line length was approximately seven words. This suggests that young readers who have not mastered online reading, as well as readers who have vision deficits, may benefit the most from narrower line lengths. 

Moreover, Youngman and Scharff (1998) found that with 0.5-inch (12.5 mm) margins, the fastest reaction times were for the shorter, 4-inch (10 cm) line lengths over the 6- and 8-inch lengths (15 and 20 cm, respectively). The 4-inch lengths were also preferred over the other lengths. With no margins, the 8-inch line lengths had the fastest overall reaction times. Similarly, a recent study by Dyson and Haselgrove (2001) found that medium line lengths (55 CPL, approximately 4 inches) facilitated more effective reading at normal reading speed than shorter line lengths (24 CPL).

Method

Participants

Forty participants (20 adults and 20 children) volunteered for this study. The adults ranged in age from 18 to 61, with a mean age of 29 (S.D. = 12 years) and the children ranged in age from 9 to 12, with a mean age of 11 (S.D. = 1 year) and attended 4th, 5th, or 6th grade. All adults reported reading text on computer screens a few times per week or more. Seventy-five percent of the children reported reading text on computer screens a few times per month or more. The children received $5.00 for participating in this experiment. All participants had 20/40 or better unaided or corrected vision as tested by a Snellen near acuity chart.

Materials

A Pentium II based personal computer, with a 60 Hz, 96dpi 17" monitor with a resolution setting of 1024 x 768 pixels was used.

The passages consisted of three line-length conditions. These conditions consisted of passages that had a line length that spread the full distance of the screen, which was 930 pixels (245 mm, 132 CPL; see Figure 1) wide; passages that had a line length of 550 pixels (approximately 145 mm, 76 CPL; see Figure 2); and passages that had a line length of 330 pixels (approximately 85 mm, 45 CPL; see Figure 3). As with typical online passages, the narrower the passage, the more scrolling was required to view the entire passage.
 

Figure 1. Full-length example

 

Figure 2. Medium-length example

Figure 3. Narrow-length example

 
Task Design

Line length conditions were compared by having participants read three passages, each with a different line length. The conditions were counterbalanced by means of a Latin square design. Both the adults' and children’s passages were set in 12-point Arial, black on a white background.

The adults read passages from Microsoft's electronic library, Encarta™, which were written at approximately the same reading level and discussed similar material (all dealt with psychology-related topics). The passages were adjusted to have approximately the same length (an average of 1028 words per passage, S.D. of 18 words).

The children’s passages were short children’s stories drawn from Whootie Owl's Fairytales™, which were written at the 4th and 5th grade reading level. The passages were adjusted to have approximately the same length (an average of 573 words per passage, S.D. of 13 words).

Procedure

Participants were positioned approximately 57 cm from the computer screen. They were then asked to read the passages “as quickly and as accurately as possible.” The passages contained 15 randomly placed substitution words for the adults and 10 for the children (participants were not told the number of substitution words). The substitution words were designed to be clearly inappropriate for the context of the passages when read carefully. These words varied grammatically from the original words; for example, the noun “cake” was replaced with the adjective “fake.” The participants were instructed to identify these words by stating them aloud. This was designed to ensure that participants actually read the passages instead of just skimming them.

To determine readability and its effect on reading time, an effective reading score was used. The score was derived by dividing the time taken to read each passage (recorded with a stopwatch) by the proportion of substitution words accurately detected.
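The effective reading score described above amounts to a simple ratio; the numbers here are hypothetical illustrations, not data from the study:

```python
def effective_reading_score(reading_time_sec, hits, total_substitutions):
    """Reading time divided by the proportion of substitution words detected;
    slower or less accurate reading yields a higher (worse) score."""
    accuracy = hits / total_substitutions
    return reading_time_sec / accuracy

# Hypothetical adult: read a passage in 370 s, caught 12 of 15 substitutions.
print(round(effective_reading_score(370, 12, 15), 1))  # prints 462.5
```

A perfectly accurate reader's effective score equals their raw reading time; missed substitutions inflate the score proportionally.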

After reading each passage, participants answered a perception of readability questionnaire. The questionnaire used a 6-point Likert scale with 1 = “Not at all” and 6 = “Completely” as anchors, and consisted of statements regarding the ease of reading for each line length condition. When all questionnaires were completed, participants ranked the three line length conditions for general preference.

Results and Discussion

A within-subjects ANOVA design was used to analyze objective and subjective differences between the line lengths. Post hoc comparisons were made using the Bonferroni test. Ranked line length preference was analyzed by means of a Friedman χ2.
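The Friedman χ2 statistic used for the ranked preferences can be computed directly from its standard formula; the rankings below are hypothetical, not the study's data:

```python
def friedman_chi_square(rankings):
    """Friedman chi-square for n participants each ranking k conditions.
    `rankings` is a list of per-participant rank lists (1 = most preferred).
    Standard formula: 12/(n*k*(k+1)) * sum(R_j^2) - 3*n*(k+1),
    where R_j is the rank sum for condition j."""
    n, k = len(rankings), len(rankings[0])
    column_sums = [sum(r[j] for r in rankings) for j in range(k)]
    return 12 / (n * k * (k + 1)) * sum(s * s for s in column_sums) - 3 * n * (k + 1)

# Hypothetical: 4 participants ranking the three line-length conditions.
ranks = [[3, 1, 2], [3, 2, 1], [3, 1, 2], [2, 1, 3]]
print(friedman_chi_square(ranks))  # prints 4.5
```

Larger values indicate more agreement among participants' rankings; the statistic is then compared against a χ2 distribution with k - 1 degrees of freedom.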

Reading Time and Effective Reading

Examining the mean reading time for each line length surprisingly revealed no significant differences for either children or adults [p = .40 and p = .88, respectively]. It is possible that the benefit of reduced scrolling in the wider condition was offset by its increased line length, negating any net effect. The means and standard deviations for both adults and children for the three conditions are presented in Table 1. Examining the effective reading score (reading time/reading accuracy) also revealed no significant differences between the three line lengths for either children or adults [p = .10 and p = .60, respectively; see Table 2].

Table 1. Means and standard deviations for reading time

  Means (SD)   Full-length     Medium-length   Narrow-length
  Adults       370 (107) sec   363 (103) sec   266 (109) sec
  Children     276 (76) sec    279 (68) sec    266 (68) sec

 
Table 2. Means and standard deviations for effective reading score

  Means (SD)   Full-length     Medium-length   Narrow-length
  Adults       425 (138) sec   463 (211) sec   443 (189) sec
  Children     362 (102) sec   359 (66) sec    330 (94) sec

 
Adults' Perception of Reading Efficiency

Assessing adults' perception that the amount of scrolling was optimal for a particular line length condition revealed significant differences [F (2, 38) = 6.70, p < .01]. Post hoc analysis revealed that the Full-length condition was perceived as more optimal than both the Medium- and Narrow-length conditions (see Figure 4).

Figure 4.  The amount of scrolling was optimal (1 = Not at all, 6 = Completely)

 
Assessing adults' perception that the length was optimal for a particular line length condition revealed no significant differences [p = .19]. Assessing adults' satisfaction with the ease of concentration for a particular line length condition revealed significant differences [F (2, 38) = 5.41, p < .01], in that the Narrow-length condition was perceived as promoting easier concentration than both the Medium- and Full-length conditions (see Figure 5).

Figure 5.  Ability to concentrate on the passages (1 = Not at all, 6 = Completely)

 
Assessing adults' perception that the layout was optimally presented for a particular line length condition revealed significant differences [F (2, 38) = 6.26, p < .01], in that the Medium-length condition was perceived as more optimally presented than the Full-length condition (see Figure 6).

Figure 6.  The layout was optimally presented (1 = Not at all, 6 = Completely)

 
Children's Perception of Reading Efficiency

Assessing children's perception that the amount of scrolling was optimal for a particular line length condition revealed no significant differences between any of the line length conditions [p = .39]. Nor were significant differences found for the perceptions that a particular line length was preferable or promoted easier concentration [p = .61 and p = .33, respectively].

General Preference

Examining the first-choice preference indicated that the Medium-length condition was most preferred by adults (Figure 7) and the Narrow-length condition was most preferred by children (Figure 8). The Full-length condition was the least preferred by both adults and children.
 

Figure 7.  Adults' line length 1st choice
(no adult chose the Full-length condition
as their 1st choice)

Figure 8.  Children's line length 1st choice
 
  

 
Conclusion

This study found no significant differences in reading time or reading efficiency between the three line length conditions for either adults or children. However, the results did support the finding that shorter line lengths are preferred over full-screen line lengths. As for the perception of reading efficiency, the results were mixed. For adults, the Full-length condition was perceived as providing the optimal amount of scrolling compared to the two other conditions, presumably because it required the least scrolling. The Narrow-length condition was perceived as promoting the greatest concentration, while the Medium-length condition was considered the most optimally presented length for reading.

When examining children's perceptions of reading efficiency for each of the line lengths, no significant differences were found. This may be because children in this age range are not yet fully skilled readers and, thus, concentrate more on simply reading the passages than on any differences in reading efficiency across line lengths.

From this study, as well as the studies mentioned above, it is suggested that full-screen line length should be avoided for online documents, especially if a large amount of text is presented. For adults, it is suggested that medium line lengths should be presented (approximately 65 to 75 CPL). Children, on the other hand, indicated their preference for the narrowest line length (45 CPL) and, thus, it may be beneficial to use narrow line lengths when possible.

References

Duchnicky, J. L., & Kolers, P. A. (1983). Readability of text scrolled on visual display terminals as a function of window size. Human Factors, 25, 683-692.

Dyson, M. C., & Haselgrove, M. (2001). The influence of reading speed and line length on the effectiveness of reading from screen. International Journal of Human-Computer Studies, 54, 585-612.

Dyson, M. C., & Kipping, G. J. (1998). The effects of line length and method of movement on patterns of reading from screen. Visible Language, 32, 150-181.

Gregory, M., & Poulton, E. C. (1970). Even versus uneven right-hand margins and the rate of comprehension reading. Ergonomics, 13, 427-434

Horton, W. (1989). Designing and writing online documentation: Help files to hypertext. New York: John Wiley & Sons.

Huey, E. B. (1968). The psychology and pedagogy of reading. Cambridge, MA: MIT Press.

Mills, C. B., & Weldon, L. J. (1987). Reading text from computer screens. ACM Computing Surveys, 4, 329-358.

The Digital Edge. (2000). Print and online components bring strength to newspapers. Newspaper Association of America. Retrieved 7/13/02: http://www.digitaledge.org/monthly/2000_01/synergize.html

Whootie Owl's Fairytales® Whootie Owl Productions, LLC. Retrieved 7/13/02: http://www.storiestogrowby.com/

Youngman, M., & Scharff, L. (1998). Text length and margin length influences on readability of GUIs. Southwest Psychological Association. Retrieved 7/13/02: http://hubel.sfasu.edu/research/textmargin.html


Fortune 500 Revisited: Current Trends in Sitemap Design

 By Mark C. Russell

In our first issue of Usability News, we reported a study that surveyed the top Fortune 500 companies’ web sites for the use of sitemaps (Bernard, 1999a). This study found that nearly half (46%) of the web sites did not have a sitemap of any kind. Of the half that did have a sitemap, 89% used a hierarchical textual representation and 11% displayed a graphical depiction of the site. Now that a few years have passed, we thought it might be interesting to review those sites again (or rather the current list of Fortune 500 companies) and see how things have changed.

We wondered, for instance, what the current trends are for sitemap design. Are most sitemaps just categorical lists or are they arranged hierarchically according to topic in an attempt to give the user some type of structural model of the site? Or could they actually be getting more graphical in nature, something akin to an actual map? Nielsen (2002) reported that less than 50% of the sites he surveyed actually had sitemaps, so it was also equally possible that web designers are beginning to move away from providing this option on their site.

What might the rationale be for doing away with sitemaps? Researchers propose that an explicit map of a site’s structure should allow visitors to navigate more efficiently through a site (Kim & Hirtle, 1995; Billingsly, 1982; McDonald & Stevenson, 1998). However, there are findings to the contrary, such as Farris, Jones, and Elgin (2001), who caution that a sitemap’s utility is lessened if it does not reflect the user’s (or domain’s) conceptual structure or mental model (see also Dias & Sousa, 1997; Stanton, Taylor, & Tweedie, 1992).

Setting the actual merits of providing sitemaps aside for the moment, we reviewed the 2002 list of Fortune 500 companies and categorized what we found. One problem we initially encountered was in the categorization process itself. There is little conformity with regard to the design of sitemaps and so we sought to capture the differences between designs. Table 1 shows the operational definitions we used for the sitemap categories.

Table 1: Operational definitions for sitemap categories

  None: No labeled or identifiable map
  Categorical: Simple list of topics with category titles
  Extended Categorical: List of topics 3 or more levels “deep”
  Hierarchical: Categorical lists with lines or boxes showing some navigational relationship between groupings
  Graphical: More formal uses of shapes and lines, such as in an organizational chart, usually with less text
  Alphabetical Index: Simple A-Z listing of topics
 
Method

Five hundred web sites from the Fortune 500 web site (www.fortune.com) were examined. Most sitemaps were found: (1) above or in close proximity to the navigation bar; (2) at the bottom of the page with a collection of other links; or (3) listed somewhere in the navigation bar options.

One aspect of sitemap design that was not accounted for in our “taxonomy” of map types was the implementation of restricted or expanding menus. Some sitemaps are presented such that additional information about an area of the site becomes available on a click or mouse-over of that category, link, or area of the map (for more see: Bernard, 1999b). However, we did track this aspect of the map design separately from the categorization, and we also noted what each sitemap was called (e.g., map, index, guide, directory).

In addition, we tracked other features, such as: (1) whether there was a search engine available on the site; (2) how the search engine was available from the sitemap; and (3) whether the sitemap was a "stand-alone" item, or displayed along with the site’s navigation bar.

Results

Types of Sitemaps

Of the 500 websites, 497 could be examined for a sitemap. Of the three non-represented companies, one had no web site, one had a site that would not load on multiple occasions, and one was 'dismantling' due to bankruptcy. Of those 497 sites, 295 (59%) had some form of sitemap. This represents an increase from our survey three years ago (54%). Table 2 shows the breakdown of sitemap types.

Table 2: Types of Sitemaps

Type | Frequency | Percentage
categorical (2 levels only) | 133 | 45.1
categorical (3+ levels) | 145 | 49.2
hierarchical | 8 | 2.7
graphical | 6 | 2.0
alphabetical index | 3 | 1.0
Total | 295 | 100

Clearly the categorical style was the most popular. Hierarchical and graphical sitemaps were found far less frequently (each less than 5% of the time). Interestingly, no graphical sitemap used anything more elaborate than basic shapes. For example, we thought some sites might use symbols, cartoons, or pictures to represent the site content. The closest things we found to these possibilities were actually home pages with animated graphics and link labels (see www.mercksharpdohme.com/disease/asthma/home.html and www.disney.com). Surprisingly, both of these sites used categorical listings for their actual sitemap pages.
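The percentage column in Table 2 can be reproduced directly from the raw frequencies; a minimal Python sketch, using the counts reported in Table 2:

```python
# Reproduce the percentage column of Table 2 from the raw frequencies.
counts = {
    "categorical (2 levels only)": 133,
    "categorical (3+ levels)": 145,
    "hierarchical": 8,
    "graphical": 6,
    "alphabetical index": 3,
}

total = sum(counts.values())  # 295 sites with a sitemap

for sitemap_type, n in counts.items():
    pct = round(100 * n / total, 1)
    print(f"{sitemap_type}: {n} ({pct}%)")
```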

Names of Sitemaps

The term “sitemap” has become a generic term among web designers and researchers, but it was not the only title given to these navigational tools. Table 3 shows a breakdown of the titles used:

Table 3: “Names” of Sitemaps

Name | Frequency | Percentage
site map | 246 | 83.4
site index | 33 | 11.2
index | 2 | 0.7
site guide | 4 | 1.4
A-Z index | 1 | 0.3
site directory | 6 | 2.0
other | 3 | 1.0
Total | 295 | 100

Restricted/expanding menus

Only a small number of sites (2%) utilized restricted or expanding menus in their sitemaps. One thing to note, however, is that the restricted or expanding nature of the menu was not always immediately apparent; users may discover this behavior only on a mouse-over, or may be required to activate the next level of the sitemap menu by clicking on a topic. In rare instances, the visitor might even be taken to a subsequent page of the map, which could potentially be disorienting.

Integrated vs. separate

Sitemaps are often a valuable resource, but there is a perception that visitors will use such a tool only as a last resort. A small number of sites (2%) provided a sitemap but presented it as a separate item, essentially isolated from the rest of the site. The majority of site designers, however, recognize the value of integrating the sitemap into the rest of the site and making other navigation tools accessible from that page as well. Nearly 94% of the sites surveyed displayed some version of the site's navigation bar on the sitemap page itself. Many sites had a search engine available (approximately 75%), but sites differed in how easy it was to access from the sitemap page (see Table 4).
 

Table 4: Search engine availability

Availability from Sitemap | Frequency | Percentage
Search engine on page | 96 | 32.5
Search engine through link | 141 | 47.8
Search engine not accessible | 58 | 19.7

 
Conclusion

The list of Fortune 500 web sites was chosen as a convenience sample, and the nature of the list theoretically restricts the generalizability of this survey to the web as a whole. However, there is no reason to believe that these corporate business sites deviate in any significant way from the average web site in terms of design; in many ways, these companies have the money and resources to develop some of the more innovative and cutting-edge sites on the web. Given that, the overall trend in sitemap design appears to be a textual, categorical listing of topics.

In addition, site designers seem to generally agree that having some form of the navigation bar and access to the search engine on all pages, including the sitemap, is important. Only a few sites kept the sitemap separate from the rest of the site by displaying it in a pop-up window. Pop-up windows have a reputation for being disliked by users, since they are used mainly for advertisements. However, a pop-up map has the potential advantage that it does not entirely obscure the current view and can be accessed again later without backtracking.

Not surprisingly, “sitemap” was the most popular name, with "site index" a distant second. Considering the observed trends, however, “index” might indeed be more fitting. Certainly it is now easy to see why there is controversy over the usefulness of sitemaps as we know them. Do simple collections of topics give users a "cognitive map" of the site? Is that necessarily a requirement for a good sitemap and for good site usability? Certainly the fundamental purpose of the sitemap is to help visitors find the information they are looking for on the site. Whether that is best accomplished by helping the user form a mental representation of the site's structure or by presenting a detailed, organized accounting of site content remains to be shown. Additional research is needed to demonstrate the true benefit of sitemaps. It is plausible that a well-organized home page may serve the same purpose of providing an overview of a site's structure. There may be better ways to represent web site structures for users: better than a "traditional" map more suited to a physical environment, and better than a simple list of hypermedia destinations. The current trends in sitemap design are, therefore, not necessarily an indication that a textual categorical index is the best way to aid users in navigating websites; it is simply the most popular with web designers. The potential still exists to find the most efficient means of representing hypermedia, and to get the word out to designers before they become too entrenched in a "tradition" of categorical sitemaps.

References

Bernard, M. (1999a). Using a sitemap as a navigational tool. Usability News, 1.1 . ../usabilitynews/1w/Sitemaps.htm

Bernard, M. (1999b). Sitemap design- alphabetical or categorical? Usability News, 1.2 ../usabilitynews/1s/sitemap.htm

Billingsley, P.A. (1982). Navigation through hierarchical menu structures: does it help to have a map? Proceedings of the Human Factors and Ergonomics Society 26th Annual Meeting, 103-107.

Dias, P., & Sousa, P. (1997). Understanding navigation and disorientation in hypermedia learning environments. Journal of Educational Multimedia and Hypermedia, 6, 173-185.

Farris, J. S., Jones, K. S., and Elgin, P. D. (2001). Mental representations of hypermedia: An evaluation of the spatial assumption. Proceedings of the Human Factors and Ergonomics Society 45th Annual Meeting, 1156-1160.

Kim, H., & Hirtle, S. C., (1995). Spatial metaphors and orientation in hypertext browsing. Behaviour and Information Technology, 14, 239-250.

McDonald, S., & Stevenson, R. J. (1998). Navigation in hyperspace: An evaluation of the effects of navigational tools and subject matter expertise on browsing and Information retrieval in hypertext. Interacting with Computers, 10, 129-142.

Nielsen, J. (2002). Site map usability. Alertbox Column, January. Retrieved 7/13/02: www.useit.com/alertbox/20020106.html

Stanton, N. A., Taylor, R. G., & Tweedie, L. A. (1992). Maps as navigational aids in hypertext environments: An empirical evaluation. Journal of Educational Multimedia and Hypermedia, 1, 431-444.


Can Personality Be Used to Predict How We Use the Internet?

By Bonnie Lida

At a time when consumers are becoming more sophisticated and demanding higher levels of product information before making purchase decisions, traditional retail outlets are under pressure to reduce overhead to enhance profitability, resulting in declining numbers of retail sales personnel (Hill, King, & Cohen, 1996). This scenario, coupled with the increasing time constraints of American consumers and growing computer familiarity, with more than half of the U.S. population connected to the Internet, has led to the use of the Internet as an alternative to traditional methods of information gathering and shopping. In fact, a U.S. Department of Commerce survey found that of the 54% of the population using the Internet, 55.8 million bought goods or services online (Grant, 2002). This growth in online retailing raises many questions about how businesses can successfully market on the Internet.

One key difference between online and traditional consumers is that consumers in the computer-mediated environment (CME) have control in all stages of their purchase decision-making process. Traditionally, marketers have distributed their messages to consumers via a mass communication medium, such as television, radio, and newspaper, where the consumer plays no active role other than channel delivery choice. To ensure their messages reached the target audience, marketers segmented consumers into homogeneous groups based principally on demographic information.

However, a study cited at the Wharton Forum on Electronic Commerce revealed that demographics alone do not seem to influence whether or not people buy online, or the amount of money they spend there. Instead, comfort level online and time constraints have proven to be better predictors of online purchasing (Bellman, Lohse & Johnson, 1999).

Another divergence from traditional mediums is the ability of the Internet to develop relationships with individuals through an interactive environment. Web designers can customize an online retail site to the individual consumer’s preference by collecting historic usage information. This requires online marketers to shift from the conventional paradigm of marketer control to one of marketer/consumer collaboration (Hoffman & Novak, 1996). As a result, it is now necessary for the contemporary marketer to understand more about the personal characteristics and motivations of the consumer rather than simply age, gender, and income.

Investigating the current empirical approaches to personality provides insight into consumer traits and behaviors when attempting to predict online behavior. Since increased personal control over outcomes has been cited as one of the major differences consumers experience in a CME, use of the locus of control construct seems especially relevant when analyzing online behaviors (Hoffman, Novak, & Schlosser, 2000).

Locus of control (LOC), a personality dimension based on principles from social learning theory, is a generalized expectancy about the degree to which individuals control their outcomes (Rotter, 1966). At one end of the continuum are those who believe their actions and abilities determine their successes or failures (Internals); those who believe fate, luck, chance, or powerful others determine their outcomes are at the opposite end (Externals).

In general, an Internal LOC orientation is associated with purposive decision making, confidence to succeed at valued tasks, and the likelihood of actively pursuing risky and innovative tasks to reach a goal (Lefcourt, 1982; Hollenbeck et al., 1989; Howell & Avoilio, 1993). Externals, on the other hand, are generally less likely to plan ahead and to be well informed in the area of personal financial management tasks and more likely to avoid difficult situations and exhibit avoidant behaviors such as procrastination, withdrawal, or escape (Dessart & Kuylen, 1986; Aspinwall & Taylor, 1992; Skinner, 1996; Ingledew et al., 1997).

Rotter (1992) considers that any analysis of tasks and events that lead individuals to control an event by perceived skill (internally) or by chance (externally) is a valuable area of LOC study. This would seem to be the case in the study of consumer behavior, as well as computer-based activities. Busseri, Lefcourt, and Kerton (1998) developed a measure of LOC focused specifically on consumer behaviors and outcomes (CLOC).  This study explores the relationship between CLOC and Internet behavior. It was hypothesized that:

Method

Participants/Materials

Five hundred thirty-two volunteers from undergraduate and graduate level business or psychology courses at Wichita State University participated in the study. Data were collected from students via a paper survey in their classroom setting. The survey instrument included three sections. The CLOC Scale developed by Busseri, Lefcourt, and Kerton (1998) was used to assess participants’ LOC and consisted of 14 five-point continuous-scale questions. The second section contained 20 five-point continuous-scale questions dealing with frequencies of, and attitudes toward, Internet usage, shopping, and purchase behaviors. The final portion of the survey queried demographic information, including the extent to which the participant plays the role of 'shopper' in the household, age, race, gender, household size and income, and educational level.

Results

The CLOC test was scored using a median split to divide the total pool of participants into two groups (Busseri et al., 1998; Rotter, 1975). Analysis found the median score on the CLOC scale to be 31.00. Of the 532 participants, 3 were not analyzed due to missing data. Of the remaining 529 cases, 248 (47%) were categorized with an Internal orientation (CLOC < 31.00) and 281 (53%) with an External orientation (CLOC > 31.00).   
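A median split of this kind can be sketched in a few lines of Python. The cutoff of 31.00 comes from the study; the scores below are hypothetical:

```python
from statistics import median

# Hypothetical CLOC total scores; the study's actual median was 31.00.
scores = [25, 28, 30, 31, 33, 36, 40]

cutoff = median(scores)

# Classify each participant relative to the median, as in the study:
# below the median -> Internal orientation, above -> External.
internals = [s for s in scores if s < cutoff]
externals = [s for s in scores if s > cutoff]

print(cutoff, len(internals), len(externals))
```

Note that a score exactly equal to the median falls in neither group under the strict inequalities reported; the study does not state how such ties were assigned.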

Demographic Profile

The mean age of the sample was 24 years. Seventy-one percent of respondents were Caucasian, 5% African-American, 4.8% Hispanic, 13.5% Asian/Pacific Islander, 1.5% Native American, and 4.3% Other. The respondents were 44% male and 56% female. Fifty-six percent had an annual gross household income under $40,000; 32% between $40,000 and $100,000; and 10% over $100,000. Approximately 65% of those surveyed were responsible for at least half of their household purchases. Nearly 86% of the households had four or fewer persons. Ninety-six percent of the respondents had used the Internet for over one year.

Comparison of Internal and External CLOC

A one-way analysis of variance (ANOVA) was used to compare the dependent variables across CLOC groups.  Table 1 shows a summary of the means and standard deviations. 

Table 1. Mean (SD) Responses by CLOC group

 

Measure | Internals | Externals | Significance
Length of Internet usage1 | 3.85 (.75) | 3.62 (.79) | F(1,523) = 11.735, p = .001
Purchase online2 | 3.99 (.96) | 4.31 (.76) | F(1,526) = 18.606, p < .001
Dislike shopping on Internet2 | 2.64 (1.32) | 3.10 (1.28) | F(1,525) = 16.966, p < .001
Convenient2 | 3.84 (.91) | 3.51 (.91) | F(1,527) = 18.164, p < .001
Saves money2 | 3.41 (2.21) | 2.99 (.78) | F(1,526) = 8.699, p < .01
Saves time2 | 3.65 (1.06) | 3.45 (.94) | F(1,527) = 5.322, p < .05
Use it to compare prices3 | 3.12 (1.17) | 3.64 (1.13) | F(1,526) = 26.365, p < .001
Use it to search for product info3 | 2.36 (1.13) | 2.70 (1.22) | F(1,526) = 11.237, p < .01
Use it to access financial info3 | 2.94 (1.50) | 3.43 (1.45) | F(1,526) = 14.221, p < .001

1 1 = Less than 6 months to 5 = Over 7 years
2 1 = Strongly disagree to 5 = Strongly agree
3 1 = Daily to 5 = Never

Length of Internet Usage: Significant differences between Internal and External LOC consumers were found in the length of time participants have been using the Internet. 

Online Retail Purchase Behavior: Significant differences between the Internal and External CLOC groups were found in online retail purchase behavior. Internals reported purchasing online more frequently than Externals did, and disliked shopping on the Internet less than Externals did.

Online shopping attitudes: Significant differences in attitudes were also found between the Internal and External groups: Internals were more likely than Externals to report that Internet shopping is convenient and that it saves money and time.

Online task behaviors:  Significant differences were revealed for goal-oriented tasks between Internals and Externals. Internal consumers used the Internet more often to compare prices, search for specific product information, and access financial information.

The two groups were not found to differ significantly from each other on the following:  uses the Internet for information only; uses the Internet for e-mail communication; is comfortable shopping in stores; buys online for special occasions; only buys certain items online; buys products through online auctions; and buys online when can’t find it elsewhere.
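Each comparison in Table 1 is a one-way ANOVA with two groups. A minimal stdlib Python sketch of that computation (the sample ratings below are hypothetical, not the study's data):

```python
from statistics import mean

def one_way_anova_two_groups(a, b):
    """F statistic for a one-way ANOVA with two groups
    (degrees of freedom: 1 and len(a) + len(b) - 2)."""
    ga, gb, n = mean(a), mean(b), len(a) + len(b)
    grand = (sum(a) + sum(b)) / n
    # Between-groups sum of squares (df = 1 for two groups).
    ss_between = len(a) * (ga - grand) ** 2 + len(b) * (gb - grand) ** 2
    # Within-groups sum of squares (df = n - 2).
    ss_within = sum((x - ga) ** 2 for x in a) + sum((x - gb) ** 2 for x in b)
    return ss_between / (ss_within / (n - 2))

# Hypothetical 5-point ratings from two groups of participants:
internals = [4, 5, 4, 4, 5]
externals = [3, 4, 3, 3, 4]
print(one_way_anova_two_groups(internals, externals))
```

With two groups this F equals the square of the independent-samples t statistic, which is why each row of Table 1 reports F with 1 numerator degree of freedom.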

Discriminant Function Analysis

To determine if Internals and Externals could be predicted by their online behaviors, a direct Discriminant Function Analysis (DFA) was performed using the 19 Internet attitude and usage variables as predictors of membership in the CLOC groups (Tabachnick & Fidell, 1996). 

Of the original 532 cases, 11 were not analyzed due to missing data; the missing data appeared to be randomly scattered throughout the groups and predictors. A DFA was performed on the remaining 521 cases: 245 classified with an Internal CLOC orientation and 276 with an External CLOC orientation. One canonical function was derived, producing an eigenvalue of .117 and a Wilks’ Lambda of .895, indicating discrimination between the two groups, Internal and External Consumer Locus of Control.
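For a single discriminant function, Wilks' Lambda and the canonical eigenvalue are related by Λ = 1 / (1 + λ), so the two reported values can be checked against each other:

```python
# For one canonical function, Wilks' Lambda = 1 / (1 + eigenvalue).
eigenvalue = 0.117            # reported canonical eigenvalue
wilks_lambda = 1 / (1 + eigenvalue)
print(round(wilks_lambda, 3))  # matches the reported .895
```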

Discussion

Results of this study provide an alternative view of online consumer behavior, which reveals differing Locus of Control between customers of online sites. These findings present an opportunity to further examine what these differences represent in terms of perceived and actual usability of online retail sites. As more businesses go online, it will be increasingly critical that they have a clear understanding of the online consumer.  This will include not only what their customer’s demographics are, but also their personality characteristics.

As shown in the results of the discriminant analysis summary, the online purchaser has an internal LOC orientation both in attitudes toward Internet usage and frequency of usage. Coovert and Goldstein (1980) found that Internals had more favorable attitudes toward computers than Externals did, which is consistent with this study’s findings of Internals earlier computer adoption and usage in goal-directed activities.

In addition, it was found that the Internals were more goal-directed in their online activities, in that they compared prices, researched and purchased products, and accessed financial information. However, Externals still reported using the Internet experientially, e.g., as a communications/e-mail tool and for having fun/exploring. This has implications for web designers who may want to attract Externals to their site.

Locus of Control Internet Attitudes

Internals were more likely to report that the Internet was convenient, and saved them time and money than Externals. In the CME, the shopping situation is in the consumer’s control. Therefore, when the online shopping experience is approached as a goal-directed activity, the perception may translate into a convenient, time- and money-saving experience. For the consumer with an External LOC, the opportunities for shopping to be an experiential or recreational activity are reduced since most retail websites are geared toward providing a more ‘goal-directed’ type of experience.

Future Research

As previously suggested, the results of this study serve as a foundation toward scientifically testing the usability of website designs and preferences that vary based on the consumer’s LOC. Other personality assessment measures should also be considered in further studies. Since it is much easier for the online consumer to “click” out of the virtual store than to leave the shopping mall, the pressure is on e-tailers to better understand their online customers.

 References  

Aspinwall, L. G., & Taylor, S. E. (1992).  Modeling cognitive adaptation:  A longitudinal investigation of the impact of individual differences and coping on college adjustment and performance.  Journal of Personality and Social Psychology, 63, 989-1003.

Bellman, S., Lohse, G. L., & Johnson, E. J. (1999).  Predictors of online buying behavior.  Communications of the ACM, 42, 32-38.

Busseri, M. A., Lefcourt, H. M., & Kerton, R. R.  (1998).  Locus of control for consumer outcomes:  Predicting consumer behavior.  Journal of Applied Social Psychology, 28, 1067-1087.

Coovert, M. D., & Goldstein, M. (1980).  Locus of control as a predictor of users’ attitude toward computers.  Psychological Reports, 47, 1167-1173.

Dessart, W. C. A. M., & Kuylen, A. A. A. (1986).  The nature, extent, causes and consequences of problematic debt situations.  Journal of Consumer Policy, 9, 311-334.

Grant, E. X. (2002).  Report:  The state of U. S. online shopping.  E-Commerce Times [On-line]. Available: http://www.ecommercetimes.com/perl/story/16233.html

Hill, D. J., King, M. F., & Cohen, E. (1996).  The perceived utility of information presented via electronic decision aids:  A consumer perspective.  Journal of Consumer Policy, 19, 137-166.

Hoffman, D. L., & Novak, T. P. (1996).  A new marketing paradigm for electronic commerce.  The Information Society, Special Issue on Electronic Commerce, 13, (Jan-Mar), 43-54.  

Hoffman, D. L., Novak, T. P., & Schlosser, A. (2000).  Consumer control in online environments.  Unpublished manuscript, eLab, Owen Graduate School of Management, Vanderbilt University :  Nashville , TN.

Hollenbeck, J. R., Williams, C. R., & Klein, H. R. (1989).  An empirical examination of the antecedents of commitment to difficult goals.  Journal of Applied Psychology, 74, 18-23.

Howell, J. M., & Avolio, B. J. (1993).  Transformational leadership, transactional leadership, locus of control, and support for innovation:  Key predictors of consolidated-business-unit performance.  Journal of Applied Psychology, 2, 891-902.

Ingledew, D. K., Hardy, L., & Cooper, C. L. (1997).  Do resources bolster coping and does coping buffer stress?  An organizational study with longitudinal aspect and control for negative affectivity.  Journal of Occupational Health Psychology, 2, 118-133.

Lefcourt, H. M. (1982).  Locus of control:  Current trends in theory and research.  Hillsdale, NJ:  Lawrence Erlbaum.

Rotter, J. B. (1966).  Generalized expectancies for internal versus external control of reinforcement,  Psychological Monographs, 80, 1-28.

Rotter, J. B. (1992).  Some comments on the "Cognates of personal control".  Applied & Preventive Psychology, 1, 127-129. 

Skinner, E. A. (1996).  A guide to constructs of control.  Journal of Personality and Social Psychology, 71, 549-570.

Tabachnick, B. G., & Fidell, L. S. (1996).  Using multivariate statistics (3rd ed.).  New York:  HarperCollins.


Online Shopping for Office Supplies: Factors Impacting User Satisfaction

By Barbara Chaparro, Vanessa Pereira, and Shawn P. Padgett

OfficeMax and Staples have been touted as the top online office supply sites for small businesses (Konrad, 2001). Staples.com, in particular, has been acknowledged for its heavy focus on site usability; after a site renovation in mid-2000, it demonstrated increases in the number of repeat customers, reduced drop-off rates, increased traffic, and improved customer shopping experiences (CHI_India, 2000). Office supply sites are a challenge to web designers because of the huge volume of items available to the consumer (more than 30,000 items); the task of categorizing them into a clear, concise, and usable site can be overwhelming. In this study we asked users to work with three different office supply sites to see how they compared to one another.

We evaluated participants' user satisfaction, navigational efficiency, and general preference for three office supply sites - staples.com, officemax.com, and vikingop.com. Participants' search efficiency, or 'lostness', was measured by the number of pages traversed beyond the optimum number of pages to complete a task. This efficiency data was gathered by the tracking program Ergobrowser™.
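The 'lostness' measure described above can be sketched as a small function. This is a simple interpretation of the measure as stated in the text, not Ergobrowser's implementation:

```python
def lostness(pages_visited: int, optimal_pages: int) -> int:
    """Pages traversed beyond the optimal path for a task.
    0 means the participant took the shortest route."""
    return max(pages_visited - optimal_pages, 0)

# A participant who viewed 9 pages on a task with a 4-page
# optimal path traversed 5 extra pages:
print(lostness(9, 4))  # 5
```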

After completing the tasks with each site, participants answered the End-User Computing Satisfaction (EUCS) instrument (Doll, Xia, & Torkzadeh, 1994), which was adapted for web usage and consisted of 12 satisfaction questions using a 1-5 Likert scale. After completing the tasks with all three sites, participants ranked the sites in order of preference. A Pentium II based personal computer, with a 60 Hz, 96dpi 17" monitor with a resolution setting of 1024 x 768 pixels was used.

Method

Nine participants (5 male, 4 female) with an average age of 20.7 volunteered for the usability study. All participants were familiar with the web (67% reported using it daily, 33% at least a few times per week), but none had visited an online office product site before. Six participants reported buying from the Internet 1-5 times in the last year, one reported buying more than 30 times, and two reported never having bought from the Internet before. Participants were asked to find seven items on each site and place them in their shopping cart (the order of the items was randomized and the order of the sites was counterbalanced across participants).

TASK: 

Your boss would like you to purchase the following office supplies online. Find each item and place it into your shopping cart:

a. 12 yellow legal notepads

b. 1 electric pencil sharpener

c. 1 color and 1 black inkjet cartridge for an Epson 740 printer

d. 1 letter opener 

e. 4 black Permanent markers

f. 4 Post-It memo cubes

g. 6 rolls of Polaroid film

At the end of the tasks, the participants were asked to write down the total cost from the shopping cart. For the purposes of this study, participants were asked to find the items without using the site search engine. This was done to better analyze the efficiency and intuitiveness of the site structure.

Results

Figures 1-3 show the average satisfaction, lostness, and success scores for the three sites. Results from a one-way ANOVA revealed no significant difference in satisfaction across the sites [F(2,16) = .124, p > .05]. In addition, no site was significantly preferred over another [mean ranks: Staples 2.2, OfficeMax 1.8, Vikingop 2.0; Friedman χ2 (2, N = 9) = .89, p > .05]. Analysis of the navigational efficiency, or lostness, showed Staples.com and Vikingop.com to be superior to Officemax.com [F(2,14) = 7.06, p < .01] (see Figure 2). Interestingly, however, participants were more successful in finding the items within Officemax.com (92%) than within Staples.com (79%) and Vikingop.com (76%).
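The Friedman statistic used for the preference ranks follows the standard formula χ² = 12/(nk(k+1)) · ΣRj² − 3n(k+1), where n participants each rank the same k sites. A stdlib sketch with hypothetical rankings:

```python
def friedman_chi_square(rankings):
    """Friedman chi-square statistic.
    rankings: one list of ranks (1..k) per participant,
    each ranking the same k items."""
    n, k = len(rankings), len(rankings[0])
    # Column rank sums: total rank each item received across participants.
    rank_sums = [sum(row[j] for row in rankings) for j in range(k)]
    return (12 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

# Hypothetical: two participants each rank three sites 1 (best) to 3.
print(friedman_chi_square([[1, 2, 3], [1, 2, 3]]))  # 4.0
```

When participants disagree completely the rank sums equalize and the statistic falls toward zero, which is why the near-zero value reported above indicates no significant preference.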

Figure 1.  Reported Satisfaction of Office Supply Sites

 

Figure 2.  Efficiency of Office Supply Sites

Figure 3. Task Success with Office Supply Sites
 

Discussion

It is important to note that none of the users had significant problems with any of the sites, and task success was fairly high. The items most difficult to find on all sites were the Polaroid film and the letter opener. In fact, at the time of this test, the only letter opener available on OfficeMax was an industrial, mailroom letter opener for more than $1000. (As a result, the efficiency and success data above were analyzed using only the remaining 6 tasks.) A separate card sorting exercise of 208 office products showed that users expected film to be under a general "Camera Supplies" category and a letter opener to be under "General Office Supplies." Only Vikingop.com categorized the letter opener this way, while Staples.com placed it under "Scissors, Rulers, and Trimmers." Both Vikingop.com and Staples.com placed the Polaroid film three levels deep under "Presentation" materials (which stumped several users), while OfficeMax.com categorized it under "Technology."

While our results showed that no site was really preferred over another (for the items tested), the following were noted as important features that contributed to participants' overall satisfaction with a site:

Preset Quantities - Staples.com offered preset quantities for many items, which saved users one step when ordering a single unit of an item. In addition to increasing efficiency, this also seemed to make users more aware of the quantity being ordered. On Vikingop.com, several users ordered 144 yellow legal pads rather than 12, not noticing that they were packaged in 'dozen' units.

Pictures of Items - finding the yellow legal pads was much easier when viewing the item thumbnail pictures than by reading the item descriptions. OfficeMax made the thumbnails readily apparent. Staples.com and Vikingop.com required users to click on a View Picture/Image button or link to see them. Interestingly, very few users took advantage of this option - they did not notice the button or link and searched through the item descriptions instead (most commented later that pictures would have made it easier). This is one example of weighing the pros and cons of optimal page design. With no images, a page downloads faster; however, if the user cannot quickly find the desired item, time (and perhaps a sale) is sacrificed.   

Visual Feedback when Adding Items to Shopping Cart - Staples.com was the only site of the three that did not navigate the user to a shopping cart page when an item was added. While this feature can enhance efficiency for the experienced user, it proved problematic for the novice user. First-time users did not notice that the shopping cart summary on the right of the page was updated when an item was added. As a result, multiples of the same item were added to the cart until the user either explicitly went to the cart to see its contents or eventually noticed the cart summary. In fact, several users did not realize how to view the cart contents until they were forced to find it to report the final cart total at the end of all tasks.

Grouping 'Like' Items under Different Categories - finding both of the Epson inkjet cartridges posed a challenge to the users of Vikingop.com, which categorized the black cartridge under "Epson Ink-jet Cartridges I" and the color cartridge under "Epson Ink-jet Cartridges II"! Participants had to guess which category might contain the cartridge they desired. (Since this test, Vikingop.com has changed the path to the cartridges to be more intuitive.)

Special Feature Items - occasionally, some items appeared on a page as a special deal or "Featured Item." It was surprising to us that most users did not select a Featured Item and instead navigated further into the site to choose another. In the case of Vikingop.com, a significant portion of the home page was dedicated to advertisements for special offers and deals. None of the users we tested ever showed any interest in even looking at these ads for their desired item, thus showing some support for "banner blindness" (Benway, 1998).

Figure 4. Vikingop.com home page.

In summary, participants were more efficient with Staples.com and Vikingop.com but were generally more successful in finding the prescribed items with OfficeMax.com. No single site was preferred or showed higher reported satisfaction over the others. Preset quantities, visible thumbnails of the products, and visual feedback when adding to the shopping cart were all cited as factors enhancing overall site satisfaction. Non-intuitive categorization of items (letter opener, film) caused some user confusion in all three sites. Designers are encouraged to use activities, such as card sorting, to determine users' categorization of items so that the interface can match users' expectations.

References

Benway, J.P. (1998). Banner blindness: the irony of attention grabbing on the world wide web. Proceedings of the Human Factors and Ergonomics Society 42nd Annual Meeting, USA, 1, 463-467.

Chi_India (2000). The Challenge: Staples.com. Retrieved 7/13/02: http://www.indiachi.com/case_studies.htm

Doll, W. J., Xia, W., & Torkzadeh, G. (1994). A confirmatory factor analysis of the end-user computing satisfaction instrument. MIS Quarterly, December, 453-461.

Ergobrowser, Ergosoft Laboratories © 2001.

Konrad, R. (2001). OfficeMax, Staples top list of e-tail survey. CNET News.com. Retrieved 7/13/02: http://news.com.com/2100-1017-254741.html?legacy=cnet


Are You in Good Hands With Allstate?: A Comparison of Three
Auto-Insurance Websites
 

  By Ryan Baker & Candace Gilmore

Purchasing insurance is a big decision. Comparing prices, finding the lowest rates, and deciding which company will provide the best service can all lead to frustration. Recently, several auto-insurance providers have begun advertising campaigns touting the ease of their website services. We decided to perform a usability test to compare three of the major auto-insurance sites to see which was most preferred by first time site users. 

In a comparison of ninety websites, Allstate.com was ranked the #1 insurance website by eMarketer, a leading provider of Internet statistics (eMarketer, 2001). According to eMarketer, the success of Allstate.com was due to “friendly graphics” and language that is easy to understand, as well as easy claim reporting and the option to pay online.

Madison Consulting Group reports that in a study of twelve insurance websites, Allstate.com and Progressive.com were ranked 1st and 3rd, respectively, for offering the best customer experiences (Madison, 2000).

We evaluated participants' user satisfaction, navigational efficiency, and general preference for three car insurance sites – Geico.com, Progressive.com, and Allstate.com. Participants' search efficiency, or 'lostness', was measured by the number of pages traversed beyond the optimum number of pages needed to complete a task. This efficiency data was gathered by the tracking program Ergobrowser™. 
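
The lostness measure described above reduces to a simple excess-page count. The sketch below illustrates the idea; the function name and the example page counts are hypothetical, not taken from the study:

```python
def lostness(pages_visited, optimal_pages):
    """Navigational inefficiency: pages traversed beyond the optimal path.

    A score of 0 means the participant followed the shortest route to the
    goal; larger values indicate more 'lostness'.
    """
    return max(0, pages_visited - optimal_pages)

# Hypothetical example: a task solvable in 4 pages, completed in 11.
print(lostness(11, 4))  # -> 7
```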

Figure 1. Allstate.com homepage
 

Figure 2. Progressive.com homepage
 

Figure 3. Geico.com homepage
 

Method

Ten participants volunteered for this study (7 female, 3 male). Ages ranged from 18 to 48 with a mean age of 24 (S.D. = 9 years). All participants were familiar with the Web, with 70% using the web 7-14 hours per week or more, but were not frequent users of online insurance sites. Participants were asked to complete four tasks on each site (site order and task presentation were counterbalanced across all participants):

1.  You are interested in purchasing insurance for your car, and are specifically wondering what types of coverage and benefits are available.

2.  Your 15-year-old daughter will be driving soon, and you plan on adding her to your existing coverage.  You want to know what effect this will have on your rates, and whether any discounts are available.

3.  You have already purchased insurance and are wondering if you have a local agent, and if so, who it is.

4.  Before purchasing insurance, you are concerned about what factors might influence your rates (by either increasing or decreasing them).
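
Counterbalancing site order across participants can be sketched by cycling through all orderings of the three sites. The scheme below is a generic illustration, not the study's actual assignment (with 10 participants, a fully balanced rotation of the six orders would require a multiple of six):

```python
from itertools import permutations

sites = ["Geico.com", "Progressive.com", "Allstate.com"]

# All 6 possible presentation orders of the three sites.
orders = list(permutations(sites))

# Cycle through the orders so they are spread as evenly as possible
# across the 10 participants.
assignments = [orders[i % len(orders)] for i in range(10)]

for participant, order in enumerate(assignments, start=1):
    print(participant, order)
```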

After completing the tasks with each site, participants answered the End-User Computing Satisfaction (EUCS) instrument (Doll, Xia, & Torkzadeh, 1994), which was adapted for web usage and consisted of 12 satisfaction questions using a 1-5 Likert scale. After completing the tasks with all three sites, participants ranked the sites in order of preference. A Pentium II-based PC with a 17" monitor (60 Hz, 96 dpi) set to a resolution of 1024 x 768 pixels was used.

Results

Figure 4 shows user preference for the three auto insurance websites. Seven users chose Allstate.com as their number one preference. No users selected Progressive.com as their first choice, and three users chose Geico.com as their highest preference.

Figure 4.  Participant preference for Auto Insurance Sites (# participants choosing site as 1st choice) 
 

Figure 5 shows the average satisfaction scores across all three sites (max satisfaction possible = 60). Results from a one-way ANOVA revealed no significant differences in satisfaction across the three sites [F(2, 18) = 1.62, p = .226; Allstate.com (mean = 43.2), Progressive.com (mean = 34.4), Geico.com (mean = 36.2)].
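
With 10 participants each rating 3 sites, a within-subjects design yields the reported degrees of freedom, F(2, 18). A minimal repeated-measures ANOVA can be sketched as below; the satisfaction scores are randomly generated stand-ins, not the study's data:

```python
import numpy as np

def rm_anova(data):
    """One-way repeated-measures ANOVA.

    data: array of shape (subjects, conditions).
    Returns (F, df_conditions, df_error).
    """
    n, k = data.shape
    grand = data.mean()
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()   # between conditions
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_total = ((data - grand) ** 2).sum()
    ss_error = ss_total - ss_cond - ss_subj                  # residual
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    F = (ss_cond / df_cond) / (ss_error / df_error)
    return F, df_cond, df_error

# Hypothetical EUCS totals for 10 participants on 3 sites (max 60 each).
rng = np.random.default_rng(0)
scores = rng.integers(20, 60, size=(10, 3)).astype(float)
F, df1, df2 = rm_anova(scores)
print(f"F({df1}, {df2}) = {F:.2f}")  # degrees of freedom match the paper: (2, 18)
```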

Figure 5.  Reported Satisfaction of Auto Insurance Sites

Figure 6 shows participants' navigational efficiency (lostness) across all three sites. Analysis of lostness showed that there were no significant differences between the three sites [F(2, 18) = 2.299, p = .129]. The trend, however, favored Geico.com. 

Figure 6.  Efficiency of Auto Insurance Sites

Figure 7 shows participants' successful task completion across all three sites. Participants completed the tasks successfully 83% of the time with Allstate.com, 77% of the time with Progressive.com, and 75% of the time with Geico.com.

Figure 7. Successful Task Completion of Auto Insurance Sites 

 
As can be seen in Figure 7, participant success was not particularly different across the sites, and lostness was relatively equal (and somewhat high) across all three sites. Nor did participants find any one site significantly more satisfying than the others. However, participants did report preferring Allstate.com over either Progressive.com or Geico.com. So, although participants showed no real differences in performance across the three sites, their preference was for Allstate.com.

Discussion

Our study found that users preferred Allstate.com to either Geico.com or Progressive.com. Users were marginally more satisfied and were able to successfully complete more tasks with Allstate.com than with the other sites. However, users also traversed more pages beyond the optimal path with Allstate.com than with the other sites. This seems to indicate that, although the information took more work to find on Allstate.com, users were willing to take the extra steps and still perceived the information as easy to find. Headings that made clear exactly where users were going were cited as reasons for preferring Allstate.com. Unclear labels, small fonts, and information that was too generalized were criticisms of the less preferred Geico.com and Progressive.com.

References

Doll, W. J., Xia, W., & Torkzadeh, G. (1994). A confirmatory factor analysis of the end-user computing satisfaction instrument. MIS Quarterly, December, 453-461.

eMarketer. (2001). eMarketer names Allstate.com best Insurance Website. Retrieved 7/13/02: http://www.mediaserv.com/news-story.asp?year=2001&nid=33.

Ergobrowser, Ergosoft Laboratories © 2001.

Madison Consulting Group. (2000). Property and Casualty Industry E-Competitive Analysis. Retrieved 7/13/02: http://madisoncg.com/pdf/analysis.pdf.


Top Ten Mistakes of Shopping Cart Design

By Barbara S. Chaparro1

Ahhh, shopping. Imagine strolling down the aisle of your local store with your shopping cart. You find an item that you want but before you can put it in your cart you have to give your name and other personal information to a store employee! Then, you want to put an item back on the shelf but every time you try, it appears back in your cart! Then you realize that to really get rid of it you have to state “OK, I want zero of this item!”

Chances are if any of these things happened to you in an actual store you would quickly leave your cart behind.  However, these are only a few of the things shoppers must face to purchase online. Maybe we shouldn’t be surprised that 60 - 75% of shopping carts are abandoned in e-commerce sites (Thumlert, 2001; Gordon, 2000). Have web designers forgotten that the purpose of using the terminology ‘shopping cart’ is so that users assimilate the behavior of a ‘real’ shopping cart to a ‘virtual’ one?

In our Software Usability Research Lab (www.usabilitynews.org), we have examined the usability of many shopping web sites and are always surprised by the inconvenience of ‘convenient’ shopping.  Typically when people shop in a store, they are rarely aware of their shopping cart, unless it has a squeaky wheel or is hard to steer. If this happens, at least they can trade it in for one that runs smoothly. Online users, unfortunately, do not have that choice.

The Shopping Process

Table 1 shows the process you may encounter shopping for two items in a brick-and-mortar store versus a typical e-commerce site.

Table 1.  A Comparison of Traditional and Online Shopping

TRADITIONAL SHOPPING                 ONLINE SHOPPING

1. Find Item #1                      1. Find Item #1
2. Place Item #1 in shopping cart    2. Add Item #1 to Shopping Cart
3. Find Item #2                      3. View Shopping Cart
4. Place Item #2 in shopping cart    4. Find Item #2
5. Check-out                         5. Add Item #2 to Shopping Cart
6. Pay with cash or credit           6. View Shopping Cart
7. Leave store                       7. Check-out
                                     8. Create an account (enter name, email)
                                     9. Enter Shipping Address
                                     10. Enter Billing Address
                                     11. Choose Shipping Method
                                     12. Enter credit card info
                                     13. Review order & final price

 
It is not surprising, given the nature of online shopping, that extra steps may be necessary during the payment process. (Of course, adding the use of a gift certificate, special offer, or shipping to multiple addresses only complicates the online experience even more.) After all, the convenience of not having to get into the car and drive to a store is worth a few extra clicks and keystrokes, right? Popular dot-com companies (e.g., amazon.com) are continuously trying to streamline the buying process by offering predefined accounts and one-click buying. 

However, the process before buying – shopping, browsing, and working with the shopping cart – is in many ways more critical to a site’s success. Users frustrated with online shopping will never even get to the point of online buying. In our usability studies, we have observed many shopping features that impact user performance and satisfaction. The following is a list of Top Ten Mistakes of Shopping Cart Design that we have compiled.

Top Ten Mistakes of Shopping Cart Design

1.    Calling a Shopping Cart anything but a Shopping Cart. Calling a shopping cart anything other than a shopping cart only causes confusion. Users are accustomed to the cart terminology, and while certain domains may find it ‘cute’ to use a term specific to their product line (e.g., bookbag, order, basket), it is best to maintain consistency and stick with the ‘cart.’ Adding a graphic of a shopping cart also helps quick access.

 

add to cart

 

2.    Requiring users to click a “BUY” button to add an item to the shopping cart. Adding items to the shopping cart should be effortless and noncommittal. After all, the user is putting items into the cart for possible future purchase. When users are required to click a BUY button to add an item to the cart, it is often unsettling since they are not necessarily ready to buy the item at this point – they just want to place it in the shopping cart. Buying is the final step in the shopping experience, and it should not be presumed that adding an item to the cart is a commitment to buy. Users in our studies are very hesitant to click the BUY button and search for an Add to Cart button on the page instead.

 

olympus camedia zoom digital camera

  What if a user is not yet ready to buy an item – how does he or she add it to the cart?
 

3.    Giving little to no visual feedback that an item has been added to the cart. Some sites do not automatically take users to the shopping cart page when an item is added. This allows them to continue shopping without interruption. Generally, these sites have a shopping cart indicator somewhere on each page that updates and summarizes the cart content. A problem with this method, however, occurs when the visual feedback of the change to the cart’s content is too subtle or nonexistent, or is not in the users’ current browser view. In such cases, users do not believe anything has been added to the cart. As a result, they click on the Add to Cart button again and add the item a second time (and maybe again for a third time). Users end up having to go to the shopping cart page anyway just to see if the item has been added. Oftentimes, they are surprised with multiple quantities of the same item. 

4.    Forcing the user to view the Shopping Cart every time an item is placed there. As long as there is adequate visual feedback of the cart’s content, there is really no need to take the user to the shopping cart page every time an item is added. In fact, it is disruptive for multi-item shoppers, requires extra mouse clicks to continue shopping, and potentially limits how many items a person buys (they may be more inclined to check out if they are already at the shopping cart page).

visual feedback of shopping cart contents

 Visual feedback is very important when adding an item to the cart.
 

5.    Asking the user to buy other related items before adding an item to the cart. This is the online equivalent of “do you want fries with your order?” and is not only irritating to users but also disorienting. After clicking a button or link to add an item to the cart, users expect some kind of feedback that the item has been added. Asking them to make a decision about other items makes them second-guess whether they actually pressed the correct button or link to add the desired item, or it aggravates them by soliciting items they do not want. A better approach is to place related items (e.g., batteries) on the item page or on the shopping cart page so users have the option to purchase them before checkout. Placing the control with the users makes them more willing to purchase.

related items

 Users are forced to accept or reject ‘related’ items before adding a desired
item to the shopping cart.
 

6.    Requiring a user to REGISTER before adding an item to the cart. Some sites we have tested require a user to register with personal information before an item can even be placed into the cart! This is a turn-off to users who may be browsing or comparison-shopping. They may or may not purchase the items, but they definitely do not want to commit personal information just to fill the shopping cart and will leave the site because of it.

7.    Requiring a user to change the quantity to zero to remove an item from the cart. Updating the shopping cart’s content can be tricky to program but should be seamless to the user. Many sites still require a user to enter ‘0’ in the quantity field and click an Update button or link to delete the item. Use of a Remove or Delete button next to an item is a far more intuitive way to achieve this.
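
The difference between the two removal styles can be sketched with a toy cart model. The class and method names below are illustrative only, not from any real e-commerce framework; the point is that a direct Remove control and a quantity update should both leave the cart in the intuitive state:

```python
class ShoppingCart:
    """Toy cart illustrating intuitive item removal."""

    def __init__(self):
        self.items = {}  # item name -> quantity

    def add(self, name, quantity=1):
        self.items[name] = self.items.get(name, 0) + quantity

    def remove(self, name):
        # The intuitive path: a Remove/Delete control per line item.
        self.items.pop(name, None)

    def set_quantity(self, name, quantity):
        # The 'update quantity to zero' path still works, but it should
        # never be the ONLY way to delete an item.
        if quantity <= 0:
            self.remove(name)
        else:
            self.items[name] = quantity

cart = ShoppingCart()
cart.add("stapler")
cart.add("letter opener", 2)
cart.remove("stapler")                  # one click, no instructions needed
cart.set_quantity("letter opener", 0)   # zero-quantity update also removes
print(cart.items)  # -> {}
```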

adjusting quantity of items

“To remove or delete items, change the quantity to zero.” Huh? Why not just delete it?
 

8.    Requiring written instructions to update the items in the cart. Requiring users to read instructions on how to update the shopping cart is, in itself, a sign of poor design. First of all, users do not read such instructions. Second, if instructions are required, then the shopping cart interface design must not be intuitive. Users should be able to figure out how to remove or change the number of items desired from viewing the cart itself.

9.    Requiring a user to scroll to find an Update cart button. Most carts offer an Update button or link to update changes made to the shopping cart (such as quantity). This function should be located such that it is always visible and clearly distinct from the rest of the shopping cart, regardless of the number of items in the cart.

update cart / update quantities

The Update Cart link (left) may be less evident than the Update Quantities button (right).
 

10.  Requiring a user to enter shipping, billing, and all personal information before knowing the final costs including shipping and tax. Shipping costs and taxes (if applicable) are a big factor in whether or not users complete their online orders. Users cannot assess whether their purchase is truly a ‘deal’ or not until they have the final cost. Many sites require users to enter all shipping, billing, and credit card information before a final cost is provided. Access to shipping rates and tax from the shopping cart or item pages (before the user ventures down the purchasing path) is critical.
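
Surfacing an estimated total from the cart page, before any personal information is collected, can be sketched as below. The flat shipping rate, free-shipping threshold, and tax rate are placeholders, not real carrier or state figures; a real site would look them up from the carrier and the buyer's region:

```python
def estimated_total(subtotal, shipping_flat=5.95, tax_rate=0.065):
    """Estimate the final cost from the cart page.

    shipping_flat and tax_rate are hypothetical placeholders for
    illustration only.
    """
    shipping = 0.0 if subtotal >= 50 else shipping_flat  # e.g., free over $50
    tax = subtotal * tax_rate
    return round(subtotal + shipping + tax, 2)

print(estimated_total(24.99))  # item subtotal plus estimated shipping and tax
```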

estimate of shipping and tax

Users prefer to know shipping and tax costs before filling out final payment information.
 

Conclusion

If an e-commerce site is to succeed, designers must consider the usability of the entire shopping experience for its users. Probably the most critical part of this process is the shopping – finding items, adding them to the cart, understanding total costs – and not the buying. Studies show that 51% of online shoppers state that they shop online and purchase offline (NPD Group, 2001). In this article we identify ten mistakes in shopping cart design that we have seen impact a user’s willingness to purchase. While these design flaws are not the sole reason why users abandon their carts, fixing them can only improve users’ willingness to stay online to purchase.

1 Note: Reprinted from internetworking 4.1
http://www.internettg.org/newsletter/dec01/article_chaparro.html

References

Gordon, S. (2000). Shoppers of the Web Unite: User Experience and Ecommerce, March 3, 2000. Retrieved 7/13/02: http://www.zdnet.com/devhead/stories/articles/0,4413,2448211,00.html

NPD Group, Inc. (2001). NPD e-visory report shows offline sales benefit from online browsing. bLINK Magazine.

Thumlert, K. (2001). Abandoned Shopping Carts: Enigma or Sloppy E-Commerce? June 27, 2001. Retrieved 7/13/02: http://ecommerce.internet.com/news/insights/trends/article/0,3371,10417_792581,00.html


Using Technology to Foster a Student-Centered Classroom

 By Charles G. Halcomb1 & Nicole L. Rogers

Walk into any traditional classroom and you will see that the physical arrangement of the room is designed such that the focus of the classroom experience is directed toward the instructor. The chairs all face the lectern, where the instructor stands, often in front of a black or whiteboard. In this article, we propose that this traditional model of education, which has served us well since the time of the Greek philosophers, is no longer the most appropriate model for effective learning. 

One way to quickly assess whether any class is instructor-centered or student-centered is to observe student behavior when the instructor is late. In the instructor-centered class, the class passively sits and waits for the instructor. On the other hand, in the student-centered class students come into the room and start working and although they may check to see if the instructor is available to answer questions, the class work starts with or without the presence of the instructor.

In the days before the invention of the printing press, it was necessary for an instructor to be the primary source of information for students, and the instructor-centered model of education worked well. Later, as democratic ideas fostered the belief that education should be freely available to everyone, the instructor-centered approach provided a means for instructors to work with large numbers of students who came together in the classroom to be ‘taught.’ Again, since this allowed much greater access to education, it was a reasonable approach. Today, however, the advent of modern computer technology makes it possible to restructure the learning environment and work with large numbers of people while shifting the focus away from teaching and directing attention to helping students learn. Accordingly, education can be delivered to a large number of students without sacrificing the one-on-one qualities afforded by a Socratic approach.

One does not have to look far to find evidence that there is a need for change. The instructor-centered nature of the traditional classroom has fostered an environment in which responsibility for learning has been shifted from the student to the instructor. By this logic, if a student is not performing well, it is not because he or she needs more effective study time. A review of comments made by students enrolled in the undergraduate behavioral statistics course at Wichita State University suggests a fairly general attitude that if one attends class, takes notes as the instructor lectures, and takes the exams, one will succeed. Moreover, if or when students do not succeed, it is somehow the fault of the instructor. At a recent graduation exercise, a member of the Kansas Board of Regents speaking to the graduates noted that with graduation they would see many changes in their lives. In jest, he noted that, “among these changes, they would find that there would now be an extra 15 to 30 minutes more each week for other activities, since this time would no longer be needed for study.” Unfortunately, this is far too close to the truth. In a recent study, Taraban, Maki, and Rynearson (2000) report that students self-report spending approximately 15 minutes per day in course-related study behavior. The authors note that when students were asked to contrast their study behavior with that of other students, they responded that they thought other students study about the same amount of time that they do, while they believed that the ‘ideal’ student would probably spend somewhere closer to 20 minutes per day in study outside of class. This is disturbing not only because these results suggest that students see this as the norm, but also because they maintain the belief that this is an adequate commitment of time for learning college-level material. Moreover, when students were asked to record time spent learning course-related material, actual study time was clustered in the 24 hours preceding examinations.

The numbers reported by Taraban et al. (2000) do not seem inconsistent with our observations, nor with the informal self-reports of our students. A review of student evaluations taken over the past ten years reveals that almost without exception students rate the behavioral statistics course as one of the most demanding courses they have experienced. Yet our work is designed to allow for successful completion if students follow the age-old formula of spending two hours in study for each hour that they spend in class. We carefully tell students that for a three-hour course like statistics this translates to around one hour per day. Clearly, any student who spends anything close to that amount of time in active learning activities relevant to the course content will be successful. In an article appearing in The Wichita Eagle (June, 2002) about a new policy of reducing the privileges students receive after their fourth year as a student, University of Georgia Provost Karen Holbrook was quoted as saying, “. . . a recent survey found students average 13 hours a week in class and six more hours studying.” Therefore, the average University of Georgia student reports spending approximately 4.5 minutes each day outside of class studying. Given the similarity of these findings to those of Taraban et al. (2000), there can be little doubt that whatever we are doing, we need to do much more to motivate students to become actively engaged in their own education.

Technology gives us the potential to embrace a student-centered model of education that clearly focuses on the student, but that would not limit a student’s access to the didactic process or push the cost of education so high as to make it inaccessible. Since learning is both a personal and an active process, the individual student must become the primary agent responsible for implementing it. Making education more student- and learning-centered is going to require a redefinition of the roles of both the student and the instructor.

Student-Centered Classroom Using Technology

As students become the center of classroom activity, the role of the instructor changes: he or she becomes a resource for students, a catalyst to help each student find the right combination of activities to promote his or her individual learning. The work that students do in the classroom becomes very similar to the study they might do outside of it. The time spent in class becomes more directly aimed at the individual needs of each student and thus more likely to motivate them to work outside of the classroom. As the process becomes more student-centric, the role of the instructor becomes that of facilitator. These new roles help both parties understand that the instructor can lend a hand as the student learns, but is not responsible for the student’s learning or failure to learn. 

During class the instructor is present to answer questions, help the individual student, and guide the students as they become engaged in the learning process. Technology enables the instructor to extend this role beyond the classroom. Using e-mail, discussion forums, and even virtual classrooms2, which allow mediated Q&A sessions, students are encouraged to engage in collaborative learning. Thus technology gives the student access to the faculty virtually 24 hours a day. 

Even though the lecture is no longer the focus of the student-centered classroom, we have seen that there are some students who need a formal didactic presentation of the material to help them master complex concepts. Again, technology makes it possible for the student to access lectures, demonstrations, or discussions when needed rather than when dictated by a single schedule. These materials can be digitally stored and delivered on demand via the internet, CD-ROM, or DVD. The lecture may be the same as the one students would have heard in the traditional classroom setting, but it can now be delivered when the student wants it. Furthermore, it allows students to tailor the presentation to meet their own individual needs. Using the pause and rewind controls, they can stop or back up the lecture while they take notes or look up reference material. This is something we just can’t do when we are lecturing live in the classroom. Technology facilitates this student-centered delivery of even the formal lecture and provides the student the opportunity, via e-mail or discussion forums, to ask questions and/or collaborate with other students in developing an understanding of the course content. For many students, being able to ask and receive answers to their questions without having to speak in front of a large group of people is much easier.

Exploring Student-centered vs. Instructor-centered

To test these ideas and confirm that the change to a more student-centered classroom environment would produce the positive results we expected, we initiated a series of case studies examining the attitude, behavior, and achievement of students given the opportunity to learn in this very different class environment – one in which each student worked on his or her own with access to a personal computer located on the table at which he or she was seated. We also ran a parallel traditional class to provide a reference by which to evaluate our outcomes. The traditional class met in a classroom with desks lined up facing the front, from which the instructor lectured. It would have been our preference to design true experiments to evaluate these questions, but the realities of the academic environment and the needs of the students precluded the level of control required for formal experimental evaluation. The case study approach provided a format for examining some of the potential consequences that might be expected to accompany the shift from an instructor-centered to a student-centered classroom.

Method

The Course

An undergraduate statistics course required of all psychology majors served as the focal point for this study. We chose this course because the content is perceived by students to be difficult, abstract, and generally unfamiliar. In addition, since statistics is a required course but very unlike most other courses in the psychology curriculum, we thought it was reasonable to assume that for the majority of students the strongest motivation for enrolling was probably the need to meet the curricular requirement. This is also a course that many students struggle with and we thought it might be more sensitive to changes in the course format. But most of all, we chose the course because it is one we teach.

The Students

A total of 81 students enrolled in one of two sections of undergraduate statistics during the fall 2001 and spring 2002 semesters. Table 1 displays the distribution of students among the four sections and details the number who received grades in the course. Table 2 displays the number (percent) of students who dropped the course by class format.

Table 1.  Final Enrollment by Semester and Class Format

               Traditional    Student-Directed
Spring 2001         10                8
Fall 2001           19               12


Table 2.  Class Withdrawal (%) by Semester and Class Format

               Traditional    Student-Directed
Spring 2001      9 (47%)          5 (38%)
Fall 2001       12 (38%)          6 (33%)
 
Results

Using the data for all students who enrolled in the course, a chi-square test was performed to determine whether the number of drops differed between the two class formats (χ2 (1) = .34, p = .56). This non-significant finding was somewhat surprising, as we had formed the impression that students were more likely to drop the traditional-format class than the student-centered one. We were particularly interested in this outcome because of the high rate of drops among students enrolling in the statistics course. These data clearly suggest that dropping the class is a fairly common method of dealing with poor performance.
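
The reported value can be reproduced by reconstructing the drop/complete contingency table from Tables 1 and 2 (treating final enrollment as the completers, both semesters combined). A sketch using only the standard library, with no continuity correction:

```python
from math import erfc, sqrt

# Rows: class format; columns: dropped, completed.
# Counts reconstructed from Tables 1 and 2 (both semesters combined).
table = [[9 + 12, 10 + 19],   # Traditional: 21 dropped, 29 completed
         [5 + 6,  8 + 12]]    # Student-Directed: 11 dropped, 20 completed

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Pearson chi-square (no continuity correction); df = 1 for a 2x2 table.
chi2 = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2)
    for j in range(2)
)
p = erfc(sqrt(chi2 / 2))  # exact upper-tail probability for df = 1
print(f"chi2(1) = {chi2:.2f}, p = {p:.2f}")  # -> chi2(1) = 0.34, p = 0.56
```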

A final chi-square analysis took the form of Fisher’s Exact Test of Probability because of the large number of cells containing small counts. This test was designed to determine whether there was a relationship between class format and the grades assigned (see Table 3). For this analysis, we counted the number of students in each class format who earned each of the grade categories. The test did not provide any indication that such a relationship existed (χ2 (4) = 1.0, p = .95).

Table 3.  Frequency of Grades Earned by Class Format

                     A     B     C     D     F
Traditional          9    13     5     1     1
Student-Centered     7     7     4     1     1

 
The mean total points earned by students in the class with the traditional format was 231.86; the lower and upper 95% confidence limits were 197.71 and 266.02, respectively, and the standard deviation was 89.787. For the students in the student-centered group, the mean was 228.71, with a lower boundary of 182.22 and an upper boundary of 275.18. A look at these numbers suggests that the null hypothesis cannot be rejected. The difference between these means was 3.15, with lower and upper 95% confidence boundaries of -51.66 and +57.99. The only logical inference to be made is that no difference was found.
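
The reported confidence limits for the traditional group can be approximately reconstructed from the summary statistics, assuming n = 29 (the number of traditional-format completers in Table 3). The t critical value is hard-coded below (approximately 2.048 for df = 28) to avoid a stats-library dependency:

```python
from math import sqrt

# Summary statistics for the traditional-format completers.
mean, sd, n = 231.86, 89.787, 29
t_crit = 2.048  # two-tailed t critical value for alpha = .05, df = 28

margin = t_crit * sd / sqrt(n)
lower, upper = mean - margin, mean + margin
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")  # close to the reported 197.71 and 266.02
```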

Looking at these two standard deviations, it occurred to us that there might be a reliable difference between the variances of the two groups. An F ratio was computed, but once again it failed to reach a level even close to significance at the .05 level.

Discussion

The purpose of this study was to demonstrate the role that technology can play in facilitating a more student-centered model for the university-level classroom. When this model is followed, each student works at his or her own pace with the instructor alongside them as they work to master the sometimes complicated and abstract course concepts. The instructor is no longer the source of information and the purveyor of wisdom, but rather an ally in the war against ignorance. Using technology, we developed two distinct models of classroom instruction. One was a technology-enhanced version of the traditional lecture format; the other was a model in which each student moved through the material at his or her own pace, guided and helped by an instructor and teaching assistant who responded to the students, helping in whatever ways possible but not serving as the center of attention for all of the students in the classroom. The traditional class met in a traditional room with desks facing the front and the instructor as the center of attention. In the other class, the students worked at tables, each student having an IBM PC with appropriate software installed on the table and available for use. The instructor and teaching assistant moved about, responding to student interaction or providing encouragement and feedback as appropriate. On some days, the teaching assistant was present in the room while the instructor interacted with the students by being available for a private or public chat session in which a student could ask questions or request an explanation of some difficult or troubling concept. What we have reported here are the results of our effort to investigate what happens to student performance when these changes are implemented, and more specifically what differences can be observed. 

It is important to note that, with the exception of what happened in the classroom, students in these two class formats were treated alike. Technology can enhance learning regardless of the style of instruction used during the formal class period. What technology does for the student-centered format is make it possible to administer a class in this fashion without reducing enrollment. Provided with their own computer, a schedule of activity, and ample encouragement and support, students are effectively required to spend the class time actively engaged in learning.

The results reported here suggest that, however one wishes to look at these students' performance, there was no discernible difference between the two classes (at least not in terms of the objective measures we used). While we refer in this paper to the lecture format as the traditional format, we must emphasize that in many respects neither of these classes was traditional, since both made extensive use of the internet and other computer tools. In fact, every effort was made to ensure that the two classes were as similar as possible, with the exception of what happened during the scheduled class period. Both classes:

  • had access to recorded lectures synchronized with the PowerPoint presentations used in the lecture class. The lectures were actually recorded in a lecture class taught the previous semester.  
     

  • used the Blackboard online education support system (www.blackboard.com) to access a large quantity of outside reference material and to receive instructor feedback on their performance. This instructional management tool was important because it gave both classroom formats access to e-mail, discussion groups, computer testing, etc.
     

  • were provided with ‘practice quizzes’ which they could take as often as they wanted, with items selected randomly from the same item pool as the chapter quizzes and from which the in-class exams were drawn. While these quizzes earned the student no points, they afforded the student an opportunity to see the questions that would ultimately appear on the exams that did contribute toward the final grade. This, of course, pointed the student to the material the instructor wanted the students to master.
     

  • used the internet to receive assignments and chapter quizzes. All assignments and chapter quizzes were self-paced, except that there were deadlines after which work would not be accepted. These deadlines were the same for both classes and were synchronized with the lectures given in the traditional-lecture class. Chapter quizzes, unlike the ‘practice quizzes’, could be taken only once and did contribute points toward the students' final grade.
     

  • were encouraged to use e-mail to communicate with the instructor. Every effort was made to respond to the e-mail as quickly as possible. The students were also urged to communicate with one another via e-mail although this rarely happened.  
     

  • were invited to participate in ‘out-of-class’ virtual study sessions conducted with a chat facility embedded in the Blackboard system. It was explained that the teaching assistant worked late most evenings and would monitor the class chat facility and be available to answer any questions which might come up. This also was used far less than expected.  
     

  • took their major exams via computers located in the classroom where the student-centered classes met.

Both classes were computer facilitated; what differentiated them was what occurred in the classroom. Since we found no differences on any objective measure between the two classes, we are led to conclude that student performance is determined not by what the instructor does or does not do in the classroom, but by what the students do. It is now our belief that the format of classroom activity is less important in determining student performance than we originally thought. We had expected that the students in the student-centered format would perform better than the students in the traditional setting. This didn't happen, but we hasten to add that the students who worked more or less on their own also did not do any worse.

One bit of good news is that, in demonstrating that it may not matter whether or not there is a lecture, we support the notion that it may be possible to offer off-campus or distance-education classes, even for important and difficult courses, with confidence that the student is getting the same opportunity to learn as the resident student. Although this was not the focus of our study, it is an implication of our findings that should not be overlooked. Given economic and other pressures to provide more opportunities for students to obtain off-campus coursework, it could have important ramifications. Perhaps there is also room for some commentary on the fact that many schemes used to evaluate teaching focus on how students react to what the instructor does in the classroom. Since what the instructor does may not matter, it might be more important to evaluate the extent to which instructors are able to form a working relationship with students and influence the students' desire to learn. Understandably, this is difficult to evaluate, but the ability to get students to invest time in learning, and to communicate to them that they are important and that they can learn the material if they try, may be the single most important thing an instructor can do. It is our experience that when one moves to this type of classroom environment, the most time-consuming part of the task is making oneself available to students both in and outside of class. It is easy to deliver a well-prepared lecture. It is much more difficult to be ready to respond to whatever questions or concerns the students raise, collectively or individually, but we think that in this environment, knowledge of the content and a genuine concern for whether or not the student learns are the major ingredients of effective teaching.

We feel that the most important lesson gained from observing these two distinct classroom styles is that, regardless of what happens in the classroom, it is the student who determines whether or not learning occurs. We now believe that in many respects it matters little what the instructor's mode of delivery is, since it is the student who must learn and who will decide whether or not to engage in the educational enterprise. Beyond providing the student with the basic tools with which to learn, the most important strategy the instructor can adopt is to facilitate a change in attitude and give the student an incentive to participate actively in the learning process. It has been our experience that any student who takes up the gauntlet and accepts responsibility for his or her own learning will be successful. It is in meeting this challenge that we believe the student-centered approach will eventually prove most effective, if for no other reason than that it affords the student fewer scapegoats on which to blame his or her failures, thereby tilting the playing field in favor of students for whom learning is more important than merely obtaining a degree.

1. Charles G. Halcomb is a Full Professor in the Department of Psychology at Wichita State University.

2. The virtual classroom can take several different forms, ranging from informal electronic chat to more formal question-and-answer sessions, which can be conducted in a synchronous (real-time) or an asynchronous (bulletin board) format.

References

" University of Georgia Gives Slackers a Demotion" Wichita Eagle (KS), June 17, 2002, p. 2A Associated Press.

Taraban, R., Maki, W., & Rynearson, K. (1999). Measuring study time distributions: Implications for designing computer-based courses. Behavior Research Methods, Instruments, & Computers, 31, 263-269.
 


Bringing the Chat Room to the Classroom

   By Mark C. Russell & Charles G. Halcomb1

One of the more interesting, and perhaps underused, tools provided by the Blackboard online education support system (www.blackboard.com) is the “virtual classroom.” For those unfamiliar with Blackboard, it is a web-based software platform used by many institutions, including Wichita State University, to supplement traditional classes. Blackboard provides many tools and resources for both instructor and students, including ways to organize assignments, lecture notes, external web links, etc. The “virtual classroom” is essentially a chat application with access restricted to students enrolled in the class. Students can log in from any computer with internet access and a Java-enabled web browser.

[Figure: Virtual Classroom screenshot]

In the Spring 2002 semester, we used the virtual classroom on several occasions in a graduate-level Human Factors psychology course to discuss various topics related to human-computer interaction (HCI). The purpose of the discussions was to stimulate students to think about issues surrounding topics presented in the text, and to motivate them to read and think about material from the text and other sources. We were interested in exploring the communication patterns of the class, in terms of the amount of on- and off-topic participation, the role of a designated facilitator, and the role of the course instructor, and in contrasting these with the face-to-face discussion experience.

Table 1. Course Details

Course name: Software Psychology, a seminar course in the human factors doctoral program offered through the Psychology Department at Wichita State University. The instructor sometimes refers to the course as Introduction to Human-Computer Interaction.

The forum: Blackboard's virtual classroom chat software (provided by Tutornet.com, Inc.). Students logged in from remote locations or from computers located in the actual classroom.

The task: The class was instructed to discuss a specific topic related to human-computer interaction. It was not a formal debate; opinions were welcome, and evidence was not required but was encouraged.

Instructions: A student was designated as facilitator and asked to begin and guide the conversation. The instructor participated as well.

Method

We reviewed the transcripts of two class sessions (automatically archived by Blackboard) and recorded the following:

  • number of posts (individual segments of text submitted for others to see on the computer screen) made by each participant;
     

  • the type of each entry (on-topic, off-topic, etc.);
     

  • how the designated facilitator's participation compared with that of the course instructor.

In addition, we interviewed the students about their personal impressions of the virtual classroom, and compiled the results into a table outlining the perceived advantages and disadvantages of this classroom format.

Analyzing the sessions

In order to standardize the analysis, the following operational definitions were used:

(1) Contributory comments: any posting which appeared to be adding to the conversation at hand in some meaningful way, whether it be new information, opinion, or even encouragement or agreement with the postings of others.  Almost any 'on subject' comment fell into this category.

(2) Miscellaneous comments: the ‘aside’ bits of conversation that served no actual purpose, such as jokes, greetings, and other class business unrelated to the topic of interest. 

(3) Guidance comments: questions or statements which directed conversation in some way, posted either by the facilitator or the instructor.

Any posting of text by a student, regardless of length, was included in the count and categorized according to content.
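The counting procedure above (every post counted, then categorized by content) can be sketched as a simple tally. The transcript entries below are invented stand-ins for the archived Blackboard posts, with each category assigned by hand using definitions (1)-(3):

```python
from collections import Counter

# Invented mini-transcript: (author, category) pairs. The category for
# each post was assigned by hand using the operational definitions above.
posts = [
    ("student_a", "contributory"),
    ("student_b", "miscellaneous"),
    ("facilitator", "guidance"),
    ("student_a", "contributory"),
    ("instructor", "guidance"),
]

# Tally posts by category and by author, counting every post
# regardless of length, exactly as in the analysis.
by_category = Counter(category for _, category in posts)
by_author = Counter(author for author, _ in posts)

print(dict(by_category))  # {'contributory': 2, 'miscellaneous': 1, 'guidance': 2}
print(by_author["student_a"])  # 2
```

In practice the same two passes over the real transcripts would yield the per-participant and per-category counts summarized in Table 2.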

Facilitator vs. Instructor

In an effort to encourage student participation and to differentiate the virtual format from a more traditional instructor-led lecture, a student was designated as facilitator for each session. The facilitator was to suggest material to discuss and keep the overall discussion on track. The facilitator also added to the conversation, and invariably made miscellaneous comments, just like everyone else. However, the instructor also attempted to direct the conversation at various times.

It was difficult to determine precisely how many conversation 'threads' existed at any one time, and it was not clear to what extent the facilitator/instructor competition affected the conversation. Therefore, for our purposes here we confined ourselves to the numbers of posts and their general classifications.

Results

Table 2 shows a summary of the various types of posts made by the students, facilitator, and instructor during the two sessions examined. This rough analysis is not meant to compare the two sessions, but simply to give an idea of individual and group participation. Keep in mind that the only assignment for the students was to engage in a meaningful discussion of the assigned topic; there was no other specific task, and it was not a formal debate. Tables 3 and 4 show specific details for Sessions I and II.

Table 2. Summary of participation by session.

                                 Session I    Session II
  Total contributory posts          272           247
  Total miscellaneous posts          51           112
  Facilitator “guidance” posts        8            10
  Instructor “guidance” posts         9             7
  Total posts                       340           376
  Average number of posts            26            29
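As a sanity check, the category counts in Table 2 should sum to the reported total for each session, and they do:

```python
# Counts from Table 2: (contributory, miscellaneous,
# facilitator guidance, instructor guidance, reported total)
sessions = {
    "Session I":  (272, 51, 8, 9, 340),
    "Session II": (247, 112, 10, 7, 376),
}

for name, (contrib, misc, fac, inst, total) in sessions.items():
    # The four categories are exhaustive, so they must sum to the total
    assert contrib + misc + fac + inst == total, name
    print(f"{name}: {total} posts, {contrib / total:.0%} contributory")
```

The check also shows that contributory posts made up 80% of Session I but only 66% of Session II, consistent with the rise in miscellaneous comments noted below.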

Table 3. Session I Details

When: Near the beginning of the semester
Duration: 1 hour and 30 minutes
Participants: 11 graduate students and the course instructor
Topic: The role of models/theories in the practical world
Student participation: lowest number of posts, 3; highest, 68 (the facilitator)

Comments: One expectation of the virtual classroom environment was that student participation would increase, or at least become more equal. Virtual environments usually foster this equality in two ways: participants can submit posts simultaneously, and the author of each post can be kept anonymous. In this session we had the former but not the latter; students were identified on-screen by name. There was, however, no reason to keep the students anonymous, nor any reason to believe that participation would have changed had anonymous posting been permitted. More likely, participation was dictated by each student's comfort with the conversation topic and by other factors unrelated to the technology.
 

Table 4. Session II Details

When: Approximately six weeks after Session I
Duration: 1 hour and 18 minutes
Participants: 12 graduate students and the instructor
Topic: A comparison of “natural” versus computer languages
Student participation: lowest number of posts, 2; highest, 58 (the facilitator)

Comments: There are no striking differences between the two sessions, except for an increase in the number of miscellaneous or off-topic comments made by the students in Session II. This is not unnatural: as the semester progressed, the students undoubtedly became more comfortable, not only with each other but with the virtual format as well. These miscellaneous comments accounted for just under 30% of the total number of posts. Whether this is an inordinate amount of side-talk is, of course, a matter of opinion. But anyone who has stood in front of a classroom full of eager young minds and asked for comments, only to be confronted by a sullen silence broken only by the plaintive chirping of a lone cricket, might eagerly accept this necessary evil in exchange for increased class participation. Also, since each participant saw the other participants' posts as typed text on a computer screen, irrelevant comments could likely be ignored more easily than in a face-to-face discussion.

Student Impressions

At the end of the semester, students were interviewed about their impressions of the virtual class sessions as compared with the traditional face-to-face class sessions. Table 5 summarizes their comments in terms of the perceived advantages and disadvantages of this format.

Table 5. Student Perceptions of the Virtual Classroom

Advantages:

  • Allowed more participation.

  • Reading/typing time delay gives you time to compose your thoughts.

  • Class discussion was informal, comfortable.

  • Instructor participation was welcome.

  • International students participated more than in regular class, possibly understood better.

  • Not hard to stay on topic; side comments not perceived as excessive in number.

  • Able to type in comments as you think of them without waiting.

  • Time seemed to pass quickly; discussion more engaging than lecture.

Disadvantages:

  • Participation voluntary; 'quiet people' could still stay quiet.

  • “More” talk does not necessarily mean “better” talk.

  • So much text to read; you could easily get behind before posting a response.

  • A more formal moderator arrangement might work better and eliminate excess talking.

  • Instructor's comments and questions shifted conversation dramatically; instructor was answered instead of facilitator.

  • Misspellings can cause confusion.

  • Several conversations going at the same time can be confusing.

  • Lack of non-verbal information; could not detect emotional nuances; not always sure about jokes.
 
Instructor Impressions

The instructor of the course was also asked for his impressions of how the virtual discussions compared with the face-to-face discussions and how well, in his opinion, the format had worked. His impression was that the virtual discussions elicited a considerably higher volume of comments from a more representative cross-section of the class, and that for the most part the discussions stayed on target. He did comment that in the future it might be possible to structure the discussion so as to make the outcome even more consistent with its goals. He was interested in the extent to which his presence affected the outcome and suggested that in the future we examine how the instructor's role can best be formulated to achieve the best possible result.

Discussion

By no means can we presume to have provided definitive answers to the question of whether there is utility in using computer-mediated discussion groups in a classroom setting. However, it is our hope that you will be struck, as we were, by the generally positive reaction to the experience. The goal had been to increase participation by all students and to provide motivation to learn the course content throughout the semester without the need for an examination. The observations of the students and instructor, as well as our survey, suggest that there was indeed increased participation in the discussions.

We have noted that in our sessions there was a facilitator and that the instructor participated. While it remains for future studies to define and evaluate the impact of instructor participation, it is clear from this experience that, whatever its consequences, the instructor was able to participate without completely dominating the discussion, as so often happens when discussion is face to face. From the instructor's viewpoint, the ability to document the participation of each individual student, and to ask all of the students to review and evaluate what took place during the discussion, was an important and unexpected benefit of this format. Even if there were no other benefits, the instructor felt that the virtual sessions were successful.

We were all surprised by the students' participation in the process. Even those students who verbalized a dislike for the more 'impersonal' nature of the communication participated actively in the discussions. Another major advantage of the procedure was that students could be asked to review the archives following each session and submit a summary or evaluation of the discussion as an individual project. This ability to 'replay' the discussion and reflect on or add comments after the fact was viewed by both students and instructor as a positive outcome. Even the simple act of summarizing the threads of conversation served to reinforce the positive aspects of the experience.

In summary, this single classroom activity produced two behavioral products: (1) the virtual group discussion itself, complete with increased motivation for individual participation and the benefits of collective group insight on the discussion topic; and (2) the opportunity for each student to go back, summarize, and reflect on the experience, which extended the benefits of the activity beyond the classroom and served as a catalyst for many students in consolidating the ideas and thoughts that had emerged during the discussion. It seems likely that, with an awareness of this dual capability, future sessions can be formulated to take better advantage of the group collaboration, with the full understanding that an individual response will also be required.

Although these virtual discussions took place in the context of an 'on-campus' course, where the students met face to face and interacted with the instructor in an actual classroom, the virtual classroom format described in this paper could also be developed to provide an enriched interactive opportunity for students taking the course at a distance. This becomes more important as economic and other factors conspire to make the off-campus course more and more necessary for many seeking to further their education.

There are, of course, questions that still beg for answers. For example:

  • Do the multiple threads of conversation, commented on by a number of the students, actually distract, or are they just an annoyance that will, with time, fade from awareness?

  • How can the sessions best be structured to take advantage of the archiving feature to extend the activity beyond the group discussion to an individual learning effort?    

  • To what extent should the instructor participate?   

  • In what way does participation by the instructor add or detract from the educational benefit of the experience? 

These are all questions that a more formal examination of this virtual classroom should help answer. The internet is often billed as the ‘information highway’; let us find ways to use tools such as chat forums to help us ‘transport’ more educational opportunities to the student.

1Charles Halcomb is a Full Professor in Human Factors psychology.


 

Usability News Contents

©Software Usability Research Laboratory
Department of Psychology
Wichita State University
Wichita, KS 67260-0034

Phone: 316-978-3683
Fax:     316-978-3086
URL: http://www.surl.org

  
SURL Home | Usability News | WSU HCI Lab | WSU Human Factors | WSU Psychology | WSU Home


Contact Ryan Baker or Barbara Chaparro with questions regarding this site.
Last update: October 23, 2005

Disclaimer: Neither the Software Usability Research Laboratory, The Wichita State University, nor any agency thereof, nor any of their employees makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately-owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the Software Usability Research Laboratory, The Wichita State University, or any agency thereof. The views and opinions expressed herein do not necessarily state or reflect those of  Wichita State University, or any agency thereof.