Editor's Notes

In our eighth issue of Usability News: 

If you have a research study that you would like to do, but don't have the time or resources to do it, please contact us. We will do the research for you.  As always, we thank you for your feedback and comments.

Usability News is distributed to over 2500 usability professionals, developers, managers, and researchers in over 59 countries. Contributions and suggestions for future issues should be directed to barbara.chaparro@wichita.edu


SURL Home Page:  www.surl.org
Usability News:  www.usabilitynews.org 
Designing for usability:  www.optimalweb.org 


Reading Online News: A Comparison of Three Presentation Formats

 By Ryan Baker, Michael Bernard, & Shannon Riley

With the ever-increasing move towards online newsletters as a principal means of presenting information, the Web offers opportunities as well as challenges that are unique to this environment. For instance, the traditional newspaper presents information within the confines of evenly spaced, gridded columns. This has worked quite well in the past, and readers have become very accustomed to this style of presentation. With the advent of the Web, however, it is now possible to place information in multiple sources that are connected by link titles, permitting online newsletters to initially present only a small amount of pertinent information through these links. That is to say, online newsletters may only need to present links that give the reader a general idea of each article. This might reduce the amount of information clutter that the reader initially has to wade through. Unfortunately, little is known about the most efficient, as well as the most preferred, way to present information within this type of medium. Accordingly, this study addressed the question of how information should be presented within a news-style web page. For example, should all the information related to an article be presented on one page, or should the newsletter contain a page that lists only the link titles for each article, with the article itself presented on another page? Moreover, if the newsletter presents initial information in the form of link titles, should each link title be accompanied by supplementary information that gives a general overview of the entire article?

Method

A Pentium II based personal computer, with a 60 Hz, 96dpi 17" monitor with a resolution setting of 1024 x 768 pixels was used. Content for the articles came from the New York Times website (www.nytimes.com). The participants’ performance was tracked by using Ergobrowser™ software.

Participants

Twenty-one participants (5 males, 16 females) volunteered for this study. They ranged in age from 18 to 47, with a mean age of 26 (S.D. = 9 years). The median Web use for the participants was 7-14 hours per week (94% used the Web a few times per week or more).

Procedure  

Users were asked to locate specific information within news articles on three different layouts: full text (Full), link titles plus abstracts (Summary), or link titles only (Links).  Each of the layouts contained information on different domains (sports, health, and science). The Full condition presented twelve full articles on one page (see Figure 1). The Summary provided a short summary of approximately two to three sentences for each of the twelve articles on one page plus a linked title to the full article (see Figure 2). The Links condition provided just the linked titles for the twelve articles on one page (see Figure 3).

Participants searched for information within all three conditions. For each layout, they were presented with ten different search tasks asking them to find specific information (for example, "What type of device is a Large Hadron Collider?") within one of the articles. After finding the information, participants were asked to highlight the information they believed was correct with their cursor. If the answer was verified as correct, participants would perform the next search. Information had to be found within five minutes to be considered correct. Participants were allowed to search using the links, as well as the “forward” and “back” buttons, until the time expired. The layouts, domains, and search terms were all counterbalanced using a Latin square design. The layouts were stored on a local server, allowing instant access to the pages in all conditions.
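As an illustration of the counterbalancing described above, the following is a minimal sketch of how a Latin square of presentation orders could be generated for the three layouts. The article does not say which square was used or how participants were assigned to rows, so the cyclic construction and the `order_for` helper are assumptions made for illustration only.

```python
def latin_square(conditions):
    """Build a cyclic Latin square: row i is the condition list rotated left by i.

    Each condition appears exactly once in every row and every column.
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


layouts = ["Full", "Summary", "Links"]
orders = latin_square(layouts)


def order_for(participant_index):
    """Hypothetical assignment rule: cycle participants through the rows."""
    return orders[participant_index % len(orders)]


for i, order in enumerate(orders, start=1):
    print(f"Order {i}: {' -> '.join(order)}")
```

Under this assumed rotation, the 21 participants would divide evenly, with seven participants receiving each presentation order.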

Figure 1. "Full" condition
 

Figure 2. "Summary" condition
 

Figure 3. "Links" condition 

After finishing all the questions for each condition, participants answered a satisfaction questionnaire. The questionnaire consisted of a 6-point Likert scale, with 1 = “Disagree” and 6 = “Agree” as anchors. The questionnaire items were: "The layout made it easy to find information," "This site was visually pleasing", "The arrangement of this site promotes comprehension," "I am satisfied with this site," and, "The layout looks professional." After participants completed the questionnaire for all conditions they ranked the three layouts for general preference.

Results

A within-subject ANOVA design was used to investigate participant performance (mean task completion time and search accuracy) and perceived ease of use of the three conditions. Preference for all three conditions was analyzed using a Friedman χ².

Task Completion Time

Evaluation of the time (in seconds) taken to complete each of the tasks revealed no significant differences between the three groups [F (2,40) = 1.007, p = .37] (S.D. Full = 326.07, S.D. Summary = 239.86, S.D. Links = 354.23; See Figure 4).

Figure 4. Mean Task Completion Time (in seconds)

Perceptions of Site Efficiency

Easy to Find Information

Significant differences were found in the perceived ease of finding information [F (2,40) = 4.966, p < .01]. Post hoc analysis indicated that participants perceived it to be easier to find information in the Summary condition than in the Full condition (see Figure 5).

Figure 5. Easy to Find Information (1 = Disagree and 6 = Agree)
 

Arrangement Promotes Comprehension  

Significant differences were also found for the perception that a particular layout promoted comprehension [F (2,40) = 6.321, p < .01], in that participants perceived the Summary condition as being more conducive to comprehension than the Full condition (see Figure 6).

Figure 6. Arrangement Promotes Comprehension (1 = Disagree and 6 = Agree)
 

Satisfied with Site

Moreover, significant differences were found for participant satisfaction between the conditions [F (2,40) = 3.309, p < .05], in that users indicated that they were more satisfied with the Summary site than the Full site (see Figure 7).

Figure 7. Satisfied with Site (1 = Disagree and 6 = Agree)

Looks Professional

Significant differences were also found for the perception that a particular condition looked professional [F (2,40) = 5.621, p < .05], in that the Summary condition was perceived as more professional-looking than the Full condition (see Figure 8).

Figure 8. Site Looks Professional (1 = Disagree and 6 = Agree)

Layout Preference  

Four participants chose the Full condition as their number one preference. Fifteen participants selected the Summary condition as their first choice, and two participants selected the Links condition as their highest preference (See Figure 9).

Figure 9. Site Preference (participants ranking site as their first choice)
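The first-choice counts reported above can be used for a rough arithmetic check. The sketch below is not the Friedman test the authors report, which operates on each participant's full ranking of the three layouts (data not given in this article). Instead, it assumes a simpler chi-square goodness-of-fit test on the first-choice counts alone, against the null hypothesis that each layout is equally likely to be ranked first. The function name and the use of the standard .05 critical value for two degrees of freedom (5.991) are choices made for this illustration.

```python
def chi_square_gof(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    total = sum(observed.values())
    expected = total / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed.values())


# First-choice counts as reported in the Layout Preference section.
first_choices = {"Full": 4, "Summary": 15, "Links": 2}

statistic = chi_square_gof(first_choices)

# Critical chi-square value for alpha = .05 with 2 degrees of freedom.
CRITICAL_05_DF2 = 5.991

print(f"chi-square = {statistic:.2f}")
print("exceeds .05 critical value:", statistic > CRITICAL_05_DF2)
```

With 21 participants and three layouts, the expected count under the null hypothesis is 7 per layout, so the statistic is (9 + 64 + 25) / 7 = 14.0.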

Discussion  

Overall, there were no statistically significant differences in search time across the three presentation types. However, the Summary condition was perceived most positively in terms of ease of finding information, being visually pleasing, promoting comprehension, participants' satisfaction with the site, and looking professional. The Summary condition was also the most preferred. The Full condition was the least preferred, and had the most negative perceptions associated with it. It was perceived as the layout in which information was most difficult to find, and as not promoting comprehension, not being visually pleasing, and not being satisfying.

Participants reported that they preferred the Summary condition over the Links only condition because the brief summaries accompanying the headline links often guided them to the information they were searching for. Participants commented that, in the Links condition, they sometimes felt as if they were "jumping blindly" into the article. Several participants also reported that they did not like having to scroll through all of the articles in the Full condition. This study suggests that providing a small amount of information about an article on a page is superior to having long, scrolling pages filled with articles.

Reference

Ergobrowser™, Ergosoft Laboratories © 2001


Examining the Effects of Hypertext Shape on User Performance

By Michael L. Bernard1,2

Studies examining the depth and breadth of hypertext structures have consistently found that increasing depth correspondingly decreases search efficiency. This is typically reflected in increased user search time, disorientation, and error, along with reduced satisfaction (e.g., Jacko & Salvendy, 1996; Kiger, 1984; Larson & Czerwinski, 1998; Snowberry, Parkinson, & Sisson, 1983; Zaphiris, 2000). For example, Snowberry et al. (1983) examined four structures: 64 menu item choices on a single level (64); four menu items per level at a depth of three levels (4 x 4 x 4); eight menu items per level at a depth of two levels (8 x 8); and binary menu items at a depth of six levels (2 x 2 x 2 x 2 x 2 x 2). They found that as the degree of hypertext depth increased from one to six levels, the rate of user errors rose from 4.0% to 34%. Likewise, Kiger (1984) found that increasing depth from two to six levels increased the user error rate from 2.2% to 12.5%. Participants also favored the shallowest structure over the deepest ones (Norman, 1990). It has therefore been recommended by numerous researchers that the design emphasis should be placed on reducing the overall depth of a hierarchy by correspondingly increasing the overall degree of menu item breadth.

However, because there is a practical limit to the degree of breadth (as well as depth) within a hypertext, compromises must take place within the design of a structure. To do this, depending on its contents, most hypertexts have several levels of depth with varying degrees of breadth for each level (i.e., having generally expanded breadths at some levels, while having constricted breadths at other levels). Consequently, it is more relevant to hypertext design to concurrently examine the overall shape of a structure by assessing its breadth at each hierarchical level.

Unfortunately, very few studies have examined the ‘shape’ of hypertext structures by varying the amounts of breadth over several levels of depth. The most notable exception to this is a study conducted by Norman and Chin (1988). This study examined hypertext shape by assessing five different tree structures of varying breadths, while keeping the depth invariant at four levels. The five structures examined were: a constant breadth (4 x 4 x 4 x 4) structure, comprising four menu item choices for each level; a decreasing (8 x 8 x 2 x 2) structure, with eight menu items in the first and second levels and two menu items in the third and fourth levels; an increasing (2 x 2 x 8 x 8) structure, with two menu items in the first and second levels and eight menu items in the third and fourth levels; a concave (8 x 2 x 2 x 8) structure, with eight items in the first and fourth levels and two items in the second and third levels; and a convex (2 x 8 x 8 x 2) structure, with two items in the first and fourth levels and eight items in the second and third levels.

Participants were instructed to search a simulated electronic commerce hypertext for either explicitly named items, or items implied in a scenario situation in which they were to find the most appropriate answer—such as to search for a gift item “for a pilot always on the go with time tables to meet.” In searching for explicitly named items, search times were similar across all structures. However, for implicit targets the different shapes did have a significant effect on participants’ search time and search performance. For these targets, participants using the concave (8 x 2 x 2 x 8) structure took less time, searched fewer nodes to find the target items, and used fewer ‘Back’ commands, suggesting a lower degree of disorientation than the other menu tree shapes. For explicit targets, the increasing (2 x 2 x 8 x 8) structure facilitated slightly less navigational disorientation than the other structures.

The convex (2 x 8 x 8 x 2) structure produced the poorest performance for implicit targets. Participants using this structure had slower search time, searched more nodes (web pages) than the other structures, and used the Back command significantly more often than the concave structure.

Yet, as interesting as the Norman and Chin (1988) study is, it did not address the important interplay between hypertext shape and depth. To address this issue, the present study examined six different hypertext shapes at different levels of depth. Moreover, unlike the method used by Norman and Chin, where the number of terminal level nodes remained invariant (256 nodes) across conditions while the total number varied from 294 to 456 nodes, this study sought to have an approximately equal number of nodes across hypertext conditions, since smaller structures are generally easier to search than large structures. In the present study, the general questions asked were 1) which factor, hypertext breadth or depth, has the greatest effect on user performance, and 2) which type of shape promotes the greatest user performance when general hypertext size remains relatively constant?

Method

Participants

One hundred and twenty undergraduate students between 18 and 52 (mean = 22) years of age volunteered to participate in this study. Participants reported using the Web at least once per month (96.7 percent reported using the Web a few times per month or more and 79.8 percent reported using the Web two or more hours per week). No significant differences in reported computer usage and anxiety, as well as Web use, were found among participants within the different conditions.

Experimental Task

Participants were assigned to one of six hypertext conditions. Both the search tasks and the hypertext conditions were ordered by means of a Latin square design. Each task required them to search the presented hypertext for a specific merchandise item that would most appropriately satisfy the context of the search task. The number of nodes was approximately equal (varying from 330 nodes across conditions by ten or fewer nodes).

All participants were given the same search tasks (24 total), which resembled typical directed browsing tasks that reflected ‘real-world’ hypertext searches. Each task scenario had only one intended target, which was a terminal node. Similar to Norman and Chin (1988), search tasks were both explicit and implicit in nature (12 explicit and 12 implicit tasks).

The hypertext structures consisted of constant, decreasing, increasing, concave, and variable shapes. The depth and breadth of the hypertext conditions varied from a depth of two to six levels and a breadth of two to 27 menu items per node. The hypertext conditions are as follows: (12 x 27), (11 x 5 x 5), (4 x 4 x 4 x 4), (6 x 2 x 2 x 12), (3 x 2 x 2 x 2 x 12), and (2 x 3 x 2 x 3 x 2 x 3). The structural layout of each hypertext condition is presented in Figure 1.

 
The (12 x 27) structure. Each terminal node contained 27 menu items (336 total items).

The (11 x 5 x 5) structure. Each terminal node contained 5 menu items (341 total items).

The (4 x 4 x 4 x 4) structure. Each terminal node contained 4 menu items (340 total items).

The (6 x 2 x 2 x 12) structure. Each terminal node contained 12 menu items (330 total items).

The (3 x 2 x 2 x 2 x 12) structure. Each terminal node contained 12 menu items (333 total items).

The (2 x 3 x 2 x 3 x 2 x 3) structure. Each terminal node contained 3 menu items (344 total items).

 

Figure 1. The structural representations of each hypertext condition.
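The item totals quoted for each structure can be reproduced with a short calculation. The sketch below assumes that the "total items" figure is the sum of the menu items at every level of the hierarchy, excluding the root page itself. Under that assumption, the number of items at a given level is the product of the breadths down to that level. The function names are illustrative and do not come from the study.

```python
from itertools import accumulate
from operator import mul


def items_per_level(breadths):
    """Number of menu items at each level: the running product of the breadths."""
    return list(accumulate(breadths, mul))


def total_items(breadths):
    """Total menu items across all levels (root page not counted)."""
    return sum(items_per_level(breadths))


structures = [
    (12, 27),
    (11, 5, 5),
    (4, 4, 4, 4),
    (6, 2, 2, 12),
    (3, 2, 2, 2, 12),
    (2, 3, 2, 3, 2, 3),
]

totals = {shape: total_items(shape) for shape in structures}

for shape, total in totals.items():
    label = " x ".join(str(b) for b in shape)
    print(f"({label}): levels = {items_per_level(shape)}, total = {total}")
```

Under this counting rule the six conditions come to 336, 341, 340, 330, 333, and 344 items, matching the totals listed for each structure. The same rule gives 456 and 294 items for the (8 x 8 x 2 x 2) and (2 x 2 x 8 x 8) structures of Norman and Chin (1988), which is consistent with the range quoted earlier for that study.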

Dependent Variables

The dependent variables were search efficiency and search time (time taken to find the correct information). Search efficiency was measured by examining the number of deviations from the optimal path and the number of total back-page presses or ‘commands’ taken to reach the targeted node. The optimal path is the pre-established, shortest route to a specific targeted node that satisfies a search task. Deviations from this path consist of unintended detours, which indicate navigational disorientation. This was measured by comparing the total number of pages accessed in reaching the target node minus the shortest or ‘ideal’ amount of paging needed to reach this node per search task for each respective hypertext condition (total # of pages accessed – total # of paging needed to acquire relevant nodes).
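To make the two search-efficiency measures concrete, the following sketch computes them for a single search task. The page names, the log format, and the decision to count a page revisited via the Back button as a page accessed are all assumptions made for illustration; the article does not describe how the browsing logs were structured.

```python
def deviations_from_optimal(pages_accessed, optimal_pages):
    """Pages accessed beyond the shortest route to the target node."""
    if pages_accessed < optimal_pages:
        raise ValueError("a target cannot be reached in fewer pages than the optimal path")
    return pages_accessed - optimal_pages


# Hypothetical shortest route to a target in a three-level structure.
optimal_path = ["Gifts", "Travel", "Clocks"]

# Hypothetical log of one search: (page shown, how the participant got there).
visit_log = [
    ("Gifts", "link"),
    ("Office", "link"),   # wrong branch
    ("Gifts", "back"),    # Back button returns to the previous page
    ("Travel", "link"),
    ("Clocks", "link"),   # target reached
]

pages_accessed = len(visit_log)
deviations = deviations_from_optimal(pages_accessed, len(optimal_path))
back_presses = sum(1 for _, how in visit_log if how == "back")

print("deviations from optimal path:", deviations)
print("back-page commands:", back_presses)
```

In this hypothetical trail the participant accessed five pages where three were needed, giving two deviations and one back-page command.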

Materials

A Pentium II based PC, using a 60 Hz, 96dpi 17-inch high-resolution RGB monitor with a resolution of 1024 x 768 pixels, was used. The computer operating system used was Microsoft’s Windows XP. The website was saved locally in order to ensure equal download time for all searches.

Procedure

Participants were assigned to one of six hypertext conditions. Each participant was then given, one at a time, randomly assigned explicit and implicit search tasks. They then searched the hypertext until they found the most appropriate answer to the task statement. To search, participants could use the menu item links located on each respective page, or use the ‘Forward’ or ‘Back’ page button located on the browser’s menu bar. They could also select a ‘homepage’ link that was located at the top-left side of the screen in order to return to the parent page at any time during the search. If a participant selected an item that was not designated as the target, he or she would be informed that the item was ‘incorrect’ and instructed to search again for an item that best satisfies the search task. For each task, participants searched until they found the correct task information, or until the allotted time (5 minutes) expired.

Results

Comparison of Explicit and Implicit Task Types

A 2 x 3 MANOVA was used to examine the implicit and explicit task scores for search time and search efficiency (deviations from optimal path and back-page commands). The results revealed significant differences in performance between two task types in that participants searching with explicit tasks had faster search times and had a higher search performance than when searching with implicit tasks [F (1, 113) = 47.05, p < .001], which is consistent with results of Norman and Chin (1988). The task types did not, however, significantly interact with the three conditions (p = .35). Therefore the three hypertext conditions were examined across both implicit and explicit tasks.

Comparing Hypertext Conditions

Analysis of participants’ search efficiency indicated a significant difference between hypertext conditions for deviations from the optimal navigational path and total amount of back-page commands, respectively [F (5, 114) = 30.08, p < .001; F (5, 114) = 48.61, p < .001], as well as for search time [F (5, 114) = 30.08, p < .001]. Post hoc analysis revealed that the (12 x 27) and (11 x 5 x 5) conditions had significantly fewer deviations from the optimal navigational path and less back-page usage than all but the (6 x 2 x 2 x 12) structure. The (2 x 3 x 2 x 3 x 2 x 3) condition had significantly greater navigational deviations than all other conditions (See Figures 2 and 3).

Analysis of search time also revealed that both the (12 x 27) and the (11 x 5 x 5) conditions had significantly faster search than all but the (6 x 2 x 2 x 12) structure. The (2 x 3 x 2 x 3 x 2 x 3) condition had significantly longer search time than all other conditions (see Figure 4). Together, the measurements of search efficiency and search times suggest that the (12 x 27) and the (11 x 5 x 5) conditions were more informationally accessible than all but the (6 x 2 x 2 x 12) structure.

Figure 2. Deviations from optimal path (per search task)
 

Figure 3.  Number of back-page commands (per search task)
 

Figure 4.  Search time (per search task)

Discussion

Predicting search efficiency 

The results of this study paint a more complex picture of hypertext performance than has been previously observed. That is to say, with regard to hypertext structure, depth alone may not be the sole, or even the greatest, determinant in predicting search performance. In fact, as this study has shown, the shape of a hypertext structure had at least as much to do with search efficiency as its depth. Indeed, the (4 x 4 x 4 x 4) structure was found to be less efficient not only than hypertext shapes of the same depth (i.e., the (6 x 2 x 2 x 12) structure), but also than structures that were deeper, such as the (3 x 2 x 2 x 2 x 12) structure. As discussed, much has been said about hypertext depth, in that the greater the depth, the less informationally efficient the structure should be (e.g., Jacko & Salvendy, 1996; Snowberry et al., 1983). However, what seems to be occurring in this study is that the participants’ search efficiency is, at least in part, determined by the properties related to the overall shape of the hypertext structure. These properties, then, act to either facilitate or impede hypertext efficiency by altering the general complexity of the structure. Accordingly, having an inefficient shape will decrease a hypertext’s search efficiency.

The results of this study also support the findings of Jacko and Salvendy (1996), Snowberry et al. (1983), and others, which state that broader and shallower hypertext structures are, on the whole, generally superior in search efficiency to narrower and deeper structures. Namely, broad categorical groupings, which were seen with the (12 x 27) condition, and to a lesser extent the (11 x 5 x 5) condition, facilitated participant navigation that had fewer deviations from the optimum path and fewer back-page commands, as well as faster search times, than the narrower and deeper structures.

Hypertext Shape Efficiency

Moreover, together with the findings of Norman and Chin (1988), it is asserted that concave shapes (i.e., (6 x 2 x 2 x 12)) are more navigationally efficient than relatively constant shapes (i.e., (4 x 4 x 4 x 4)) of the same size and depth. Norman and Chin argue that the concave structure is an optimal design because having a larger percentage of descriptor items at the beginning of a search, and thus more defined ones, helps the user form a more exact match between the concept related to the target item and the actual target item itself. At the terminal level, broad menus reduce the overall information uncertainty, since at this level the target items are more explicitly defined. This notion is also associated with the concept of information scent. That is, the more explicit the association between the initial descriptor items and the targeted terminal items, the greater the scent. For this reason, a search at the terminal level should benefit from a maximum amount of scent.

Conversely, nodes in the middle level of the tree structure are, in part, designed to ultimately direct the user to the terminal level to access the targeted item. It is argued by Norman and Chin (1988) that the breadth of items at this level should be constricted, and limited to only directing the user to the appropriate target item. Thus, it is maintained that broad menus at the middle level will only increase the likelihood of users’ choosing the wrong path to the target item. This was found to be also true in the present study.

Implications for Website Design

Typically, when individuals evaluate the information accessibility of a website, they mostly appraise the site’s physical interface. However, as discussed above, a site's general shape can play just as important a role in its overall information accessibility, or an even more important one, especially for large structures. Furthermore, assessing solely its depth may not provide a truly accurate prediction as to its accessibility. That is to say, an inefficient shape (i.e., a structure with a large percentage of menu choices within the center of the structure) with a relatively shallow depth might be less informationally accessible than a more efficient shape at greater depths. From these results, as well as Norman and Chin’s (1988) study, it is further suggested that websites with several or more levels of depth should attempt to give the user the greatest number of choices at both the top and terminal levels of the site, while constricting the choices between these levels.

1Note: This research is part of a larger dissertation study that examined a metric for predicting the accessibility of information within hypertext structures

References

Jacko, J. A., & Salvendy, G. (1996). Hierarchical menu design: Breadth, depth, and task complexity. Perceptual and Motor Skills, 82, 1187-1201.

Kiger, J. I. (1984). The depth/breadth tradeoff in the design of menu-driven interfaces. International Journal of Man-Machine Studies, 20, 201-213.

Larson, K., & Czerwinski, M. (1998). Web page design: Implications of memory, structure and scent from information retrieval. Proceedings of the Association for Computing Machinery’s CHI ’98, 18-23.

Norman, K. L. (1990). The psychology of menu selection: Designing cognitive control of the human/computer interface. Norwood, NJ: Ablex Publishing Co.

Norman, K. L., & Chin, J. P. (1988). The effect of tree structure on search performance in a hierarchical menu selection system. Behaviour and Information Technology, 7, 51-65.

Snowberry, K., Parkinson, S., & Sisson, N. (1983). Computer display menus. Ergonomics, 26, 699-712.

Zaphiris, P. (2000). Depth vs breadth in the arrangement of web links. Proceedings of the 44th Annual Meeting of the Human Factors and Ergonomics Society, 139-144.

2Note: Michael Bernard is currently a post-doctorate fellow at Sandia National Laboratories


Determining Cognitive Predictors of User Performance within Complex User Interfaces


By Michael L. Bernard1, Chris Hamblin, & Brett Scofield

It is quite apparent that the computer is now a ubiquitous tool for both home and work. It is also evident that computer interfaces have, in general, acquired more functions and present more information than at any other time in the past. For example, it is now common for users to have layouts with multiple and distinct information sources that are concurrently located on an interface. In light of this, the need to study the interaction between complex interfaces and the cognitive factors that affect user performance has become paramount. Unfortunately, most of the studies that have measured human performance with regard to interfaces were done prior to the advent of modern graphical user interfaces.

For instance, Vincente, Hayes, and Williges (1987) examined participants’ performance while they searched for information within a hierarchical file system. The file system consisted of three levels, with a total of 15 files. Vincente et al. found that performance was affected by several cognitive variables that were independent of computer experience. These variables consisted of spatial ability, which generally predicted search performance, and verbal ability, which predicted performance when reading was required. In fact, participants with low spatial ability took twice as long to find information as those with high spatial ability. However, in the Vincente et al. study, the interface presented only a small number of functions and, thus, it is argued here that the interface used in the study does not reflect the type of complex user interface that is seen today.

The term ‘complex’ in this instance refers to the extent to which an interface adheres to a predictable visual scheme, such that the less predictable the interface layout, the more complex it should be (Tullis, 1983). Another factor that often adds to layout complexity is the overall density of the information displayed. That is to say, the more bits of information that are presented beyond a moderate amount, the more complex the interface is thought to be (Vitz, 1966). Since complex interfaces require users to exert higher amounts of cognitive effort, it is suggested that understanding which cognitive factors most affect users’ actual and perceived performance should help designers create interfaces that conform more closely to the cognitive aptitude of individuals.

This study sought to assess the full extent of intellectual functioning across participants by administering the Wechsler Abbreviated Scale of Intelligence (WASI), along with administering a mirror-tracing task to assess their perceptual-motor skills. The participants’ scores on the individual factors of intelligence and perceptual-motor skills were then examined in relation to their search performance on a complex website interface.

Method

Participants

Twenty-two undergraduate students between 18 and 34 (mean = 22) years of age volunteered to participate in this study. All participants reported using the Web at least once per month (87 percent reported using the Web a few times per week or more). Two of the participants reported visiting a financial/brokerage website a few times per month and two reported visiting a financial/brokerage website less than once per month.

Experimental Task

Participants were presented with four interfaces that had approximately the same amount of layout complexity. This was accomplished by creating layouts with multiple information sources that were densely displayed (see Figure 1 for an example of one of the interfaces). Four interfaces were chosen rather than one in order to present as much information as possible without the need for participants to scroll in order to view the entire interface layout. All of the interfaces were presented as an online brokerage portal.

Participants were given 20 search tasks and were required to find specific brokerage information that would most appropriately satisfy the context of the search task. For example, one question asked, “What is the pre-market percent change for Adobe Systems Inc.?” The tasks were designed to be moderately difficult. Yet, since the tasks only required participants to find but not interpret this information, they presented the same degree of difficulty for those with or without website brokerage experience. Also, all task information was presented without the need to search lower than one level of depth. The search tasks and the searching order within the four layouts were randomized by means of a Latin square design.

Figure 1.  An example of the layout for one of the presented interfaces
 

Materials

A Pentium II based PC, using a 60 Hz, 96dpi 17-inch high-resolution RGB monitor with a resolution of 1024 x 768 pixels, was used. The computer operating system used was Microsoft’s Windows XP. The text was presented as an HTML web page. Search time was recorded by means of the software tool Ergobrowser™, which served as the Web browser.

Dependent Measures

The WASI Instrument

The instrument used to measure the participants’ intellectual abilities was the Wechsler Abbreviated Scale of Intelligence (WASI) instrument (Wechsler, 1999). The WASI is similar in format and highly correlated with the Wechsler Adult Intelligence Scales (Goebel & Satz, 1975; Tulsky & Zhu, 2000). The WASI instrument was used because it is a standardized, normed, and validated short form of the Wechsler Adult Intelligence Scale. It also provided a reliable and valid estimate of verbal, performance, and general intellectual functioning (Kaufman & Kaufman, 2001).

The WASI instrument measures several facets of intelligence, such as verbal knowledge, visual information processing, spatial and nonverbal reasoning, and crystallized and fluid intelligence. The WASI instrument consists of four subtests - Vocabulary, Block Design, Similarities, and Matrix Reasoning (Wechsler, 1999):

  1. Vocabulary measures expressive vocabulary, verbal knowledge, and fund of information. It is also a good measure of crystallized intelligence, as well as general intelligence. For this subtest, participants were required to name pictures and define words that were orally and visually presented.
     

  2. Block Design measures spatial visualization, visual-motor coordination, abstract conceptualization, and perceptual organization by requiring participants to replicate modeled or printed two-dimensional geometric patterns within a specified time by using two-color cube patterns.
     

  3. Similarities measures verbal concept formation, abstract reasoning ability, and general intellectual ability. For this measure, participants were presented with four picture items and 22 verbal items. For the picture items, participants were shown a picture of three objects on the top row and four response options on the bottom row. Participants responded by indicating which response item was similar to the three target objects. For the verbal items, a pair of words was presented orally and participants explained the similarity between the objects or concepts that the two words represented.
     

  4. Matrix Reasoning measures nonverbal fluid reasoning by requiring participants to complete a missing portion of an abstract, gridded pattern by indicating the correct completed pattern from five possible choices.   

These four subtests combine to form the following general scores:

a)   Performance IQ = block design + matrix reasoning

b)   Verbal IQ = vocabulary + similarities

c)   Full Scale IQ = block design + matrix reasoning + vocabulary + similarities
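The three composite definitions above can be written as a short sketch. The class name, method names, and example values are hypothetical, and the code only reproduces the sums as stated in the text; any conversion from these sums to normed IQ values would rely on the instrument's own tables, which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class SubtestScores:
    """Scores on the four subtests for one participant (illustrative values only)."""
    vocabulary: float
    block_design: float
    similarities: float
    matrix_reasoning: float

    def performance(self):
        # a) Performance = block design + matrix reasoning
        return self.block_design + self.matrix_reasoning

    def verbal(self):
        # b) Verbal = vocabulary + similarities
        return self.vocabulary + self.similarities

    def full_scale(self):
        # c) Full Scale = all four subtests
        return self.performance() + self.verbal()


# Hypothetical participant, not data from the study.
example = SubtestScores(vocabulary=50, block_design=55, similarities=48, matrix_reasoning=60)
print(example.performance(), example.verbal(), example.full_scale())
```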

 
The Mirror-tracing Task

In addition to the WASI assessment, a mirror-tracing task was used to measure participants’ perceptual-motor skills. For this measure, participants were required to draw a line within and parallel to two printed circles that were 4 mm apart. The diameter of the largest printed circle was approximately 180 mm. Participants could not directly view the circles or their hand when drawing the line. Instead, they viewed them through a mirror that reversed the image. Two measures were recorded: line completion time, which was the time taken to draw the circle, and drawing accuracy, which was the number of times participants drew outside the printed circles.

Procedure

Participants first performed the mirror-tracing task. They then answered a computer/web experience and computer comfort questionnaire. After completing the questionnaire, participants were instructed to examine the contents of the four layout interfaces until they were familiar with their general layout. This lasted for approximately one minute per interface. Participants were then given a practice question for each layout. When the practice session was completed, participants were given 20 search questions, five per layout. If a participant selected an item that was not designated as the target, he or she would be informed that the item was ‘incorrect’ and instructed to search again for an item that best satisfied the search task. For each task, participants searched until they found the correct task information, or until the allotted time (5 minutes) expired. After participants finished with the search tasks for all of the interfaces, they answered a perception of disorientation questionnaire. The question items were as follows: "It was difficult to find information on the computer screen," "The amount of information presented was about right," and "The placement of information was disorientating." The questionnaire consisted of a 7-point Likert scale, with 1 = “Not at all” and 7 = “Completely” as anchors. Since the question items were significantly correlated with each other (p < .01), a mean score was used. Participants were then given the WASI intelligence instrument, which took approximately 45-50 minutes to administer, on a subsequent day.

Results

In order to compare search time and perception of disorientation scores to the WASI/Mirror-tracing scores, both the search time and perception of disorientation scores were divided into three equal-sized categories: fast, medium, and slow for search time (see Table 1) and low, medium, and high for the perception of disorientation (see Table 2).

Table 1. Search time categories

 Fastest         Medium                  Slowest
 < 490.4 sec     490.4 to 646.4 sec      > 646.4 sec

 
Table 2. Perception of disorientation categories

 Low          Medium           High
 < 3.67       3.67 to 5.00     > 5.00
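The category assignment described by Tables 1 and 2 can be sketched as a simple threshold lookup. The cut points come from the tables; the function name and the handling of values that fall exactly on a cut point (placed in the middle category here) are assumptions, since the tables do not specify them.

```python
SEARCH_TIME_CUTS = (490.4, 646.4)      # seconds, from Table 1
DISORIENTATION_CUTS = (3.67, 5.00)     # mean questionnaire score, from Table 2

SEARCH_TIME_LABELS = ("Fastest", "Medium", "Slowest")
DISORIENTATION_LABELS = ("Low", "Medium", "High")


def categorize(value, cuts, labels):
    """Map a score to one of three categories using two cut points.

    Assumption: a value exactly equal to a cut point goes to the middle category.
    """
    low_cut, high_cut = cuts
    if value < low_cut:
        return labels[0]
    if value > high_cut:
        return labels[2]
    return labels[1]


# Hypothetical scores, not data from the study.
print(categorize(512.0, SEARCH_TIME_CUTS, SEARCH_TIME_LABELS))
print(categorize(5.33, DISORIENTATION_CUTS, DISORIENTATION_LABELS))
```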

 
The three categories served as the independent variables, whereas the WASI and mirror-tracing scores served as the dependent variables. A 3 x 9 MANOVA was used to compare the three levels of search time and perceived disorientation with the dependent scores.

Search Time

Analysis of the cognitive aptitudes for the three search time categories revealed significant differences between the three search time categories and Block Design, Performance IQ, Full Scale IQ, and Mirror-tracing Accuracy scores [F (7,14)= 3.50, p < .05; see Table 3 below]. Post hoc analysis using the Tukey HSD method indicated that participants in the slowest search time category had significantly lower cognitive aptitude scores than those in either the top or middle search time categories. The same was essentially true for the Matrix Reasoning scores; however, the differences only approached significance. Correlations between search time across all three levels of aptitude and the dependent measures are also presented below. As shown in Table 3, the Block Design subtest correlated the highest with search time.

Table 3. Search time and cognitive aptitude

 Tasks                       Correlation     Significance
 Vocabulary                  -0.25           p = .55
 Block Design                -0.69           p < .01
 Similarities                -0.16           p = .34
 Matrix Reasoning            -0.32           p = .07
 Verbal IQ                   -0.29           p = .28
 Performance IQ              -0.55           p < .01
 Full Scale IQ               -0.51           p < .05
 Mirror-tracing Accuracy      0.56           p < .01
 Mirror-tracing Time          0.46           p = .17

 
Perception of Disorientation

Analysis of participants’ perception of disorientation revealed significant differences between the three levels of perceived disorientation and the Vocabulary and Verbal IQ subtest scores [F (7,15)= 3.12, p < .05; see Table 4 below]. Post hoc analysis indicated that participants with the lowest level of perceived disorientation had significantly higher Vocabulary scores than those with high levels of perceived disorientation. Similar results were found for Full Scale IQ, but these results only approached significance. In addition, participants with the lowest level of perceived disorientation had significantly higher Verbal IQ scores than those with either the middle or highest levels of perceived disorientation. Interestingly, the perception of disorientation was not significantly correlated with search time (p = .61; r = 0.12). The correlations between perceived disorientation across all three levels and the dependent measures are also presented below. As shown in Table 4, the Verbal IQ subtest correlated the highest with perceived disorientation.

Table 4. Perceived disorientation and cognitive aptitude

 Tasks                       Correlation     Significance
 Vocabulary                   0.46           p < .01
 Block Design                 0.00           p = .36
 Similarities                 0.36           p = .29
 Matrix Reasoning            -0.12           p = .17
 Verbal IQ                    0.52           p < .05
 Performance IQ               0.00           p = .16
 Full Scale IQ                0.25           p = .08
 Mirror-tracing Accuracy     -0.15           p = .32
 Mirror-tracing Time         -0.06           p = .74

  
Computer/web experience and computer comfort

Not surprisingly, participants’ computer and web experience, as well as their comfort with using computers, correlated significantly with their search time and perceived disorientation. Specifically, the participants’ level of comfort with the Internet was significantly correlated with perceived disorientation (p < .05; r = -0.49). Moreover, participants’ indications of frequent web visits significantly correlated with their search time in that frequent web users generally had faster search times (p < .05; r = -0.51).

Discussion

This study has shown that psychometric tests of cognitive abilities can generally predict search performance for complex interfaces, in that certain cognitive factors do significantly correspond to search time performance and perceived disorientation when searching within a complex interface. When considering search time, the subtest factors that most determined participant performance were Block Design, Performance IQ, and Mirror-tracing Accuracy. All of these subtests generally tap into three cognitive functions: spatial visualization, visual-motor perception and coordination, and fluid reasoning. Thus, it is proposed that these cognitive/motor functions play a substantial role in determining search performance within complex interfaces.

The Vocabulary subtest did not significantly contribute to search performance. It is certainly possible that the Vocabulary subtest, which generally measures verbal knowledge, was not as important to performance because the task involved mostly searching for information. The Similarities subtest, which generally measures intellectual ability, also did not significantly contribute to search performance, possibly for the same reason as above.

When assessing perceived disorientation, only the Vocabulary and Verbal IQ subtest factors significantly contributed to participants’ perceived disorientation. Not surprisingly, both of these subtests have a common thread, in that they both measure intellectual ability. It is interesting that perceived disorientation was mostly affected by intellectual factors, rather than spatial factors. It is possible that complex interfaces burden the intellectual capacity of users, which is translated by means of higher correlations with the disorientation scores. Yet, apparently this burden is not great enough to affect the search time of the users.

Implications for Interface Design

It is very common these days to encounter user interfaces that contain multiple information sources that are densely displayed—such as with online travel and brokerage sites. When creating these types of interfaces, designers should take into consideration our cognitive and motor limitations. From these results certain design recommendations can be suggested. Specifically, layouts should be designed to reduce the cognitive burden associated with spatial visualization and visual-motor coordination. To help do this designers should focus their efforts on creating interfaces that appropriately group information by function (Dodson & Shields, 1978) and reduce overall information density to less than 50 percent of the screen area (see Horton, 1989 for a discussion of the empirical studies related to this).

References

Dodson, D. W., & Shields, N. L. (1978). Development of user guidelines for ECAS display design (Vol. 1) (Report No. NASA-CR-150877). Huntsville, AL: Essex Corp.

Goebel, R. A., & Satz, P. (1975). Profile analysis and the Abbreviated Wechsler Adult Intelligence Scale: A multivariate approach. Journal of Consulting and Clinical Psychology, 43, 780-785.

Horton, W. K. (1989). Designing and writing online documentation. New York: John Wiley & Sons.

Kaufman, J. C., & Kaufman, A. S. (2001). Time for the changing of the guard: A farewell to short forms of intelligence tests. Journal of Psychoeducational Assessment, 19, 245-267.

Tullis, T. S. (1983). The formatting of alphanumeric displays: A review and analysis. Human Factors, 25, 657-683.

Tulsky, D. S., & Zhu, J. (2000). Could test length or order affect scores on Letter Number Sequencing of the WAIS-III and WMS-III? Ruling out effects of fatigue. Clinical Neuropsychologist, 14, 474-478.

Vicente, K. J., Hayes, B. C., & Williges, R. C. (1987). Assaying and isolating individual differences in searching a hierarchical file system. Human Factors, 29, 349-359.

Vitz, P. C. (1966). Preference for different amounts of visual complexity. Behavioral Science, 11, 105-114.

Wechsler, D. (1999). Wechsler Abbreviated Scale of Intelligence™ (WASI™). San Antonio, TX: The Psychological Corporation.

1Note: Michael Bernard is currently a post-doctorate fellow at Sandia National Laboratories


The Effects of Line Length on Children and Adults’ Online Reading Performance

 By Michael Bernard1, Marissa Fernandez, & Spring Hull

Adults, as well as children, now often read an extensive amount of information online. For example, of the 25-to-34-year-old age group, it is reported that 25 percent read online newspapers, compared to only 19 percent who read printed newspapers (The Digital Edge, 2000). Even young children are now spending progressively more time reading online documents, including being tested online in schools. Thus, the need to address the ergonomic issues associated with this type of medium has become even more important. As discussed in previous editions of Usability News, certain textual factors can affect user performance and preference when reading online text. The purpose of this study was to examine the effects of line length on online reading performance by both adults and children. Unfortunately, little research has been conducted investigating line length and online reading with respect to both actual and perceived reading efficiency, as well as preference; and, to date, no research has included children in its investigation.

Studies investigating line lengths have thus far produced mixed results. For example, Dyson and Kipping (1998) found that longer lines (approximately 75-100 characters per line or CPL) were read faster than very narrow ones (25 CPL), with no difference in perception of reading efficiency. Moreover, Duchnicky and Kolers (1983) found that full-screen (187 mm) line lengths resulted in 28 percent faster reading times over 1/3 screen (62 mm) line lengths. In addition, the full and 2/3 screen (125 mm) line lengths were read significantly faster than the 1/3 screen line lengths. Duchnicky and Kolers concluded that longer line lengths are read more efficiently from computer screens than narrower ones.

Yet, conclusions have mostly favored short to medium line lengths. For example, it has been recommended by researchers that shorter line lengths (about 60 CPL) should be used in place of longer, full-screen lengths, since longer line lengths require greater lateral eye movements, which makes it more likely to lose one's place within the text (Horton, 1989; Mills & Weldon, 1987). Horton (1989) points out that longer line lengths are more tiring to read and recommends limiting line lengths to around 40 to 60 CPL. Huey (1968) generally supports this recommendation by finding that narrower line lengths (approximately 4" or 10 cm) are more accurate on the return sweep than longer line lengths. Gregory and Poulton (1970) maintain that people with poor reading ability performed better when the line length was approximately seven words. This suggests that young readers who have not mastered online reading, as well as readers who have vision deficits, may benefit the most from narrower line lengths. 

Moreover, Youngman and Scharff (1999) found that with 0.5-inch (about 1.3 cm) margins, the fastest reaction times were for the shorter, 4-inch (10 cm) lengths over the 6- and 8-inch lengths (15 and 20 cm, respectively). The 4-inch lengths were also preferred over the other lengths. With no margins, the 8-inch line lengths had the fastest overall reaction times. Similarly, a recent study by Dyson and Haselgrove (2001) found that medium line lengths (55 CPL, which is approximately 4 inches) facilitated more effective reading at normal reading speed than shorter line lengths (24 CPL).

Method

Participants

Forty participants (20 adults and 20 children) volunteered for this study. The adults ranged in age from 18 to 61, with a mean age of 29 (S.D. = 12 years) and the children ranged in age from 9 to 12, with a mean age of 11 (S.D. = 1 year) and attended 4th, 5th, or 6th grade. All adults reported reading text on computer screens a few times per week or more. Seventy-five percent of the children reported reading text on computer screens a few times per month or more. The children received $5.00 for participating in this experiment. All participants had 20/40 or better unaided or corrected vision as tested by a Snellen near acuity chart.

Materials

A Pentium II based personal computer, with a 60 Hz, 96dpi 17" monitor with a resolution setting of 1024 x 768 pixels was used.

The passages consisted of three line-length conditions. These conditions consisted of passages that had a line length that spread the full distance of the screen, which was 930 pixels (245 mm, 132 CPL; see Figure 1) wide; passages that had a line length of 550 pixels (approximately 145 mm, 76 CPL; see Figure 2); and passages that had a line length of 330 pixels (approximately 85 mm, 45 CPL; see Figure 3). As with typical online passages, the narrower the passage, the more scrolling was required to view the entire passage.
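The physical widths quoted above follow from the pixel widths and the monitor's stated 96 dpi. The sketch below shows that arithmetic; it assumes the nominal 96 dpi figure matches the physical pixel density of the display, and the function name is made up for this example. The computed values land within a few millimetres of the approximate figures given in the text.

```python
MM_PER_INCH = 25.4


def pixels_to_mm(pixels, dpi=96):
    """Convert a width in pixels to millimetres at the given dots per inch."""
    return pixels / dpi * MM_PER_INCH


# Line-length conditions from the article, in pixels.
for label, px in (("full", 930), ("medium", 550), ("narrow", 330)):
    print(f"{label}: {px} px is about {pixels_to_mm(px):.1f} mm")
```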
 


Figure 1. Full-length example

 


Figure 2. Medium-length example


Figure 3. Narrow-length example

 
Task Design

Line conditions were compared by having participants read three passages, each with different line lengths. The conditions were counterbalanced by means of a Latin square design. Both the adults' and children’s passages were 12-point Arial, which was black on a white background.

The adults read passages from Microsoft's electronic library, Encarta™, which were written at approximately the same reading level and discussed similar material (all dealt with psychology-related topics). The passages were adjusted to have approximately the same length (an average of 1028 words per passage, S.D. of 18 words).

The children’s passages were short children’s stories drawn from Whootie Owl's Fairytales™, which were written at the 4th and 5th grade reading level. The passages were adjusted to have approximately the same length (an average of 573 words per passage, S.D. of 13 words).

Procedure

Participants were positioned at a distance of approximately 57 cm from the computer screen. They were then asked to read the passages “as quickly and as accurately as possible.” The passages contained 15 randomly placed substitution words for the adults and 10 for the children (participants were not told the number of substitution words). The substitution words were designed to be clearly seen as inappropriate for the context of the passages when read carefully. These words varied grammatically from the original words; for example, the noun “cake” was replaced with the adjective “fake.” The participants were instructed to identify these words by stating the substituted words aloud. This was designed to ensure that participants actually read the passages, instead of just skimming over them.

To accurately determine readability and its associated effect on reading time, an effective reading score was used. The score was derived by dividing the time taken to read each passage, which was registered by a stopwatch, by the percentage of accurately detected substituted words in that passage.
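A minimal sketch of the effective reading score follows. It assumes detection accuracy is expressed as a proportion between 0 and 1 (the article does not say whether a proportion or a 0 to 100 percentage was used; the latter would only rescale every score by a constant), and the function name and example numbers are hypothetical.

```python
def effective_reading_score(reading_time_sec, detected, total_substitutions):
    """Reading time divided by the proportion of substitution words detected.

    Higher scores mean less efficient reading: more time per unit of accuracy.
    """
    if total_substitutions <= 0:
        raise ValueError("total_substitutions must be positive")
    if not 0 <= detected <= total_substitutions:
        raise ValueError("detected must be between 0 and total_substitutions")
    accuracy = detected / total_substitutions
    if accuracy == 0:
        raise ValueError("score is undefined when no substitutions are detected")
    return reading_time_sec / accuracy


# Hypothetical adult reader: 360 seconds, 12 of the 15 substitution words found.
print(effective_reading_score(360, 12, 15))
```

A reader who detects every substitution word has a score equal to the raw reading time, so the score penalizes speed that comes at the cost of accuracy.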

After reading each passage, participants answered a perception of readability questionnaire. The questionnaire consisted of a 6-point Likert scale with 1 = “Not at all” and 6 = “Completely” as anchors. The questionnaires consisted of statements regarding their ease of reading for each line length condition. When all questionnaires were completed, they ranked the three line length conditions for general preference.

Results and Discussion

A within-subjects ANOVA design was used to analyze objective and subjective differences between the line lengths. Post hoc comparisons were done using the Bonferroni test. Ranked line length preference was measured by means of a Friedman χ2.
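For readers unfamiliar with the Friedman statistic, the sketch below computes it from preference rankings using the standard no-ties formula, chi-square = 12 / (n k (k + 1)) * sum of squared rank sums - 3 n (k + 1), with k - 1 degrees of freedom. The rankings shown are invented purely for illustration and are not data from this study; the function name is also hypothetical.

```python
def friedman_chi_square(rankings):
    """Friedman chi-square for n participants each ranking k conditions (no ties).

    rankings: one list per participant giving the rank (1 = most preferred)
    assigned to each condition, in a fixed condition order.
    """
    n = len(rankings)
    k = len(rankings[0])
    expected = list(range(1, k + 1))
    for row in rankings:
        if sorted(row) != expected:
            raise ValueError("each participant must rank every condition exactly once")
    # Sum of ranks received by each condition across participants.
    rank_sums = [sum(row[j] for row in rankings) for j in range(k)]
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Invented rankings for (full, medium, narrow); not the study's data.
toy_rankings = [
    [1, 2, 3],
    [1, 3, 2],
    [2, 1, 3],
    [1, 2, 3],
]
print(friedman_chi_square(toy_rankings))
```

The resulting value is compared against a chi-square distribution with k - 1 degrees of freedom, which is 2 for three line length conditions.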

Reading Time and Effective Reading

Surprisingly, an examination of the mean reading time for each line length found no significant differences for either children or adults [p = .40; p = .88, respectively]. It is possible that the benefit of reduced scrolling in the wider conditions was offset by the cost of their longer lines, negating any net advantage of one line length over another. The means and standard deviations for both adults and children for the three conditions are presented in Table 1. Examining the effective reading score (reading time/reading accuracy) also revealed no significant differences between the three line lengths for either children or adults [p = .10; p = .60, respectively; see Table 2].

Table 1. Means and Standard Deviations for reading time

 Means (SD)     Full-length       Medium-length     Narrow-length
 Adults         370 (107) sec     363 (103) sec     266 (109) sec
 Children       276 (76) sec      279 (68) sec      266 (68) sec

 
Table 2. Means and Standard Deviations for effective reading score
 

 Means (SD)