Internet Search Tool
Usability Evaluation
by
Ronald G. Wolak
wolakron@scis.nova.edu
A paper submitted in fulfillment of the requirements
for DISS 720 - Assignment Three
School of Computer and Information Sciences
Nova Southeastern University
January 2000
An Abstract of a Paper Submitted to Nova Southeastern University
in Fulfillment of the Requirements for DISS 720 - Assignment Three
The quantity of public information available on the Internet is increasing rapidly. In fact, the Internet is currently a 15 billion word digital library. Searching for information is the primary task of users on the Internet. As a result, new and powerful search tools are being built every day. These search tools fall into three major classifications: search engines (e.g. Yahoo, Lycos, HotBot, and Excite), metasearch sites (e.g. MetaCrawler, Dogpile, and Inference), and desktop search utilities (e.g. Copernic, InfoSeek Express, and BullsEye). The sizable assortment of Internet search tools presents users with the task of choosing the most effective and usable tool. The problem investigated in the following pages was which Internet search tool provided average users the most efficient and usable method to find what they were looking for on the Internet. The goal of the paper was to determine, through research of the current literature and usability testing, which class of tool was the most effective and usable for the average Internet user. In addition, based upon the results of usability testing, recommendations were made for improving the interfaces and functionality of the tools tested.
Table of Contents
Abstract
Chapters
1. Introduction
Problem Statement and Goal
Relevance
Barriers and Issues
Plan and Approach
Milestones
Summary
2. Review of the Literature
Usability Testing
Internet Search Tools
Summary
3. Methodology
Research Type
Research Methods Employed
Test Plan
Summary
4. Results
Pilot User
User 1
User 2
User 3
Findings
5. Summary
Recommendations
Conclusion
Appendixes
A. HotBot User Interface Screen Print
B. HotBot Search Results Screen Print
C. Copernic User Interface and Search Results Screen Print
D. Usability Study Instructions and Mid-Test Questionnaires
E. Post-Test Questionnaire
F. Mid-Test Questionnaire Results
G. Post-Test Questionnaire Results
H. User Profiles
I. Think Aloud Observation
J. Task Duration
K. User Errors
Reference List
Chapter 1
Introduction
This project report is submitted to fulfill the requirements for DISS 720 - Assignment Three. The following introduction describes the problem investigated, the goal to be achieved, and the barriers and issues encountered during the completion of the report. The introduction also provides the plan and approach of the project along with a timeline of milestones.
Problem Statement and Goal
The Internet has revolutionized the way we access information. The amount of public information available on the Web is increasing rapidly (Lawrence & Giles, 1999). In fact, the Internet is currently a 15 billion word digital library. Searching for information is the primary task of users on the Internet (Adali, Bufi, & Temtanapat, 1997). As a result, new and powerful search tools are being built every day. Even though these search tools are more advanced than those available in the past, the average user may not be benefiting fully from these state-of-the-art search technologies.
Internet search tools fall into three major classifications (Sullivan, 1998, September 2): search engines (e.g. Yahoo, Lycos, HotBot, and Excite), metasearch sites (e.g. MetaCrawler, Dogpile, and Inference), and desktop search utilities (e.g. Copernic, InfoSeek Express, and BullsEye). The sizable assortment of Internet search tools presents users with the task of choosing the most effective and usable tool. The problem investigated in the following pages was which Internet search tool provided average users the most efficient and usable method to find what they were looking for on the Internet.
The goal of this paper was to determine, through research of the current literature and usability testing, which class of tool was the most effective and usable for the average Internet user. In addition, based upon the results of usability testing, recommendations were made for improving the interfaces and functionality of the tools tested. In order to finish within the given time constraints, only the leading tools from the search engine and desktop search utility classifications were evaluated. These were the HotBot search engine and the Copernic search utility (Lake, 1997). A tool from the metasearch site classification was not included since Copernic duplicated metasearch site functionality among its features.
Relevance
This report is relevant to the study of human-computer interaction. Millions of users worldwide employ Internet search tools at work and home on a daily basis (Shachtman, 1999). Even the best-designed Web sites fall short if users have trouble finding them or navigating their contents. Search tools are typically the best way for users to work around these problems. Determining which is the most usable and effective search tool for the average user would serve to reduce both Internet search time and user frustration.
Barriers and Issues
The primary barrier to the successful completion of this paper was the vast quantity of research material related to Internet search tools. This material needed to be gathered, compiled, filtered, and evaluated to determine its appropriateness to the goal of the paper. Successful project completion was also complicated by the Internet connection problems that occurred during a portion of the usability testing.
Plan and Approach
This project report is a descriptive study formatted in five chapters. The first chapter covers the project's problem statement and goal, relevance, barriers and issues, plan and approach, and milestones and expectations. The second chapter provides a detailed review of the literature relevant to usability testing, and the usability and effectiveness of current Internet search tools.
The third chapter describes the research methods, online tools, and resources that were employed in completing the project report. Included are the strategies for conducting the usability test (i.e. test goals, test methods, identification of subjects, task development, task order and priority, test scenario, and the collection of data).
The fourth chapter begins with a detailed presentation of the usability test results. These results are discussed in relation to the reviewed literature. In addition, unique events that occurred during the testing process are highlighted. Major variables associated with the usability of the two search tools are identified. These include learning, performance, error recovery factors, and user attitude, along with the effort required to complete the task. In the fifth chapter, recommendations for improving the interface and functionality of the two search tools are given.
Milestones
The scope of the project report was manageable and lent itself to investigation within the given time period. The following is a summary of the milestones for the project along with significant dates. The first milestone, determining the problem and goal of the project, was completed on December 11, 1999.
The introduction, chapter one, was completed on December 19, 1999. This was followed by completion of the review of literature, chapter two, on December 28, 1999. Methodology, chapter three, was completed on January 2, 2000, and chapter four was completed shortly thereafter on January 7, 2000. Chapter five was completed on January 8, 2000. After extensive review and proofreading, the project report was submitted on January 9, 2000.
Summary
In summary, the sections provided above introduced the problem to be investigated, the goal to be achieved, and the potential barriers and issues encountered during the completion of the project paper. Also included were the plan and approach for the project along with a timeline of milestones and expectations. In the next chapter, this report provides a review of literature relevant to usability testing and the usability and effectiveness of current Internet search tools.
Chapter 2
Review of the Literature
The literature review that follows is divided into two major sections. The first section reviews literature related to the performance of a usability test. The second section investigates literature pertinent to the usability and effectiveness of current Internet search tools. A review of the literature applicable to these subjects was critical in achieving the project's goal of determining which class of search tool was the most effective and usable for the average Internet user.
Usability Testing
Usability testing focuses on whether a user interface is easy to learn, satisfying to use, and has the functionality that users want (Branaghan, 1999). Its goal is to show the designer how a product might be improved. In traditional usability tests, the tester observes users as they use a product to perform tasks (e.g. locate information on the Internet). During the testing, the evaluator collects both quantitative and qualitative data describing user performance and satisfaction. Usability testing also provides the researcher the opportunity to “pick the user’s brain” by asking follow-up questions.
Jordan
In a recent text, Jordan described how to conduct a usability evaluation (Jordan, 1998). The following were the major topics he covered:
· Purpose of evaluation
· Selecting participants
· Quantitative and qualitative data
· Constraints and opportunities
· Reporting the evaluation
According to Jordan, evaluating an existing product has advantages. Products that have been on the market for a while have an experienced user base. These users are able to report on the positive and negative aspects of using a product in its real context of use (Jordan, 1998).
During the process of selecting participants, evaluators often involve participants who have never used the product (Jordan, 1998). In this situation, evaluators must make some sort of prediction as to the types of problems that might occur. As a result, there is a danger of the evaluation context becoming artificial.
Jordan continued with a description of the two types of data that can be elicited from a usability evaluation: quantitative data and qualitative data. Quantitative data is useful in situations where a design decision has to be made and a number of possible solutions are being considered (Jordan, 1998). Qualitative data is also useful for a number of reasons. First, it can be used as an approximation of quantitative data when making a "first pass" at addressing an issue. Next and more important, qualitative data can be used to diagnose usability faults and prescribe solutions.
Although an evaluator may have a clear idea of the ideal approach to take in a usability evaluation, there will always be constraints that dictate a more realistic approach. Examples, provided by Jordan, of these constraints were time deadlines (i.e. investigator and participant time), available money, investigator knowledge, available participants, and limited facilities and resources.
Jordan also discussed the fact that once a study is completed, it is necessary to report the outcome. The audience typically consists of product managers, designers, engineers, and marketing personnel (Jordan, 1998). The way in which a study is reported is very important. First, it must be clear and persuasive - otherwise it will not be understood or utilized. In addition, a poor reporting style can create hurt feelings and breed hostility. As a result, future usability recommendations are less likely to be implemented.
Rubin
In another text, Rubin defined usability testing as a process that employs participants who are representative of the target population to evaluate the degree to which a product meets specific usability criteria (Rubin, 1994). Rubin's text emphasized the more informal, less complex testing methods designed for quick turnaround of results in industrial product development environments. The following six stages of a usability test were described in detail in the text:
· Developing the test plan
· Selecting and acquiring participants
· Preparing test materials
· Conducting the test
· Debriefing the participant
· Transforming data into findings and recommendations
Rubin described the test plan as the foundation for the entire test. It addresses the how, when, where, who, why, and what of a usability test. The test plan serves as the blueprint for the test. It is also the main communication vehicle between the developer, test monitor, and the rest of the development team.
Another important element of the testing process is the selection of participants whose background and abilities are representative of the product's intended end user group (Rubin, 1994). If the wrong people are tested, test results will be questionable and of limited value. According to Rubin, user profiles should include a description of the most crucial skills, knowledge, demographic information, and other relevant factors required of the typical user of the product.
Next, Rubin discussed how the most labor-intensive aspect of conducting a usability test was developing the test materials that were used to communicate with the participants, collect the data, and satisfy legal requirements (Rubin, 1994). The test materials described were:
· Screening questionnaire
· Orientation script
· Background questionnaire
· Data collection instruments
· Nondisclosure agreement
· Pretest questionnaire
· Task scenarios
· Prerequisite training materials
· Post-test questionnaire
· Debriefing topics guide
In the guidelines for conducting a test, Rubin also provided the following test monitoring guidelines:
· Monitor the session impartially
· Be aware of the effects of your voice and body language
· Treat each participant as an individual
· Do not rescue participants when they struggle
· If you make a mistake, continue on
· Make sure the participants are really finished before going on
· Use humor to relax
· If appropriate, use the "Thinking Aloud" technique
Before "going live" and conducting the test, Rubin recommended a couple of quick checks. First, take the test yourself and look for design flaws (Rubin, 1994). Next, conduct a pilot test using a person who is typical of the product's target audience.
Rubin also noted that debriefing participants was often considered less important than other stages of the testing process (Rubin, 1994). He emphasized that, more often than not, the debriefing session was the key to understanding how to fix the problems uncovered during the performance segment of the test.
According to Rubin, the process of transforming data into findings and recommendations comprises four major steps (Rubin, 1994). Those steps are:
1. Compile and summarize data
2. Analyze data
3. Develop recommendations
4. Produce a final report
In addition to a final report, Rubin recommended creating a presentation of the findings, especially if the test is a part of an overall usability program.
Nielsen
Nielsen also discussed the topic of usability testing (Nielsen, 1993). In the text Usability Engineering, he stressed the importance of paying attention to the issues of reliability and validity. Reliability is a problem in usability testing due to the individual differences in test participants (Nielsen, 1993). Validity becomes an issue for a number of reasons. Typical validity problems are using the wrong users, giving the wrong task, and not including time constraints or social influences. Nielsen went on to discuss the following aspects of usability testing:
· Test Goals and Plans
· Getting Test Users
· Choosing Experimenters
· Ethical Aspects
· Test Tasks
· Stages of a Test
· Performance Measurement
· Thinking Aloud
· Usability Laboratories
Included was a discussion of the importance of conducting a pilot test. Nielsen recommended that pilot test participants be chosen from those conveniently available to the experimenter (Nielsen, 1993). Also discussed was the inclusion of novice users as part of the main test group. When novice users are included, it is often necessary to train them with respect to aspects of a user interface that are both unfamiliar to them and not relevant to the main usability test.
Internet Search Tools
Research into technology to search the Web is plentiful. In fact, the existence of full-text search engines is one of the major differences between the Web and previous means of accessing information (Lawrence & Giles, 1999). Three major classes of search tools are available to the average Internet user (Sullivan, 1998, September 2): search engines (e.g. Yahoo, Lycos, HotBot, and Excite), metasearch sites (e.g. MetaCrawler, Dogpile, and Inference), and desktop search utilities (e.g. Copernic, InfoSeek Express, and BullsEye).
Search Engines
Internet search engines search the Web looking for pages that meet the criteria of an inquiry. Although they are relatively effective, the most common complaints about these search tools are the return of too many pages, and the irrelevancy of those pages (Lawrence & Giles, 1999). These problems stem from the fact that search engines do not rank the relevance of results very well. In a recent article, Lawrence and Giles described new search engine technologies that were addressing this.
The research search engines Google and LASER promise improved results ranking by making greater use of HTML structure and the graph formed by hyperlinks in order to determine page relevancy. Google uses a ranking algorithm called PageRank (Brin & Page, 1998). PageRank iteratively uses information about the number of pages pointing to each page. Google also uses the text in links to a page as a page descriptor - instead of the actual page text.
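The link-based ranking idea described above can be illustrated with a short sketch. The Python code below is only an illustration under stated assumptions, not Google's implementation: the function name pagerank, the damping factor of 0.85, the fixed iteration count, and the three-page example graph are all hypothetical choices made here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate a rank for each page from the link graph.

    links maps each page to the list of pages it links to.
    Hypothetical helper, written for illustration only.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly over all pages.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank equally to the pages it links to.
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * ranks[page] / n
                for other in pages:
                    new_ranks[other] += share
        ranks = new_ranks
    return ranks


# Made-up graph: pages A and B both point to C, and C points back to A.
example_links = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A"],
}
example_ranks = pagerank(example_links)
```

In this made-up graph, page C is pointed to by two pages and ends up with the highest rank, while page B, which no page points to, ends up with the lowest.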
Another novel search technology is employed by Direct Hit. Direct Hit ranks results for a query according to the number of times previous users have clicked on the pages. For example, when one of the major search engines returns more than 160,000 possible Web sites for the search request "Boston Car Dealerships", it is impossible for a person to look at all these sites. However, Direct Hit, working together with major search engines, metasearch sites, and desktop search utilities (e.g. Lycos, MSN, HotBot, LookSmart, Infoseek Express, etc.), is able to dramatically improve this search result. The Direct Hit system keeps track of the sites that people actually select from search results lists. By analyzing the activity of millions of previous Internet searches, it is able to significantly narrow down a results list. In an evaluation conducted by Matthew Lake, HotBot (a Direct Hit partner) was rated the best all-around search engine. Runners-up included AltaVista, Excite, and Infoseek (Lake, 1997).
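The click-popularity approach described above can be sketched in a few lines. The Python code below is a simplified illustration and not Direct Hit's actual system: the function name rerank_by_clicks, the shape of the click log, and the example addresses are all assumptions made here.

```python
from collections import Counter


def rerank_by_clicks(results, click_log):
    """Order results by how often earlier users selected them.

    results is the list of addresses returned by a search engine, in
    its original order. click_log is a list of addresses that previous
    users selected for the same query. Both inputs are hypothetical.
    """
    clicks = Counter(click_log)
    # sorted() is stable, so results with equal click counts keep the
    # engine's original relative order.
    return sorted(results, key=lambda url: -clicks[url])


# Made-up example data, not taken from any real search log.
engine_results = ["dealer-a.example", "dealer-b.example", "dealer-c.example"]
previous_clicks = ["dealer-c.example", "dealer-b.example", "dealer-c.example"]
reranked = rerank_by_clicks(engine_results, previous_clicks)
```

Here the address selected most often by earlier users moves to the top of the list, and an address that was never selected falls to the bottom.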
Metasearch Sites
Metasearch sites were developed in response to the increasing availability of conventional search engines and to solve the problem of knowing which one to use (Dreilinger & Howe, 1997, July). Selberg and Etzioni explained that metasearch sites are tools that automatically and simultaneously query several Internet search engines (Selberg & Etzioni, 1995). These sites also interpret the results and display them in a uniform format. The primary advantage of metasearch sites over conventional search engines is their ability to combine the results of multiple search engines and the ability to provide a consistent user interface for searching these engines.
However, Selberg and Etzioni go on to explain that metasearch sites introduce their own deficiencies. For example, they have difficulty ranking the list of results. If one engine returns many low-relevance documents, these documents may make it more difficult to find pages that are more relevant. In addition, most metasearch sites limit the number of results that can be obtained. Also, they do not support all the query language features of each specific engine. Another limitation is that metasearch sites only spend a short time in each database and often retrieve only 10 percent of the available results (Barker, 1999).
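The merging step that makes ranking difficult for metasearch sites can be made concrete with a small sketch. The Python code below is an assumption for illustration, not the method used by MetaCrawler or any other site named above: it scores each address by summing the reciprocal of its position in each engine's list, which is only one of many possible merging rules. The function name merge_results and the engine names are hypothetical.

```python
def merge_results(engine_results):
    """Combine ranked lists from several engines into one uniform list.

    engine_results maps a hypothetical engine name to its ranked list
    of addresses. Each address earns 1/position from every engine that
    returned it, and duplicates are collapsed into a single entry.
    """
    scores = {}
    sources = {}
    for engine, urls in engine_results.items():
        for position, url in enumerate(urls, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / position
            sources.setdefault(url, []).append(engine)

    # Highest combined score first; ties are broken alphabetically.
    ordered = sorted(scores, key=lambda url: (-scores[url], url))
    return [
        {"url": url, "score": scores[url], "engines": sources[url]}
        for url in ordered
    ]


# Made-up example data for two hypothetical engines.
merged = merge_results({
    "engine_a": ["x.example", "y.example"],
    "engine_b": ["y.example", "z.example"],
})
```

In this example the address returned by both engines rises to the top. A rule like this also shows the weakness noted above: an engine that returns many low-relevance documents still contributes a score for each of them.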
Desktop Search Utilities
Desktop search utilities were developed to overcome the deficiencies of the search engines and metasearch sites described above. In a recent article, Shachtman discussed the growing use of the tools in this classification (Shachtman, 1999). The top-rated desktop search utility is currently Copernic 2000. Copernic provides access to 55 information sources (e.g. AltaVista, Deja.com, Excite, HotBot, Infoseek, Lycos, Magellan, WebCrawler, and Yahoo!). Usability features include a search wizard, keyword highlighting in results, a detailed search history, and automatic software updating.
Copernic's other search management functions include a relevancy score for each result and the removal of duplicates. It also removes dead links and downloads documents for off-line viewing. Results can be sorted by relevancy, title, address, or search engine. Other useful features include the ability to export and save search results in various formats (HTML, XML, text, and DBF). Copernic also integrates tightly with Microsoft Internet Explorer.
Summary
The literature review given above was divided into two major sections. The first section reviewed literature related to the performance of a usability test. The second section investigated literature pertinent to the usability and effectiveness of current Internet search tools. The following chapter describes the research methods, online tools, and resources that were employed in completing the project report. Included is the test plan for conducting the usability testing (i.e. purpose, user profile, methodology, task list, test environment and equipment requirements, evaluator role, and evaluation measures).
Chapter 3
Methodology
Research Type
This project paper is a research-based descriptive study. The key outcome of the investigation was the determination, through review of the current literature and usability testing, of which class of search tool was the most effective and usable for the average Internet user.
Research Methods Employed
Two research methods were employed during the course of this project. The first involved searching online electronic resources to locate relevant literature. The literature located and reviewed included textbooks, conference proceedings, white papers, Web site reviews, and journal and magazine articles. Relevant texts were located, ordered, and delivered using the Amazon.com Internet site. Full text journal and magazine articles, conference proceedings, and white papers were also located and downloaded from a collection of online electronic resources.
A number of online electronic resources were used to locate and download the literature described above. These resources included ACM Search (www.acm.org/dl/Search.html), Electric Library (www.elibrary.com), Gartner Group (www.gartner.com), IEEE (www.ieee.org/web/search/), and ProQuest Direct (proquest.umi.com). Perhaps the most powerful search tool employed during the course of the project was the desktop search utility, Copernic 2000.
Copernic is a well-documented freeware search agent. It uses predefined channel sets, which allow researchers to target inquiries to all major Web search engines, search for relevant text in newsgroups, and access popular e-mail directories to find people (Copernic, 1999). Copernic conducts fast, multithreaded, full Boolean searches with progress displays and customizable search depth. Once results are compiled, Copernic displays results (including name, location, and introductory text) in a right-click-enhanced list box sorted by relevance.
The second research method employed during this project was conducting usability tests of two leading Internet search tools: HotBot and Copernic. The purpose of the testing was to determine which of the two tools was the most effective and usable for the average Internet user - their target audience. The following section describes in detail the methodology used to conduct the usability testing. In line with Rubin's six stages for conducting a usability test (Rubin, 1994), the testing began with the development of a test plan.
Test Plan
The following is the test plan used to conduct the usability test of the Internet search tools - HotBot and Copernic. The plan covers the following sections:
· Purpose
· User Profile
· Methodology
· Task list
· Test environment and equipment requirements
· Evaluator role
· Evaluation measures
Purpose
The main purpose of the test was to determine which Internet search tool (i.e. HotBot or Copernic) was the most effective and usable for the average Internet user. The test measured the time it took to complete the assigned tasks. In addition, it identified the errors and difficulties involved in using the two search tools to search specific topics over the Internet.
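The two measures named above, task time and errors, could be tabulated per tool with a short script such as the following sketch. The function name summarize, the record layout, and the placeholder values are all assumptions made for illustration. The values are not data from this study.

```python
from statistics import mean


def summarize(observations):
    """Return the mean task time and total errors for each tool.

    observations is a list of (tool, seconds, errors) tuples, one per
    completed task. This record layout is hypothetical.
    """
    by_tool = {}
    for tool, seconds, errors in observations:
        entry = by_tool.setdefault(tool, {"times": [], "errors": 0})
        entry["times"].append(seconds)
        entry["errors"] += errors
    return {
        tool: {
            "mean_seconds": mean(entry["times"]),
            "total_errors": entry["errors"],
        }
        for tool, entry in by_tool.items()
    }


# Placeholder values only, chosen to show the calculation.
placeholder_observations = [
    ("Tool X", 100, 1),
    ("Tool X", 200, 0),
    ("Tool Y", 90, 2),
]
summary = summarize(placeholder_observations)
```

A per-tool summary of this kind makes it possible to compare the tools on both measures at once, since a tool with shorter task times may still produce more errors.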
User Profile
Four users were tested on December 24, 1999 at a private residence in Highland Village, Texas. All participants were tested separately, and the results of the first user were used as a pilot to correct test deficiencies. The pilot user (an HCI student) was selected because of her availability and understanding of the usability testing process. As recommended by Rubin (Rubin, 1994), t