Internet Search Tool Usability Evaluation

 

by

 

Ronald G. Wolak

wolakron@scis.nova.edu

 

 

 

 

 

 

 

 

 

 

 

 

 

A paper submitted in fulfillment of the requirements

for DISS 720 - Assignment Three

 

 

 

 

 

 

School of Computer and Information Sciences

Nova Southeastern University

 

January 2000

 


An Abstract of a Paper Submitted to Nova Southeastern University

in Fulfillment of the Requirements for DISS 720 - Assignment Three

 

 

Internet Search Tool Usability Evaluation

 

by

Ronald G. Wolak

 

January 2000

 

 

The quantity of public information available on the Internet is increasing rapidly. In fact, the Internet is currently a 15 billion word digital library. Searching for information is the primary task of users on the Internet. As a result, new and powerful search tools are being built every day. These search tools fall into three major classifications: search engines (e.g. Yahoo, Lycos, HotBot, and Excite), metasearch sites (e.g. MetaCrawler, Dogpile, and Inference), and desktop search utilities (e.g. Copernic, InfoSeek Express, and BullsEye). The sizable assortment of Internet search tools presents users with the task of choosing the most effective and usable tool. The problem investigated in the following pages was which Internet search tool provided average users the most efficient and usable method to find what they were looking for on the Internet. The goal of the paper was to determine, through research of the current literature and usability testing, which class of tool was the most effective and usable for the average Internet user. In addition, based upon the results of usability testing, recommendations were made for improving the interfaces and functionality of the tools tested.

 


 

Table of Contents

 

 

Abstract

 

Chapters

 

1.  Introduction

       Problem Statement and Goal

       Relevance

       Barriers and Issues

       Plan and Approach

       Milestones

       Summary

 

2.  Review of the Literature

       Usability Testing

       Internet Search Tools

       Summary

 

3.  Methodology

       Research Type

       Research Methods Employed

       Test Plan

       Summary

 

4.  Results

       Pilot User

       User 1

       User 2

       User 3

       Findings

 

5.  Summary

        Recommendations

        Conclusion

 

Appendixes

A. HotBot User Interface Screen Print

B. HotBot Search Results Screen Print

C. Copernic User Interface and Search Results Screen Print

D. Usability Study Instructions and Mid-Test Questionnaires

E. Post-Test Questionnaire

F. Mid-Test Questionnaire Results

G. Post-Test Questionnaire Results

H. User Profiles

I. Think Aloud Observation

J. Task Duration

K. User Errors

 

Reference List

 


 

 

 

Chapter 1

Introduction

 

This project report is submitted to fulfill the requirements for DISS 720 - Assignment Three. The following introduction describes the problem to be investigated, the goal to be achieved, and the barriers and issues encountered during the completion of the report. The introduction also provides the plan and approach of the project along with a timeline of milestones.

Problem Statement and Goal

The Internet has revolutionized the way we access information. The amount of public information available on the Web is increasing rapidly (Lawrence & Giles, 1999). In fact, the Internet is currently a 15 billion word digital library. Searching for information is the primary task of users on the Internet (Adali, Bufi, & Temtanapat, 1997). As a result, new and powerful search tools are being built every day. Even though these search tools are more advanced than those available in the past, the average user may not be benefiting fully from these state-of-the-art search technologies.

Internet search tools fall into three major classifications (Sullivan, 1998, September 2): search engines (e.g. Yahoo, Lycos, HotBot, and Excite), metasearch sites (e.g. MetaCrawler, Dogpile, and Inference), and desktop search utilities (e.g. Copernic, InfoSeek Express, and BullsEye). The sizable assortment of Internet search tools presents users with the task of choosing the most effective and usable tool. The problem investigated in the following pages was which Internet search tool provided average users the most efficient and usable method to find what they were looking for on the Internet.

The goal of this paper was to determine, through research of the current literature and usability testing, which class of tool was the most effective and usable for the average Internet user. In addition, based upon the results of usability testing, recommendations were made for improving the interfaces and functionality of the tools tested. In order to finish within the given time constraints, only the leading tools from the search engine and desktop search utility classifications were evaluated. These were the HotBot search engine and the Copernic search utility (Lake, 1997). A tool from the metasearch site classification was not included since Copernic duplicated metasearch site functionality among its features.

Relevance

This report is relevant to the study of human-computer interaction. Millions of users worldwide employ Internet search tools at work and home on a daily basis (Shachtman, 1999). Even the best-designed Web sites fall short if users have trouble finding them or navigating their contents. Search tools are typically the best way for users to work around these problems. Determining which is the most usable and effective search tool for the average user would serve to reduce both Internet search time and user frustration.

Barriers and Issues

The primary barrier to the successful completion of this paper was the vast quantity of research material related to Internet search tools. This material needed to be gathered, compiled, filtered, and evaluated to determine its appropriateness to the goal of the paper. Successful project completion was also complicated by the Internet connection problems that occurred during a portion of the usability testing.

Plan and Approach

This project report is a descriptive study formatted in five chapters. The first chapter covers the project's problem statement and goal, relevance, barriers and issues, plan and approach, and milestones and expectations. The second chapter provides a detailed review of the literature relevant to usability testing, and the usability and effectiveness of current Internet search tools.

The third chapter describes the research methods, online tools, and resources that were employed in completing the project report. Included are the strategies for conducting the usability test (i.e. test goals, test methods, identification of subjects, task development, task order and priority, test scenario, and the collection of data).

The fourth chapter begins with a detailed presentation of the usability test results. These results are discussed in relation to the reviewed literature. In addition, unique events that occurred during the testing process are highlighted. Major variables associated with the usability of the two search tools are identified. These include learning, performance, error recovery factors, and user attitude, along with the effort required to complete the task. In the fifth chapter, recommendations for improving the interface and functionality of the two search tools are given.

Milestones

The scope of the project report was manageable and lent itself to investigation within the given time period. The following is a summary of the milestones for the project along with significant dates. The first milestone, determining the problem and goal of the project, was completed on December 11, 1999.

The introduction, chapter one, was completed on December 19, 1999. This was followed by completion of the review of literature, chapter two, on December 28, 1999. Methodology, chapter three, was completed on January 2, 2000, and chapter four was completed shortly thereafter on January 7, 2000. Chapter five was completed on January 8, 2000. After extensive review and proofreading, the project report was submitted on January 9, 2000.

Summary

In summary, the sections provided above introduced the problem to be investigated, the goal to be achieved, and the potential barriers and issues encountered during the completion of the project paper. Also included were the plan and approach for the project along with a timeline of milestones and expectations. In the next chapter, this report provides a review of literature relevant to usability testing and the usability and effectiveness of current Internet search tools.


 

Chapter 2

Review of the Literature

 

The literature review that follows is divided into two major sections. The first section reviews literature related to the performance of a usability test. The second section investigates literature pertinent to the usability and effectiveness of current Internet search tools. A review of the literature applicable to these subjects was critical in achieving the project's goal of determining which class of search tool was the most effective and usable for the average Internet user.

Usability Testing

            Usability testing focuses on whether a user interface is easy to learn, satisfying to use, and has the functionality that users want (Branaghan, 1999). Its goal is to show the designer how a product might be improved. In traditional usability tests, the tester observes users as they use a product to perform tasks (e.g. locate information on the Internet). During the testing, the evaluator collects both quantitative and qualitative data describing user performance and satisfaction. Usability testing also provides the researcher the opportunity to “pick the user’s brain” by asking follow-up questions.

Jordan

            In a recent text, Jordan described how to conduct a usability evaluation (Jordan, 1998). The following were the major topics he covered:

·        Purpose of evaluation

·        Selecting participants

·        Quantitative and qualitative data

·        Constraints and opportunities

·        Reporting the evaluation

According to Jordan, evaluating an existing product has advantages. Products that have been on the market for a while have an experienced user base. These users are able to report on the positive and negative aspects of using a product in its real context of use (Jordan, 1998).

            During the process of selecting participants, evaluators often involve participants who have never used the product (Jordan, 1998). In this situation, evaluators must make some sort of prediction as to the types of problems that might occur. As a result, there is a danger of the evaluation context becoming artificial.

            Jordan continued with a description of the two types of data that can be elicited from a usability evaluation: quantitative data and qualitative data. Quantitative data is useful in situations where a design decision has to be made and a number of possible solutions are being considered (Jordan, 1998). Qualitative data is also useful for a number of reasons. First, it can be used as an approximation of quantitative data when making a "first pass" at addressing an issue. Next and more important, qualitative data can be used to diagnose usability faults and prescribe solutions.

            Although an evaluator may have a clear idea of the ideal approach to take in a usability evaluation, there will always be constraints that dictate a more realistic approach. Examples, provided by Jordan, of these constraints were time deadlines (i.e. investigator and participant time), available money, investigator knowledge, available participants, and limited facilities and resources.

            Jordan also discussed the fact that once a study is completed, it is necessary to report the outcome. The audience typically consists of product managers, designers, engineers, and marketing personnel (Jordan, 1998). The way in which a study is reported is very important. First, it must be clear and persuasive - otherwise it will not be understood or utilized. In addition, a poor reporting style can create hurt feelings and breed hostility. As a result, future usability recommendations are less likely to be implemented.

Rubin

            In another text, Rubin defined usability testing as a process that employs participants who are representative of the target population to evaluate the degree to which a product meets specific usability criteria (Rubin, 1994). Rubin's text emphasized the more informal, less complex testing methods designed for quick turnaround of results in industrial product development environments. The following six stages of a usability test were described in detail in the text:

·        Developing the test plan

·        Selecting and acquiring participants

·        Preparing test materials

·        Conducting the test

·        Debriefing the participant

·        Transforming data into findings and recommendations

Rubin described the test plan as the foundation for the entire test. It addresses the how, when, where, who, why, and what of a usability test. The test plan serves as the blueprint for the test. It is also the main communication vehicle between the developer, test monitor, and the rest of the development team.

            Another important element of the testing process is the selection of participants whose background and abilities are representative of the product's intended end user group (Rubin, 1994). If the wrong people are tested, test results will be questionable and of limited value. According to Rubin, user profiles should include a description of the most crucial skills, knowledge, demographic information, and other relevant factors required of the typical user of the product.

            Next, Rubin discussed how the most labor intensive aspect of conducting a usability test was developing the test materials that were used to communicate with the participants, collect the data, and satisfy legal requirements (Rubin, 1994). The test materials described were:

·        Screening questionnaire

·        Orientation script

·        Background questionnaire

·        Data collection instruments

·        Nondisclosure agreement

·        Pretest questionnaire

·        Task scenarios

·        Prerequisite training materials

·        Post-test questionnaire

·        Debriefing topics guide

In the guidelines for conducting a test, Rubin also provided the following test monitoring guidelines:

·        Monitor the session impartially

·        Be aware of the effects of your voice and body language

·        Treat each participant as an individual

·        Do not rescue participants when they struggle

·        If you make a mistake, continue on

·        Make sure the participants are really finished before going on

·        Use humor to relax

·        If appropriate, use the "Thinking Aloud" technique

Before "going live" and conducting the test, Rubin recommended a couple of quick checks. First, take the test yourself and look for design flaws (Rubin, 1994). Next, conduct a pilot test using a person who is typical of the product's target audience.

Rubin also noted that debriefing participants is often considered less important than the other stages of the testing process (Rubin, 1994). He emphasized that, more often than not, the debriefing session was the key to understanding how to fix the problems uncovered during the performance segment of the test.

According to Rubin, the process of transforming data into findings and recommendations comprises four major steps (Rubin, 1994). Those steps are:

1.      Compile and summarize data

2.      Analyze data

3.      Develop recommendations

4.      Produce a final report

In addition to a final report, Rubin recommended creating a presentation of the findings, especially if the test is a part of an overall usability program.

Nielsen

            Nielsen also discussed the topic of usability testing (Nielsen, 1993). In the text Usability Engineering, he stressed the importance of paying attention to the issues of reliability and validity. Reliability is a problem in usability testing due to the individual differences in test participants (Nielsen, 1993). Validity becomes an issue for a number of reasons. Typical validity problems are using the wrong users, giving the wrong task, and not including time constraints or social influences. Nielsen went on to discuss the following aspects of usability testing:

·        Test Goals and Plans

·        Getting Test Users

·        Choosing Experimenters

·        Ethical Aspects

·        Test Tasks

·        Stages of a Test

·        Performance Measurement

·        Thinking Aloud

·        Usability Laboratories

Included was a discussion of the importance of conducting a pilot test. Nielsen recommended that pilot test participants be chosen from those conveniently available to the experimenter (Nielsen, 1993). Also discussed was the inclusion of novice users as part of the main test group. As a result, it is often necessary to train users with respect to aspects of a user interface that are both unfamiliar to them and not relevant to the main usability test.

Internet Search Tools

            Research into technology to search the Web is plentiful. In fact, the existence of full-text search engines is one of the major differences between the Web and previous means of accessing information (Lawrence & Giles, 1999). Three major classes of search tools are available to the average Internet user (Sullivan, 1998, September 2): search engines (e.g. Yahoo, Lycos, HotBot, and Excite), metasearch sites (e.g. MetaCrawler, Dogpile, and Inference), and desktop search utilities (e.g. Copernic, InfoSeek Express, and BullsEye).

Search Engines

            Internet search engines search the Web looking for pages that meet the criteria of an inquiry. Although they are relatively effective, the most common complaints about these search tools are the return of too many pages, and the irrelevancy of those pages (Lawrence & Giles, 1999). These problems stem from the fact that search engines do not rank the relevance of results very well. In a recent article, Lawrence and Giles described new search engine technologies that were addressing this.

Research search engines, Google and LASER, promise improved results ranking by making greater use of HTML structure and the graph formed by hyperlinks in order to determine page relevancy. Google uses a ranking algorithm called PageRank (Brin & Page, 1998). PageRank iteratively uses information from the number of pages pointing to each page. Google also uses the text in links to a page as a page descriptor - instead of the actual page text.
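The link-based ranking idea described above can be illustrated with a short sketch. The following Python example is a minimal power-iteration version of PageRank under stated assumptions: the four-page link graph is hypothetical, the function name and the iteration count are chosen only for illustration, and pages without outgoing links simply share their rank evenly among all pages. It is not Google's production implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page from the pages that point to it.

    links maps each page name to the list of pages it links to.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small base share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web used only for illustration.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
```

In this toy graph, page "c" receives links from three of the other pages and ends up with the highest rank, while page "d", which no page points to, ends up with the lowest.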

Another novel search technology is employed by Direct Hit. Direct Hit ranks results for a query according to the number of times previous users have clicked on the pages. For example, when one of the major search engines returns more than 160,000 possible Web sites for the search request "Boston Car Dealerships", it is impossible for a person to look at all these sites. However, Direct Hit, working together with major search engines, metasearch sites, and desktop search utilities (e.g. Lycos, MSN, HotBot, LookSmart, Infoseek Express, etc.), is able to dramatically improve this search result. The Direct Hit system keeps track of the sites that people actually select from search results lists. By analyzing the activity of millions of previous Internet searches, it is able to significantly narrow down a results list. In an evaluation conducted by Matthew Lake, HotBot (a Direct Hit partner) was rated the best all-around search engine. Runners-up included AltaVista, Excite, and Infoseek (Lake, 1997).
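The click-popularity approach can be sketched in a few lines of Python. This is only an illustration of the general idea, not Direct Hit's actual (proprietary) method: the function name, the example addresses, and the click log are all hypothetical.

```python
from collections import Counter


def rerank_by_clicks(results, click_log):
    """Order results by how often earlier users selected each one."""
    clicks = Counter(click_log)
    # sorted() is stable, so results with equal click counts keep the
    # order in which the search engine originally returned them.
    return sorted(results, key=lambda url: -clicks[url])


# Hypothetical engine output and click history for one query.
results = ["dealer-a.example", "dealer-b.example", "dealer-c.example"]
click_log = ["dealer-c.example", "dealer-b.example", "dealer-c.example"]

reranked = rerank_by_clicks(results, click_log)
# reranked == ["dealer-c.example", "dealer-b.example", "dealer-a.example"]
```

A results list reordered this way can then be truncated to the most frequently selected sites, which is one way a very long list could be narrowed down.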

Metasearch Sites

Metasearch sites were developed in response to the increasing availability of conventional search engines and to solve the problem of knowing which one to use (Dreilinger & Howe, 1997, July). Selberg and Etzioni explained that metasearch sites are tools that automatically and simultaneously query several Internet search engines (Selberg & Etzioni, 1995). These sites also interpret the results and display them in a uniform format. The primary advantage of metasearch sites over conventional search engines is their ability to combine the results of multiple search engines and the ability to provide a consistent user interface for searching these engines.

            However, Selberg and Etzioni went on to explain that metasearch sites introduce their own deficiencies. For example, they have difficulty ranking the list of results. If one engine returns many low-relevance documents, these documents may make it more difficult to find pages that are more relevant. In addition, most metasearch sites limit the number of results that can be obtained. Also, they do not support all the query language features of each specific engine. Another limitation is that metasearch sites only spend a short time in each database and often retrieve only 10 percent of the available results (Barker, 1999).
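The merging step that metasearch sites perform, and the ranking difficulty noted above, can be made concrete with a small Python sketch. It assumes the result lists have already been retrieved from each engine; the engine names, addresses, and the reciprocal-position scoring rule are illustrative assumptions, not the method used by MetaCrawler or any other named site.

```python
def merge_results(results_by_engine, limit=10):
    """Combine ranked lists from several engines into one uniform list."""
    scores = {}
    sources = {}
    for engine, urls in results_by_engine.items():
        for position, url in enumerate(urls, start=1):
            # Treat addresses that differ only by case or a trailing slash
            # as duplicates of the same page.
            key = url.lower().rstrip("/")
            # Assumed scoring rule: a result earns 1/position from each
            # engine that returned it, so agreement between engines helps.
            scores[key] = scores.get(key, 0.0) + 1.0 / position
            sources.setdefault(key, []).append(engine)

    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    # Most metasearch sites cap the number of results they return.
    return [
        {"url": key, "score": round(scores[key], 3), "engines": sources[key]}
        for key in ranked[:limit]
    ]


# Hypothetical result lists from two engines.
sample = {
    "engine_one": ["http://example.com/a", "http://example.com/b"],
    "engine_two": ["http://example.com/b/", "http://example.com/c"],
}
merged = merge_results(sample)
```

Here the page returned by both engines rises to the top of the merged list. The sketch also shows why ranking is hard for a metasearch site: positions from different engines are not directly comparable, so any combining rule is a heuristic.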

Desktop Search Utilities

            Desktop search utilities were developed to overcome the deficiencies of the search engines and metasearch sites described above. In a recent article, Shachtman discussed the growing use of the tools in this classification (Shachtman, 1999). The top-rated desktop search utility is currently Copernic 2000. Copernic provides access to 55 information sources (e.g. AltaVista, Deja.com, Excite, HotBot, Infoseek, Lycos, Magellan, WebCrawler, and Yahoo!). Usability features include a search wizard, keyword highlighting in results, a detailed search history, and automatic software updating.

            Copernic's other search management functions include a relevancy score for each result and the removal of duplicates. It also removes dead links and downloads documents for off-line viewing. Results can be sorted by relevancy, title, address, or search engine. Other useful features include the ability to export and save search results in various formats (HTML, XML, text, and DBF). Copernic also integrates tightly with Microsoft Internet Explorer.

Summary

The literature review given above was divided into two major sections. The first section reviewed literature related to the performance of a usability test. The second section investigated literature pertinent to the usability and effectiveness of current Internet search tools. The following chapter describes the research methods, online tools, and resources that were employed in completing the project report. Included is the test plan for conducting the usability testing (i.e. purpose, user profile, methodology, task list, test environment and equipment requirements, evaluator role, and evaluation measures).


 

Chapter 3

Methodology

 

Research Type

This project paper is a research-based descriptive study. The key outcome of the investigation was the determination, through review of the current literature and usability testing, of which class of search tool was the most effective and usable for the average Internet user.

Research Methods Employed

Two research methods were employed during the course of this project. The first involved searching online electronic resources to locate relevant literature. The literature located and reviewed included textbooks, conference proceedings, white papers, Web site reviews, and journal and magazine articles. Relevant texts were located, ordered, and delivered using the Amazon.com Internet site. Full text journal and magazine articles, conference proceedings, and white papers were also located and downloaded from a collection of online electronic resources.

A number of online electronic resources were used to locate and download the literature described above. These resources included ACM Search (www.acm.org/dl/Search.html), Electric Library (www.elibrary.com), Gartner Group (www.gartner.com), IEEE (www.ieee.org/web/search/), and ProQuest Direct (proquest.umi.com). Perhaps the most powerful search tool to be employed during the course of the project was the desktop search utility, Copernic 2000.

Copernic is a well-documented freeware search agent. It uses predefined channel sets, which allow researchers to target inquiries to all major Web search engines, search for relevant text in newsgroups, and access popular e-mail directories to find people (Copernic, 1999). Copernic conducts fast, multithreaded, full Boolean searches with progress displays and customizable search depth. Once results are compiled, Copernic displays results (including name, location, and introductory text) in a right-click-enhanced list box sorted by relevance.

The second research method employed during this project was conducting usability tests of two leading Internet search tools: HotBot and Copernic. The purpose of the testing was to determine which of the two tools was the most effective and usable for the average Internet user - their target audience. The following section describes in detail the methodology used to conduct the usability testing. In line with Rubin's six stages for conducting a usability test (Rubin, 1994), the testing began with the development of a test plan.

Test Plan

            The following is the test plan used to conduct the usability test of the Internet search tools - HotBot and Copernic. The plan covers the following sections:

·        Purpose

·        User Profile

·        Methodology

·        Task list

·        Test environment and equipment requirements

·        Evaluator role

·        Evaluation measures

Purpose

            The main purpose of the test was to determine which Internet search tool (i.e. HotBot or Copernic) was the most effective and usable for the average Internet user. The test measured the time it took to complete the assigned tasks. In addition, it identified the errors and difficulties involved in using the two search tools to search specific topics over the Internet.

User Profile

            Four users were tested on December 24, 1999 at a private residence in Highland Village, Texas. All participants were tested separately, and the results of the first user were used as a pilot to correct test deficiencies. The pilot user (an HCI student) was selected because of her availability and understanding of the usability testing process. As recommended by Rubin (Rubin, 1994), the evaluator was the first to take the test. He was followed by the pilot user. Deficiencies (i.e. duplicate test questions, spelling and grammatical errors, and formatting problems) were corrected before testing the main user group.

The three users in the main group (all members of the evaluator's immediate family) were chosen for both their experience and lack of experience in searching the Internet. This was in line with Nielsen's recommendation that most user interfaces need to be tested with both novice and experienced users (Nielsen, 1993).

Methodology

            The usability test consisted of a performance test that was designed to gather both quantitative and qualitative usability data via paper-and-pencil questionnaires, direct observation, and the thinking aloud technique. The performance test consisted of the following four sections:

1. Participant greeting and verbal background questionnaire

Each participant was personally greeted by the evaluator and made to feel comfortable and relaxed. The participants were then asked to review the first two sections of the Usability Study Instructions and Mid-Test Questionnaires document shown in Appendix D. This included a brief overview of the study as well as a statement of informed consent to be signed.

2. Orientation

Next, each participant received a short, verbal orientation to the test. Background information was gathered at this time. Also, in line with Rubin's recommendations (Rubin, 1994), each user was assured that the search tools, and not the users themselves, were the subject of the evaluation.

3. Performance test

The performance test consisted of a series of tasks that the participants were asked to carry out while being observed. A time limit of ten minutes was placed on each task. The scenario was as follows:

·        After the orientation, participants were asked to read the first task (see Appendix D), execute it, and fill out the mid-test questions that applied to it (see Appendix D). This was repeated until all six tasks were completed. Each of the tasks required the user to search for a topic on the Internet using each of the two search tools. During the performance test, elapsed time was recorded at the end of each task when the users saved their best answer (i.e. Web page showing the requested information) to disk. This provided an accurate date and time stamp for the event without having to stand over the user with a stopwatch.

·        As recommended by Rubin (Rubin, 1994) and Nielsen (Nielsen, 1993), the thinking aloud technique was employed throughout the execution of the performance testing. The thinking aloud technique required users to talk during the test, telling the evaluator what they were thinking as they searched the Internet for the requested information. The purpose of this procedure was to help the evaluator understand what was going on inside the user's mind (Branaghan, 1999). Since this is not a common thing for people to do, the evaluator reminded users several times to continue to think aloud.
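The timing approach described in the scenario, in which the date and time stamp of each saved Web page marks the end of a task, can be sketched in Python as follows. The function names and the example times are hypothetical, and the sketch assumes that each task begins the moment the previous answer is saved, so a computed duration would also include any time spent on the mid-test questions.

```python
import os
from datetime import datetime, timedelta


def save_time(path):
    """Return the last-modified time stamp of a saved answer file."""
    return datetime.fromtimestamp(os.path.getmtime(path))


def task_durations(start, save_times):
    """Turn a session start time and per-task save times into durations."""
    durations = []
    previous = start
    for saved in save_times:
        durations.append(saved - previous)
        previous = saved
    return durations


def over_limit(durations, limit=timedelta(minutes=10)):
    """List the task numbers (starting at 1) that exceeded the time limit."""
    return [number for number, d in enumerate(durations, start=1) if d > limit]


# Hypothetical session: the times below are placeholders, not study data.
start = datetime(1999, 12, 24, 10, 0)
saves = [
    datetime(1999, 12, 24, 10, 4),
    datetime(1999, 12, 24, 10, 15),
    datetime(1999, 12, 24, 10, 18),
]
durations = task_durations(start, saves)
late_tasks = over_limit(durations)
```

In practice, the list of save times could be built by calling save_time() on each saved answer file in task order.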

4. Participant debriefing

After all the tasks were completed or the time had expired, each participant, as recommended by Rubin (Rubin, 1994), was debriefed by the evaluator. The debriefing included filling out the Post-Test Questionnaire (see Appendix E) and listening to the user's overall comments about his or her performance. The debriefing session served two important functions. First, it allowed the participants to say whatever they wanted, which was important because some of the tasks were frustrating. In addition, it provided important information about each participant's rationale for performing specific actions.

Task List

            The six tasks detailed in Appendix D were selected for a number of reasons. First, the time to complete all six needed to be less than 30 minutes. As recommended by Nielsen (Nielsen, 1993), the tasks were small enough to complete within the time allotted, but were not so small as to be trivial. In addition, the tasks were ordered to build the user's confidence (i.e. the easiest first and the hardest last).

Test Environment and Equipment Requirements

            The test environment that was provided to each participant consisted of the following:

  1. Quiet area - located in the residence's dining room
  2. Large work surface - the dining room table
  3. Laptop computer with a wheel-mouse, full keyboard, and large display
  4. Internet Explorer 5.0 browser software
  5. Copernic 2000 Pro search utility
  6. Pencil and paper
  7. Usability Study Instructions and Mid-Test Questionnaires document
  8. Dial-up Internet connection using NetZero.net

Evaluator Role

            During the performance test, the evaluator sat in the room with each participant. As suggested by Rubin (Rubin, 1994), participants were not rescued when having difficulty. Users were encouraged to think aloud and prompted to move on when a task's time limit had expired. In addition, the evaluator recorded errors, verbal comments, and each user's general disposition.

Evaluation Measures

            The following evaluation measures were collected:

1.      User profiles

2.      Total time to complete each task and each grouping of tasks

3.      Think aloud observations

4.      Quantity of user errors

Summary

Chapter 3 was divided into three major sections. The first section described the research type of the project. The second section discussed the two research methods employed: online resource search and usability testing. The following chapter presents the results of the usability test.

 


 

Chapter 4

Results

 

The results of the usability test are presented in the following pages. These include sections that detail results in the following seven areas: profile, mid-test questionnaire results, post-test questionnaire results, think aloud observations, task duration, user errors, and unique events.

Pilot User

            Results from the pilot user test were not used to determine which was the most effective and usable Internet search tool. However, feedback from the pilot user improved the testing process by exposing numerous spelling, grammar, and format issues, which were then corrected. In addition, the pilot user's difficulties with an unreliable Internet connection led to the correction of that problem.

User 1

Profile

            User 1 was a 37-year-old woman with a high school education (see Appendix H). She had four years of Internet experience and had designed her own Internet site. In addition, she was very familiar with both search tools, having used each for at least six months. She had a definite predilection for Copernic and its ability to save time and find relevant results. As Jordan pointed out (Jordan, 1998), her familiarity with these existing products allowed her to report on the positive and negative aspects of using the products in real life.

Mid-Test Questionnaire Results

            The results of these questions (see Appendix F) indicated that the user was very satisfied with Copernic and dissatisfied with HotBot. Copernic produced relevant results and HotBot did not.

Post-Test Questionnaire Results

            The post-test questionnaire results (see Appendix G) confirmed the results of the mid-test questionnaire. User 1 found that Copernic excelled in virtually all of Nielsen's ten usability heuristics (Nielsen, 1994) while HotBot did not.

Think Aloud Observations

            User 1's think aloud observations also confirmed the results of the questionnaires. The user found Copernic's user interface (see Appendix C) to be more complicated than HotBot's simple search box (see Appendix A). However, HotBot's irrelevant search results (see Appendix B) validated the need for Copernic's additional functionality.

Task Duration

            The task duration data for User 1 (see Appendix J) indicated that Copernic took five minutes less than HotBot (12 minutes for Copernic versus 17 minutes for HotBot).

User Errors

            User 1's error rate with Copernic was less than half the rate when using HotBot (see Appendix K).

Unique Events

            User 1 became frustrated by HotBot's lack of relevant results and discontinued searching on two occasions.

Summary

            Test results indicated that Copernic was the most effective and usable search tool. Task durations and errors were low while relevancy was high. These results also confirm Shachtman's conclusions about the growing use of desktop search utilities (Shachtman, 1999).

User 2

Profile

            User 2 was a teenager in high school (see Appendix H) with three years of Internet experience. He was not familiar with either search tool.

Mid-Test Questionnaire Results

            The results of these questions (see Appendix F) indicated that as the tasks became more difficult the user remained satisfied with Copernic but became unsatisfied with HotBot.

Post-Test Questionnaire Results

            The post-test questionnaire results (see Appendix G) indicated that User 2 found Copernic to be a little slow and more complex for a new user. HotBot was very easy to use and produced results quickly. However, HotBot's results were much less relevant than Copernic's results.

Think Aloud Observations

            User 2's think aloud observations confirmed the results of the questionnaires (see Appendix I).

Task Duration

            The task duration data for User 2 (see Appendix J) contradicted the user's statement that HotBot was "a lot" faster than Copernic. Overall, Copernic took eight minutes and HotBot seven minutes to complete all the tasks. Each individual Copernic search did take noticeably longer, but Copernic made up the time by requiring fewer user-initiated searches.

User Errors

            The user produced two errors while using HotBot and zero while using Copernic (see Appendix K).

Unique Events

            Despite User 2's unfamiliarity with both search tools, he was able to produce relevant results in a much shorter period than User 1, the more experienced user.

Summary

            Test results indicated that Copernic was the most effective and usable search tool. Task durations and errors were low while relevancy was high.

User 3

Profile

            User 3 was also a teenager, in junior high school (see Appendix H), with three years of Internet experience. He was familiar with both search tools.

Mid-Test Questionnaire Results

            The results of these questions (see Appendix F) indicated that as the tasks became more difficult the user remained satisfied with Copernic but became unsatisfied with HotBot.

Post-Test Questionnaire Results

            The post-test questionnaire results (see Appendix G) indicated that User 3 found HotBot's results were much less relevant than Copernic's results.

Think Aloud Observations

            User 3's think aloud observations confirmed the results of the questionnaires (see Appendix I).

Task Duration

            The task duration data for User 3 (see Appendix J) indicated that Copernic took three minutes less than HotBot (15 minutes for Copernic versus 18 minutes for HotBot).

User Errors

            The user produced five errors while using HotBot and one while using Copernic (see Appendix K).

Unique Events

            Neither tool produced satisfactory results on the most difficult task. In addition, User 3 was the only user to take advantage of Copernic's category search capability (e.g. searching using "recipe" specific search engines).

Summary

            Test results indicated that Copernic was the most effective and usable search tool. Task durations and errors were low while relevancy was high.

Findings

            Experience and age played little part in a user's ability to successfully accomplish the search tasks. In addition, Copernic consistently proved that it was the most effective and usable search tool. This became obvious as the search tasks became more difficult. When using Copernic, the error rates and duration times were lower and result relevancy was higher. However, HotBot's user interface was rated higher because of its simplicity.

 


 

Chapter 5

Summary

 

The following sections summarize and conclude this project paper. The first section gives recommendations to improve the search tools based upon the findings just presented. This is followed by a brief conclusion.

Recommendations

Copernic

While Copernic was the obvious winner in this usability face-off, all the users (both novice and experienced) felt that Copernic would be more usable if it were simplified. HotBot excelled in this area. It displayed only a simple search box until the user requested its advanced search capabilities. The popular assumption that the more features a product has, the better it will be, is not true (Berkun, 1999, July/August). Features only improve a product if they are actually used by the user. In most cases, the proliferation of features in products like Copernic creates more complexity than value.

            This report recommends that Copernic eliminate all but the basic search functionality from its standard interface. All other functionality should be made optional for the advanced user that requires it.

HotBot

            The test results confirmed the limitations of HotBot and other Internet search engines (Lawrence & Giles, 1999). Although HotBot is the highest-rated tool in this classification, the relevancy of its results paled when compared to those of Copernic. HotBot has tried to address this limitation by incorporating Direct Hit technology (Lake, 1997). However, it needs to go further and explore the use of other advanced search engine technologies.

            This report recommends that HotBot explore the use of additional advanced search technologies in its search engine. Advanced algorithms (e.g. PageRank), when coupled with Direct Hit, would increase both the effectiveness and usability of the tool.

Conclusion

            In the future, new and powerful search tools will continue to be developed. The most effective and usable of these tools will be simple, fast, and produce relevant results.

As the tools, by necessity, become more complex, their user interfaces must become even simpler.


 

Appendix A

 

HotBot User Interface Screen Print

 

 

 


 

Appendix B

 

HotBot Search Results Screen Print

 

 


 

Appendix C

 

Copernic User Interface and Search Results Screen Print

 

 

 

Appendix D

 

Usability Study Instructions and Mid-Test Questionnaires

 

Overview

 

            The purpose of this usability study is to determine which of two Internet search tools is the most usable and effective. The performance of the applications is being evaluated – not your abilities as a computer user or researcher. The applications that you will be using during the next ten to twenty minutes are HotBot and Copernic.

 

HotBot is an Internet-based search engine that does not require any additional software loaded on your PC. It relies solely on your PC’s Web browser to access the HotBot search engine. HotBot is consistently ranked as one of the top Internet search engines. Information found is current and accurate since HotBot reindexes its approximately 54 million pages every two weeks.

 

Copernic, on the other hand, is a search utility that is loaded on your PC. Copernic has the ability to send queries to multiple search engines (including HotBot). In addition, it is able to sort results in various ways, such as by URL, page title, or search engine. Copernic also offers specialty searches. Music, movies, jobs, recipes, and sports are some of the categories offered.

 

Informed Consent

 

            Please sign and date in the section provided below to indicate your informed consent to be a participant in this usability study. Your signature indicates that you were provided the information required to make an effective decision and that your participation is voluntary and not coerced.

 

            Signature:  _________________________     Date:  _______________

 

Pre-Study Familiarization

 

            Before beginning the study, take the next five to ten minutes to become familiar with HotBot and Copernic. The study monitor will now give a brief demonstration of the user interfaces and features of each application. Also, throughout the testing process, the study monitor will be encouraging you to describe your thought processes and reasoning by talking aloud.

 


Tasks One to Six  (Begin usability study . . . )

 

Task One – Using HotBot, locate the recipe for the Orange Julius drink.

 

When you feel you have located the most relevant answer to this question, please save the Web page to disk using Internet Explorer’s file save as command and answer the following questions:

 

a. How satisfied were you with the presentation format of the results list?

    Very Satisfied / Satisfied / No Opinion / Unsatisfied / Very Unsatisfied

b. Would you have preferred a different format of the results list?

    Yes / No

c. How would you rate the relevance of the sources retrieved?

    Very Relevant / Relevant / No Opinion / Less Relevant / Not Relevant

d. How would you judge the overall performance of the search engine?

    Excellent / Good / Satisfying / Poor / Unusable

 

 

Task Two – Using Copernic, locate the recipe for the Orange Julius drink.

 

When you feel you have located the most relevant answer to this question, please save the Web page to disk using Internet Explorer’s file save as command and answer the following questions:

 

a. How satisfied were you with the presentation format of the results list?

    Very Satisfied / Satisfied / No Opinion / Unsatisfied / Very Unsatisfied

b. Would you have preferred a different format of the results list?

    Yes / No

c. How would you rate the relevance of the sources retrieved?

    Very Relevant / Relevant / No Opinion / Less Relevant / Not Relevant

d. How would you judge the overall performance of the search engine?

    Excellent / Good / Satisfying / Poor / Unusable

 

 


Task Three – Using HotBot, locate the recipe for the salad dressing at the Olive Garden restaurant.

 

When you feel you have located the most relevant answer to this question, please save the Web page to disk using Internet Explorer’s file save as command and answer the following questions:

 

a. How satisfied were you with the presentation format of the results list?

    Very Satisfied / Satisfied / No Opinion / Unsatisfied / Very Unsatisfied

b. Would you have preferred a different format of the results list?

    Yes / No

c. How would you rate the relevance of the sources retrieved?

    Very Relevant / Relevant / No Opinion / Less Relevant / Not Relevant

d. How would you judge the overall performance of the search engine?

    Excellent / Good / Satisfying / Poor / Unusable

 

 

Task Four – Using Copernic, locate the recipe for the salad dressing at the Olive Garden restaurant.

 

When you feel you have located the most relevant answer to this question, please save the Web page to disk using Internet Explorer’s file save as command and answer the following questions:

 

a. How satisfied were you with the presentation format of the results list?

    Very Satisfied / Satisfied / No Opinion / Unsatisfied / Very Unsatisfied

b. Would you have preferred a different format of the results list?

    Yes / No

c. How would you rate the relevance of the sources retrieved?

    Very Relevant / Relevant / No Opinion / Less Relevant / Not Relevant

d. How would you judge the overall performance of the search engine?

    Excellent / Good / Satisfying / Poor / Unusable

 

 


Task Five – Using HotBot, locate the answer to the search topic of your choice.

 

When you feel you have located the most relevant answer to this question, please save the Web page to disk using Internet Explorer’s file save as command and answer the following questions:

 

Search Topic:

 

 

a. How satisfied were you with the presentation format of the results list?

    Very Satisfied / Satisfied / No Opinion / Unsatisfied / Very Unsatisfied

b. Would you have preferred a different format of the results list?

    Yes / No

c. How would you rate the relevance of the sources retrieved?

    Very Relevant / Relevant / No Opinion / Less Relevant / Not Relevant

d. How would you judge the overall performance of the search engine?

    Excellent / Good / Satisfying / Poor / Unusable

 

 

Task Six – Using Copernic, locate the answer to the same topic selected in task five.

 

When you feel you have located the most relevant answer to this question, please save the Web page to disk using Internet Explorer’s file save as command and answer the following questions:

 

a. How satisfied were you with the presentation format of the results list?

    Very Satisfied / Satisfied / No Opinion / Unsatisfied / Very Unsatisfied

b. Would you have preferred a different format of the results list?

    Yes / No

c. How would you rate the relevance of the sources retrieved?

    Very Relevant / Relevant / No Opinion / Less Relevant / Not Relevant

d. How would you judge the overall performance of the search engine?

    Excellent / Good / Satisfying / Poor / Unusable

 

 


 

Appendix E

 

Post-Test Questionnaire

 

 

1. How long have you used the Internet?

    1-6 months / 6 months to 1 year / 1 to 3 years / 3 to 5 years / Over 5 years

 

 

2. Which search application presented the most relevant results?

    HotBot / Copernic

3. Which search application was the easiest to use?

    HotBot / Copernic

    Why?

4. Which search application would you rate the best?

    HotBot / Copernic

    Why?

 

 

 

5. Which features of HotBot did you like?

 

 


6. Which features of HotBot did you dislike?

 

 

7. Which features of Copernic did you like?

 

 

8. Which features of Copernic did you dislike?

 

 

9. Please rate your agreement with the statement – “When encountering errors, the site provided good error messages.”

 

  a. HotBot:  Completely Agree (1) / Agree (2) / Neutral (3) / Somewhat Disagree (4) / Completely Disagree (5)

  b. Copernic:  Completely Agree (1) / Agree (2) / Neutral (3) / Somewhat Disagree (4) / Completely Disagree (5)

 

 

10. Please rate your agreement with the statement – “The site did not require an extensive use of my memory, I was able to recognize and did not need to recall.”

 

  a. HotBot:  Completely Agree (1) / Agree (2) / Neutral (3) / Somewhat Disagree (4) / Completely Disagree (5)

  b. Copernic:  Completely Agree (1) / Agree (2) / Neutral (3) / Somewhat Disagree (4) / Completely Disagree (5)

 

 

11. How would you rate each application’s flexibility and efficiency of use?

 

  a. HotBot:  Poor (1) / Fair (2) / Good (3) / Above Average (4) / Excellent (5)

  b. Copernic:  Poor (1) / Fair (2) / Good (3) / Above Average (4) / Excellent (5)

 

 

12. How much extraneous information was given in each application?

 

  a. HotBot:  None (1) / Very Little (2) / More Than a Little (3) / A Lot (4) / Way Too Much (5)

  b. Copernic:  None (1) / Very Little (2) / More Than a Little (3) / A Lot (4) / Way Too Much (5)

 

 

13. Overall, how would you rate each application's usability?

 

  a. HotBot:  Poor (1) / Fair (2) / Good (3) / Above Average (4) / Excellent (5)

  b. Copernic:  Poor (1) / Fair (2) / Good (3) / Above Average (4) / Excellent (5)

 


14. Please rate your agreement with the statement – “I felt in control while navigating through the application and was able to easily recover when I went to the wrong page.”

 

  a. HotBot:  Completely Agree (1) / Agree (2) / Neutral (3) / Somewhat Disagree (4) / Completely Disagree (5)

  b. Copernic:  Completely Agree (1) / Agree (2) / Neutral (3) / Somewhat Disagree (4) / Completely Disagree (5)

 

 

15. Please rate your agreement with the statement – “The words, phrases, and concepts presented on the application were familiar. In other words, the application spoke my language.”

 

  a. HotBot:  Completely Agree (1) / Agree (2) / Neutral (3) / Somewhat Disagree (4) / Completely Disagree (5)

  b. Copernic:  Completely Agree (1) / Agree (2) / Neutral (3) / Somewhat Disagree (4) / Completely Disagree (5)

 

 

16. While navigating the site, how often did you know exactly where you were?

 

  a. HotBot:  Never (1) / Hardly Ever (2) / Some of the Time (3) / Most of the Time (4) / All of the Time (5)

  b. Copernic:  Never (1) / Hardly Ever (2) / Some of the Time (3) / Most of the Time (4) / All of the Time (5)

 


 

Appendix F

 

Mid-Test Questionnaire Results

 

 

Questions | Pilot User | User 1 | User 2 | User 3

Task One: HotBot & Orange Julius
a. Satisfaction with the presentation of the results list. | Satisfied | Very Unsatisfied | Very Satisfied | Very Satisfied
b. Would you have preferred a different results format? | No | Yes | No | No
c. How relevant were the results? | Very Relevant | Not Relevant | Relevant | Relevant
d. Overall performance of the search tool. | Excellent | Poor | Good | Good

Task Two: Copernic & Orange Julius
a. Satisfaction with the presentation of the results list. | Very Satisfied | Very Satisfied | Satisfied | Satisfied
b. Would you have preferred a different results format? | No | No | No | No
c. How relevant were the results? | Very Relevant | Relevant | Very Relevant | Relevant
d. Overall performance of the search tool. | Excellent | Excellent | Good | Good

 


 

Questions | Pilot User | User 1 | User 2 | User 3

Task Three: HotBot & Olive Garden Dressing
a. Satisfaction with the presentation of the results list. | Very Unsatisfied | Very Unsatisfied | Unsatisfied | Very Unsatisfied
b. Would you have preferred a different results format? | Yes | Yes | No | Yes
c. How relevant were the results? | Not Relevant | Less Relevant | Less Relevant | Less Relevant
d. Overall performance of the search tool. | Unusable | Unusable | Poor | Unusable

Task Four: Copernic & Olive Garden Dressing
a. Satisfaction with the presentation of the results list. | Very Satisfied | Very Satisfied | Very Satisfied | Satisfied
b. Would you have preferred a different results format? | No | No | No | No
c. How relevant were the results? | Very Relevant | Very Relevant | Very Relevant | Very Relevant
d. Overall performance of the search tool. | Excellent | Excellent | Excellent | Good

 


 

Questions | Pilot User | User 1 | User 2 | User 3

Task Five: HotBot & | Coconut Macaroons | Vacuum Cleaners | Microsoft Precision Pro Joystick | Songs on the Limp Bizkit CD
a. Satisfaction with the presentation of the results list. | Very Unsatisfied | No Opinion | Very Unsatisfied | Very Unsatisfied
b. Would you have preferred a different results format? | Yes | Yes | No | No
c. How relevant were the results? | Not Relevant | Relevant | Not Relevant | Not Relevant
d. Overall performance of the search tool. | Poor | Satisfying | Unusable | Unusable

Task Six: Copernic & | | | |
a. Satisfaction with the presentation of the results list. | Satisfied | Very Satisfied | Very Satisfied | Very Unsatisfied
b. Would you have preferred a different results format? | No | No | No | No
c. How relevant were the results? | Relevant | Very Relevant | Very Relevant | Not Relevant
d. Overall performance of the search tool. | Good | Excellent | Excellent | Unusable

 

 


 

Appendix G

 

Post-Test Questionnaire Results

 

 

Question | Pilot User | User 1 | User 2 | User 3

1. How long have you used the Internet? (years) | 5 | 4 | 3 | 3
2. Which search application presented the most relevant results? | Copernic | Copernic | Copernic | Copernic
3. Which search application was the easiest to use? Why? | Both - They were both easy to use. Type in the words and hit search | Copernic - It automates the search of several engines at once | HotBot - faster and didn't have to click Web each time | Copernic - Best ranked results
4. Which search application would you rate the best? Why? | Copernic - Found the most relevant data | Copernic - Overall results were more relevant, also easy to recall previously used searches and modify old ones | Copernic - Very relevant results | Copernic - Best ranked results
5. Which features of HotBot did you like? | The easy way to enter words and click search | Nothing! | It was faster and more user friendly | It sometimes gave the information
6. Which features of HotBot did you dislike? | The order of results and the use of Boolean AND | I didn't find anything | Only one search engine and less relevant results | It showed things I did not want to know
7. Which features of Copernic did you like? | Organization of results - most relevant up top | All - Especially the browse feature for navigating | Many search engines and very relevant results | Best ranked results


 

Question | Pilot User | User 1 | User 2 | User 3

8. Which features of Copernic did you dislike? | It takes too long to search through the many search engines | Speed | A little slow and more complex for new users | Almost none
9a. HotBot provided good error messaging. | Most of the time | Disagree | Neutral | Neutral
9b. Copernic provided good error messaging. | Agree | Completely | Neutral | Somewhat agree
10a. HotBot did not require an extensive use of memory. | Neutral | Agree | Neutral | Neutral
10b. Copernic did not require an extensive use of memory. | Completely agree | Completely agree | Neutral | Neutral
11a. Rate HotBot's flexibility and efficiency. | Fair | Fair | Above average | Above average
11b. Rate Copernic's flexibility and efficiency. | Above average | Excellent | Excellent | Excellent
12a. How much extraneous info did HotBot give? | Way too much | Way too much | More than a little | More than a little
12b. How much extraneous info did Copernic give? | Very little | None | Very little | Very little
13a. Rate HotBot's usability. | Fair | Poor | Above average | Fair
13b. Rate Copernic's usability. | Above average | Excellent | Excellent | Above average
14a. I felt in control with HotBot. | Agree | Disagree | Neutral | Agree
14b. I felt in control with Copernic. | Agree | Completely agree | Neutral | Neutral
15a. HotBot felt familiar. | Disagree | Disagree | Agree | Neutral
15b. Copernic felt familiar. | Agree | Completely agree | Agree | Neutral
16a. I knew where I was at with HotBot. | Most of the time | Some of the time | Most of the time | Some of the time
16b. I knew where I was at with Copernic. | Most of the time | All of the time | All of the time | Most of the time
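
As an illustration only, the responses to question 13 can be coded with the 1-to-5 scale printed in Appendix E and averaged per tool. The sketch below was not part of the original analysis. The names USABILITY_SCALE, RATINGS, and mean_score are hypothetical, the pilot user is left out because those results were not used in the comparison, and a mean over three ordinal responses is only a rough summary.

```python
# Scale for question 13, as printed in the Post-Test Questionnaire (Appendix E).
USABILITY_SCALE = {
    "Poor": 1,
    "Fair": 2,
    "Good": 3,
    "Above average": 4,
    "Excellent": 5,
}

# Responses to questions 13a and 13b, copied from the results above.
# The pilot user is excluded (assumption: consistent with Chapter 4).
RATINGS = {
    "HotBot": {"User 1": "Poor", "User 2": "Above average", "User 3": "Fair"},
    "Copernic": {"User 1": "Excellent", "User 2": "Excellent", "User 3": "Above average"},
}


def mean_score(responses):
    """Convert each label to its numeric code and return the average."""
    scores = [USABILITY_SCALE[label] for label in responses.values()]
    return sum(scores) / len(scores)


for tool, responses in RATINGS.items():
    print(f"{tool}: mean usability rating {mean_score(responses):.2f}")
```

Under these assumptions the coded ratings sum to 7 for HotBot and 14 for Copernic across the three users.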


 

Appendix H

 

User Profiles

 

 

 

Characteristic | Pilot User | User 1 | User 2 | User 3

Age | 27 | 37 | 16 | 14
Sex | F | F | M | M
Education (yrs.) | 18 | 12 | 10 | 8
Internet Experience (yrs.) | 5 | 4 | 3 | 3
Familiar with HotBot | Y | N | N | Y
Familiar with Copernic | Y | Y | N | Y

 

 


 

Appendix I

 

Think Aloud Observations

 

 

 

Task | Pilot User | User 1 | User 2 | User 3

Task 1 | Simple to use - quickly found results | Have used HotBot before and liked it | Very pleased with HotBot's speed and results | Not very effective
Task 2 | Copernic taking way too long | Like the advanced browser view | Copernic is about the same as HotBot | Like the recipe related search category
Task 3 | Linking out of HotBot to find relevant info | Unable to find anything - very frustrated | HotBot is much faster but results are not as good | Not worth using
Task 4 | Copernic better with bad Internet connection | Like the forward and backward arrows for viewing results | Copernic did much better this time | Copernic is much better
Task 5 | Kept giving the wrong results - very frustrating | Leaving HotBot to find info | Unable to find anything | This is taking too long - I give up
Task 6 | Found what I wanted in spite of misspelling | Difficult search topic - unable to find info | Easily found the info | Unable to find info - topic is difficult
General | User was frustrated by unreliable Internet connection | Using HotBot was less efficient than Copernic | Copernic takes longer but gives better results | Overall Copernic is more effective and wastes less time

 

 


 

Appendix J

 

Task Duration

 

 

(minutes)           | Pilot User | User 1 | User 2 | User 3

Task 1              |  1 |  5 |  2 |  3
Task 2              |  7 |  3 |  2 |  2
Task 3              |  7 |  8 |  3 |  6
Task 4              |  3 |  4 |  2 |  3
Task 5              |  3 |  4 |  2 |  9
Task 6              |  2 |  5 |  4 | 10
Total Time HotBot   | 11 | 17 |  7 | 18
Total Time Copernic | 12 | 12 |  8 | 15
Total Time          | 23 | 29 | 15 | 33
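
The per-tool totals above follow from the task order in Appendix D, where the odd-numbered tasks used HotBot and the even-numbered tasks used Copernic. A minimal sketch of that arithmetic is given below for readers who wish to recheck it. The names DURATIONS and totals_by_tool are hypothetical and were not part of the study materials.

```python
# Task durations in minutes for Tasks 1-6, copied from the table above.
DURATIONS = {
    "Pilot User": [1, 7, 7, 3, 3, 2],
    "User 1": [5, 3, 8, 4, 4, 5],
    "User 2": [2, 2, 3, 2, 2, 4],
    "User 3": [3, 2, 6, 3, 9, 10],
}


def totals_by_tool(per_task):
    """Return (hotbot_total, copernic_total) for one participant."""
    hotbot = sum(per_task[0::2])    # Tasks 1, 3, and 5 used HotBot
    copernic = sum(per_task[1::2])  # Tasks 2, 4, and 6 used Copernic
    return hotbot, copernic


for user, per_task in DURATIONS.items():
    hotbot, copernic = totals_by_tool(per_task)
    print(f"{user}: HotBot {hotbot} min, Copernic {copernic} min, "
          f"total {hotbot + copernic} min")
```

For User 1, for example, this yields 17 minutes for HotBot and 12 minutes for Copernic, matching the totals reported in Chapter 4.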

 


 

Appendix K

 

User Errors

 

 

 

Task                  | Pilot User | User 1 | User 2 | User 3

Task 1                | 0 |  4 | 0 | 1
Task 2                | 0 |  2 | 0 | 0
Task 3                | 3 |  1 | 1 | 1
Task 4                | 0 |  0 | 0 | 0
Task 5                | 2 |  2 | 1 | 3
Task 6                | 0 |  1 | 0 | 1
Total Errors HotBot   | 5 |  7 | 2 | 5
Total Errors Copernic | 0 |  3 | 0 | 1
Total Errors          | 5 | 10 | 2 | 6
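
Chapter 4 states that User 1's error rate with Copernic was less than half the rate with HotBot. The sketch below shows one way to check that statement against the table, taking the ratio of Copernic errors to HotBot errors as the measure (an assumption, since the paper does not define "error rate" formally). The names ERRORS and copernic_to_hotbot_ratio are hypothetical.

```python
from fractions import Fraction

# Error counts for Tasks 1-6, copied from the table above.
ERRORS = {
    "Pilot User": [0, 0, 3, 0, 2, 0],
    "User 1": [4, 2, 1, 0, 2, 1],
    "User 2": [0, 0, 1, 0, 1, 0],
    "User 3": [1, 0, 1, 0, 3, 1],
}


def copernic_to_hotbot_ratio(per_task):
    """Return Copernic errors divided by HotBot errors, or None if HotBot had none."""
    hotbot = sum(per_task[0::2])    # Tasks 1, 3, and 5 used HotBot
    copernic = sum(per_task[1::2])  # Tasks 2, 4, and 6 used Copernic
    if hotbot == 0:
        return None  # the ratio is undefined without any HotBot errors
    return Fraction(copernic, hotbot)


for user, per_task in ERRORS.items():
    ratio = copernic_to_hotbot_ratio(per_task)
    below_half = ratio is not None and ratio < Fraction(1, 2)
    print(f"{user}: Copernic/HotBot error ratio {ratio}, below one half: {below_half}")
```

For User 1 the ratio is 3/7, which is below one half.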

 


 

Reference List

 

 

Adali, S., Bufi, C., & Temtanapat, Y. (1997). Integrated search engine. 1997 IEEE Knowledge and Data Engineering Exchange Workshop, IEEE, New York,  pp. 140-147.

 

Barker, J. (1999). Metasearch engines. University of California, Berkeley [Online]. Available: http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/MetaSearch.html [1999, December 30].

 

Berkun, S. (1999, July/August). The importance of simplicity [Online]. Available: http://msdn.microsoft.com/library/welcome/dsmsdn/humanfactor8_4.htm [1999, October 30].

 

Branaghan, R. (1999). Testing, one -- two -- three: Fundamentals of usability testing. Fitch [Online]. Available: http://www.branaghan.com/fun_utesting.htm [1999, December 1].

 

Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual Web search engine. 7th International WWW Conference, Brisbane, Australia.

 

Copernic. (1999). Copernic 99 [Online]. Available: http://www.copernic.com [1999, November 7].

 

Dreilinger, D., & Howe, A. (1997, July). Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15(3), 195-222.

 

Jordan, P. (1998). An introduction to usability. Bristol, Pennsylvania: Taylor & Francis.

 

Lake, M. (1997). 2nd annual search engine shoot-out [Online]. Available: http://www.zdnet.com/pccomp/features/excl0997/sear/sear.html [1999, December 21].

 

Lawrence, S., & Giles, L. (1999, January). Searching the Web: General and scientific information access. IEEE Communications Magazine.

 

Nielsen, J. (1994). Heuristic evaluation. In J. Nielsen & R. Mack (Eds.), Usability inspection methods. New York, New York: John Wiley & Sons.

 

Nielsen, J. (1993). Usability engineering. San Francisco, CA: Morgan Kaufmann Publishers.

 

Rubin, J. (1994). Handbook of usability testing. New York: John Wiley & Sons, Inc.

 

Selberg, E., & Etzioni, O. (1995). Multi-service search and comparison using the MetaCrawler. 4th International WWW Conference, Boston, Massachusetts.

 

Shachtman, N. (1999, December 6). Tools for Web searches get a new focus. InformationWeek, 118-124.

 

Sullivan, D. (1998, September 2). Search utilities go beyond metasearch. The Search Engine Report.