Online Databases: Spying on Search Strategies

By Carol Tenopir -- Library Journal, 05/01/2004

Only the most dedicated supersearchers are motivated to learn and control command systems, like DialogClassic, that rely on the user to input complex search strategies. Infrequent searchers and most end users choose interfaces that do some of the work for them and make the search process appear easy. However, the easier a good interface seems to be, the more complex the system underlying it must be.

Google is popular not only because of its simple dialog-box interface but because users are typically satisfied with their results. Google takes simple input, adds variant word endings, and checks spelling, then ranks sites with a mix of word frequency, word occurrence patterns, and a kind of popularity measure (factoring in the number of links that retrieved sites receive). Still, the exact strategy within a Google search remains hidden, and it relies on automatic ways to enhance searches. Sometimes human input, behind the scenes, is the best way to build good searches.

Underneath Dialog

Complexity isn't limited to free web search. Commercial online systems that rely on Boolean logic may also put power behind the end user interface, relying both on automatic construction of a Boolean query and on a predetermined set of human-generated best search strategies. Dialog's "spy" feature lets expert searchers see the complex search string that lies underneath searches executed in the easy-to-use form fill-in interfaces of Dialog1 and DialogSelect.

Dialog1 provides a menu of types of searches based on common corporate information needs. For example, on the opening screen, users can select from among nine "channels" of information, e.g., Biotech, Business Intelligence, Pharmaceuticals, and so on. Clicking on one of these channels takes the user to another menu list to refine the search process further. Under Pharmaceuticals, for example, seven options provide choices by form or format, such as Drug Pipeline Summaries, Industry Newsletters, Research Literature, etc. Picking Research Literature (or any other) provides additional choices—by drug and year restrictions, or by therapeutic indication and type of literature, among other areas.

Under the Business Intelligence channel, another possible path takes you to industry and then market research. The choices made at each step trigger the system to select the databases that will be searched and also impose some automatic limiting features, such as date ranging, restricting to document type, and so on. Further refinements are added when the user reaches a specific form fill-in.

In market research, the user can pick an industry from a controlled list; specify a free-text main topic or a topic to be searched in the entire text; or add a company name, publisher name, and country or date range. After filling in the form and hitting return, the simple system seems magically to run a search and return a list of relevant documents.
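One way to picture the menu-driven flow described above is as a lookup from a sequence of menu choices to a set of databases plus automatic limits, which is then merged with whatever the user types into the form. The following is only a minimal sketch under stated assumptions: the `SearchPlan` type, the `PLANS` table, the database names, and the limit values are all hypothetical placeholders, not Dialog's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchPlan:
    """What a menu path decides before the user sees a form."""
    databases: tuple  # placeholder database names, not real Dialog files
    limits: dict      # automatic restrictions applied to every search


# Hypothetical lookup table keyed by the sequence of menu choices.
PLANS = {
    ("Business Intelligence", "Industry", "Market Research"): SearchPlan(
        databases=("market_db_1", "market_db_2"),
        limits={"document_type": "market research report"},
    ),
    ("Pharmaceuticals", "Research Literature"): SearchPlan(
        databases=("pharma_db_1",),
        limits={"document_type": "journal article"},
    ),
}


def plan_for(path):
    """Return the plan for a menu path, or raise if the path is unknown."""
    try:
        return PLANS[tuple(path)]
    except KeyError:
        raise ValueError(f"no search plan for menu path: {path!r}") from None


def describe(plan, form_values):
    """Combine automatic limits with the non-empty fields the user filled in."""
    filled = {key: value for key, value in form_values.items() if value}
    return {"databases": list(plan.databases),
            "criteria": {**plan.limits, **filled}}


plan = plan_for(["Business Intelligence", "Industry", "Market Research"])
request = describe(plan, {"industry": "Publishing and Media", "company": ""})
print(request)
```

The point of the sketch is the division of labor: the menu path fixes the databases and limits, and the form only contributes the fields the user actually filled in.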

Human work underneath

Underneath that magic lies a lot of human expertise. Both Dialog1 and DialogSelect are targeted to end users, but corporate librarians make sure the products work before rolling them out. According to Eddie Watkins, director of product development for Dialog, these end user products have started almost a "cottage industry of folks who help us develop scripts." Patent searches, for example, are scripted to replicate the search terms and techniques that customers used routinely. When these corporate librarians provide a searching tool to their end users, it should retrieve the same results the librarians would get if they were searching on behalf of those users.

Search scripts choose the best databases; add truncation and synonyms; limit searches to specific fields such as trademark name, industry name, or descriptors; and provide pick lists for controlled vocabulary fields. Sometimes the scripts address the limitations in databases. In the pharmaceuticals channel, drug names are matched against drug name indexes to make sure all forms of the name are added. For the novice, this process is quick and easy. But as an expert searcher, I want to know what the system really did.
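The kind of work a search script does can be sketched in a few lines. The example below is an illustration only, not Dialog's scripting language: the `NAME_INDEX` table, the helper names, and the `field=expression` notation are assumptions made for the sketch. The `?` truncation symbol follows the truncated term quoted in the spy report later in this column.

```python
# Illustrative name index; a real script would consult the database's own index.
NAME_INDEX = {
    "aspirin": ["aspirin", "acetylsalicylic acid", "ASA"],
}


def all_forms(name, index):
    """Return every indexed form of a name, or just the name if not indexed."""
    return index.get(name.lower(), [name])


def truncate(stem, symbol="?"):
    """Mark a word stem so that variant endings also match."""
    return stem + symbol


def or_group(terms):
    """Join alternatives with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def limit_to_field(expression, field_name):
    """Restrict an expression to one field, in a made-up field=expr notation."""
    return f"{field_name}={expression}"


# The user types one drug name and one topic; the script does the rest.
drug_clause = limit_to_field(or_group(all_forms("Aspirin", NAME_INDEX)), "drug_name")
topic_clause = truncate("clinical trial")
query = drug_clause + " AND " + topic_clause
print(query)
# drug_name=(aspirin OR acetylsalicylic acid OR ASA) AND clinical trial?
```

A novice enters a single name, while the script quietly supplies the synonyms, truncation, and field restriction an expert would have typed by hand.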

Spying

After running a search on recent market research information on the U.S. publishing industry, I scrolled all the way to the end of the titles list URL in the browser address bar and entered "&spy." A "spy report" reveals that my simple menu and pick list choices invoked a fairly sophisticated process. The system selected a dozen databases to search together; searched for the controlled term "publishing 'and' media" in the industry field; searched for united state? OR US OR USA in the country name field; limited the search to publication years 2003–2004; and used the Boolean AND to put them all together. Duplicates were removed, the final set was sorted by publication date, and both title and full formats were made available. It was the good search strategy of an experienced Dialog searcher.
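The logic of the strategy in that spy report can be reconstructed roughly as follows. This is a hedged sketch of the Boolean structure only: the field names (`industry`, `country`, `year`) and the `field=expression` notation are invented for illustration and are not the syntax the spy report actually displays.

```python
def or_group(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def field_clause(field_name, expression):
    """Restrict an expression to one field, in a made-up field=expr notation."""
    return f"{field_name}={expression}"


def build_strategy(industry_term, country_terms, years):
    """AND together one clause per form field, as the spy report describes."""
    clauses = [
        # Quoting the controlled term keeps its "and" from acting as an operator.
        field_clause("industry", f'"{industry_term}"'),
        field_clause("country", or_group(country_terms)),
        field_clause("year", or_group(str(year) for year in years)),
    ]
    return " AND ".join(clauses)


strategy = build_strategy(
    "publishing and media",
    ["united state?", "US", "USA"],
    range(2003, 2005),  # publication years 2003 and 2004
)
print(strategy)
# industry="publishing and media" AND country=(united state? OR US OR USA) AND year=(2003 OR 2004)
```

Alternatives within a field are joined with OR, and the fields themselves are joined with AND, which is the pattern an experienced searcher would build by hand.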

Seeing how expert searchers design a search is a great learning tool. It can also be wonderful for troubleshooting when the results don't meet your needs. Students can try their skills at running a topic in DialogClassic, then compare it to what the experts do in Dialog1 or DialogSelect. Seemingly simple searches, it turns out, can actually be quite complex. It is up to expert searchers to understand the complexities and ensure that these easy-to-use systems work in the best way.


Author Information
Carol Tenopir (ctenopir@utk.edu) is Professor at the School of Information Sciences, University of Tennessee, Knoxville.





 
