Subject search


Web of Science has introduced a feature called Refine your results, which is displayed at the top of each search result. See this screenshot:

There are several options for refining, for example Institutions. A search for Haglin L as author gives 11 hits, and clicking Institutions shows 8 alternative addresses:

The problems are:

1) You won’t always see the name of the department in the overview, even if it’s in the record. You have to click the record to view it.

2) You will get all affiliations for all authors of all articles, not just for the author you searched for or the first author. Addresses are mixed together for all authors.
On the positive side, you get a better overview of differently spelled addresses for authors.
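
To see why the Institutions list mixes addresses, here is a minimal sketch in Python. The records, the co-author names and the addresses are all invented for illustration, and the function only imitates the behaviour described above; it is not how Web of Science is actually implemented.

```python
from collections import Counter

# Hypothetical records: each article lists every author with one address.
records = [
    {"authors": {"Haglin L": "Univ A, Dept 1", "Author B": "Univ X"}},
    {"authors": {"Haglin L": "Univ A", "Author C": "Univ Y", "Author D": "Univ X"}},
]

def refine_by_institution(records, searched_author):
    """Count the addresses of ALL authors on records matching the searched author."""
    counts = Counter()
    for rec in records:
        if searched_author in rec["authors"]:
            counts.update(rec["authors"].values())
    return counts

def addresses_of(records, author):
    """Count only the addresses that belong to the given author."""
    counts = Counter()
    for rec in records:
        if author in rec["authors"]:
            counts[rec["authors"][author]] += 1
    return counts

pooled = refine_by_institution(records, "Haglin L")
own = addresses_of(records, "Haglin L")
print(sorted(pooled))  # addresses of every co-author are included
print(sorted(own))     # only the two spellings used by the searched author
```

The pooled list contains the co-authors' addresses as well, which is the mixing described in point 2, while the second function shows the narrower list one might have expected.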

I don't recommend the refining option Subject categories as a subject search limit. It is built upon the subject categories of the journals in which the articles are published. Sometimes you can see an option for Concept Codes, Descriptors or Controlled Index. These are connected to content from the database BIOSIS and should not be used for multidisciplinary subject searching. These options are still in development, and we will see in the future what they can bring.

Conclusion: A better, faster overview, though the same problems remain with missing address/author information.

Google Scholar has released an option for searching Related Articles, similar to Related Articles in PubMed and other databases. Google Scholar does not use a thesaurus as PubMed does, but in advanced search you can limit to 7 broad subject areas. With the Related Articles option you can start from an article on your topic. Maybe you know the title or the author. When you find the article, click on the Related Articles link beneath it. Look at this article reference by Judit Bar-Ilan about search engine evaluation:

Clicking the Related Articles link returns 99 related articles and one is for example Greg Notess’ old (yes, 2000 is old in the search engine evaluation world!) article about Search engines inconsistencies from the magazine Online.

I also tried some other, smaller subjects within information science, and it returned remarkably relevant hits. It's a nice, satisfying option, but my brief evaluation doesn't say anything about coverage, of course. When trying to do a subject search in Google Scholar (which, as I said, is not easy to do comprehensively), try using Related Articles in addition to free-text searching.

It's hard to make an easy yet deep and thorough evaluation of subject coverage in Scopus, Web of Science and Google Scholar, for a lot of reasons, especially because the databases in question do not use established thesauri. Still, I made a small comparison between these multidisciplinary databases and PubMed.

I chose three MeSH terms (two of them including subheadings), each three words long. I limited my search to 1996 onwards, mainly because Scopus subject coverage before 1996 is selective. The MeSH terms were:

Hormone replacement therapy
Antifreeze proteins toxicity
Neonatal screening ethics
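
For readers who want to repeat the PubMed part of the comparison, here is a small sketch of how the three terms might be written in PubMed's field-tag syntax. The split into heading and subheading is my reading of the list above, the open-ended date range is an assumption about how the 1996 limit was applied, and the helper names are made up for this example. The URL builder targets the public NCBI E-utilities ESearch endpoint; nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def mesh_query(heading, subheading=None, start_year=1996):
    """Build a PubMed query for a MeSH heading, optionally with a subheading."""
    term = heading if subheading is None else f"{heading}/{subheading}"
    # 3000 is used as an open-ended upper bound for the publication date.
    return f'"{term}"[MeSH Terms] AND {start_year}:3000[dp]'

def esearch_url(query):
    """Return an ESearch URL that asks only for the number of hits."""
    params = {"db": "pubmed", "term": query, "rettype": "count"}
    return ESEARCH + "?" + urlencode(params)

queries = [
    mesh_query("Hormone Replacement Therapy"),
    mesh_query("Antifreeze Proteins", "toxicity"),
    mesh_query("Neonatal Screening", "ethics"),
]
for q in queries:
    print(q)
    print(esearch_url(q))
```

Hit counts obtained this way today will differ from the ones in the screenshots, since PubMed has grown since the comparison was made.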

Result from PubMed searching MeSH database:

Result from Scopus searching field keywords:

Result when broadening the search to title, abstract and keywords:

Result from Web of Science when searching Topic in General search, which includes title, abstract and keywords (author keywords and KeyWords Plus):

The results in Google Scholar are harder to evaluate, because Google Scholar indexes significant parts of the full text. It's possible to limit a search to the title, but not to the abstract, for example. A lot of the material Google Scholar indexes is retrieved from the open web; other material is journal article references (and full text) from publishers. Google Scholar has not integrated any thesauri for the article references, however. Instead it has 7 subject areas available for limiting in advanced search. As shown in this screenshot, one of the 7 subject areas is Medicine, Pharmacology and Veterinary Medicine. I limited the search to that subject area and the timespan 1996-.

2310 hits is definitely more than the other databases returned. As you see, the second hit is of high relevance, but for the other hits the word ethics was found in the full text as part of "ethics committee", so they are not necessarily relevant.

Screen shot of search on antifreeze proteins toxicity shows 60 hits:

Not all of these hits are relevant and some are hits from books.

Screen shot of search on hormone replacement therapy shows 26,200 hits:

Conclusion: I don't recommend using Web of Science, Scopus or Google Scholar for exhaustive, specific searches where all potentially important records of current science have to be found. This is because thesauri and controlled vocabulary are either not integrated at all or not integrated properly.

Broadening a subject search in Scopus from Keywords to Title, abstract and keywords gives higher recall, but not all of the retrieved records are relevant. To broaden a search, both Scopus and Google Scholar are recommended, but not Web of Science, which indexes less material from 1996.

Web of Science has no thesauri integrated in its database. Instead, in 1990 Eugene Garfield and ISI introduced something called KeyWords Plus, which Garfield explains here:

Essays of an Information Scientist: Journalology, KeyWords Plus, and other Essays, Vol:13, p.295, 1990 Current Contents, #32, p.3-7, August 6, 1990 [PDF]

Here's an explanation of how KeyWords Plus works:

"KeyWords Plus supplies additional search terms extracted from the titles of articles cited by authors in their bibliographies and footnotes".

"Records without references won't have KW+ – but more specifically, articles whose references are not linked to source items. In addition, it may be that those with very few linked references won't generate good candidates for KW+ either".

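The idea in these two quotes can be illustrated with a rough sketch in Python. This is not ISI's actual algorithm, which is not described here; it simply collects words and phrases that recur in the titles of cited articles. The cited titles, the stopword list and the thresholds are all invented for this example.

```python
import re
from collections import Counter

# A tiny, made-up stopword list; a real system would use a much longer one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def candidate_terms(cited_titles, min_count=2, max_words=3):
    """Return phrases that occur in at least min_count cited titles."""
    counts = Counter()
    for title in cited_titles:
        words = re.findall(r"[a-z]+", title.lower())
        seen = set()
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                # Skip phrases that start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                seen.add(" ".join(gram))
        counts.update(seen)  # count each phrase once per title
    return sorted(term for term, c in counts.items() if c >= min_count)

# Hypothetical titles from one article's reference list.
cited_titles = [
    "Hormone replacement therapy and bone density",
    "Risks of hormone replacement therapy",
    "Bone density in older women",
]
terms = candidate_terms(cited_titles)
print(terms)
print(candidate_terms([]))  # no linked references, no candidate terms
```

With an empty reference list the function returns nothing, which mirrors the remark that records without linked references won't get KeyWords Plus terms.
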
Here's an example of a record retrieved when searching for Hormone replacement therapy. Look at the KeyWords Plus field and you will find one hit on Hormone replacement therapy in bold.

To check whether this KeyWords Plus phrase is extracted from the reference list, click on Cited references and search for the phrase. In this screen shot you'll see that it is:

Besides KeyWords Plus, Web of Science also has author keywords.

Conclusion: I haven't evaluated KeyWords Plus yet, but since its terms are uncontrolled, it's impossible to make exhaustive, refined subject searches in Web of Science.

From Elsevier databases, Scopus has integrated thesauri such as GEOBASE Subject Index (geology, geography, earth and environmental science), EMTREE (life sciences, health), MeSH (life sciences, health), FLX terms and WTA terms (fluid sciences, textile sciences), Regional Index (geology, geography, earth and environmental science), Species Index (biology, life sciences) and Ei thesaurus (controlled and uncontrolled terms) (engineering, technology, physical sciences). As you see, the last one includes uncontrolled terms. Scopus also integrates author keywords, which are uncontrolled keywords supplied by the author of the article.

When searching the field Keywords in Scopus you won’t get just controlled vocabularies, you will also get uncontrolled vocabulary from Ei and author keywords.

Take this reference from PubMed:
Nicolau B, Marcenes W, Bartley M, Sheiham A.
Associations between socio-economic circumstances at two stages of life and
adolescents' oral health status.
J Public Health Dent. 2005 Winter;65(1):14-20.

In Scopus, this reference has neither the MeSH terms (from PubMed) nor the EMTREE terms (from Embase) integrated, just author keywords, as you see in this screen shot:

Another example of a Scopus record with no EMTREE terms (no MeSH headings exist yet because it's a PubMed in-process record):

And the same reference in Embase with EMTREE terms:

The following reference is from PubMed:

Anderson C.
Breast cancer. How not to publicize a misconduct finding.
Science. 1994 Mar 25;263(5154):1679.

See the MeSH terms in the screen shot. Terms marked with an asterisk (*) are major MeSH headings:

Not all major MeSH headings are included in the Scopus reference:

The Ei thesaurus, sometimes called the Compendex thesaurus, is not properly implemented either. In this screen shot you find a record from Scopus with no Ei thesaurus terms attached:

And here's a screen shot from the database Compendex showing the same record with Ei thesaurus terms attached:

When I tested and compared Compendex with Scopus, quite a lot of Scopus records lacked Ei thesaurus terms, but where the terms do exist on Scopus records, main headings, controlled terms and uncontrolled terms are all integrated properly.

Unfortunately, subject search in Scopus has a lot of disadvantages:

  1. It’s impossible to browse the keywords and thesauri integrated in Scopus.
  2. The thesauri are inconsistently integrated: sometimes no MeSH terms, sometimes no Emtree, sometimes not all major MeSH headings.
  3. Uncontrolled terms are mixed with controlled ones and cannot be separated when refining a search.
  4. You can’t choose which thesauri to use.
  5. No mapping of terms as in Embase and PubMed.
  6. No possibility to explode terms.
  7. No integration of MeSH subheadings.

Conclusion: This means Scopus is impossible to use for refined and comprehensive subject searches. You have to use PubMed to properly use MeSH terms, Embase to properly use Emtree and Compendex to properly use Ei terms. Of course, Scopus is not built to substitute the Elsevier databases. That's why I don't think Scopus will ever implement the thesauri of the Elsevier databases properly. But it is very strange that subject searching with MeSH terms is not properly implemented, given that PubMed is a free source.
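
For readers unfamiliar with the "explode" option in point 6, here is a minimal sketch in Python of what it does in databases such as PubMed and Embase: a search on one thesaurus term also retrieves records indexed with any of its narrower terms. The mini-hierarchy and the records are a toy example, not a copy of any real thesaurus tree, and the function names are made up.

```python
# Toy hierarchy: each term maps to its narrower terms (invented for illustration).
NARROWER = {
    "Therapy": ["Drug therapy", "Hormone therapy"],
    "Hormone therapy": ["Hormone replacement therapy"],
}

# Hypothetical records with their index terms.
RECORDS = [
    {"id": 1, "index_terms": ["Hormone therapy"]},
    {"id": 2, "index_terms": ["Hormone replacement therapy"]},
    {"id": 3, "index_terms": ["Drug therapy"]},
]

def explode(term, narrower):
    """Return the term plus all of its narrower terms, at every level."""
    result, stack = [], [term]
    while stack:
        current = stack.pop()
        if current not in result:
            result.append(current)
            stack.extend(narrower.get(current, []))
    return result

def search(records, terms):
    """Return ids of records indexed with at least one of the terms."""
    wanted = set(terms)
    return [r["id"] for r in records if wanted & set(r["index_terms"])]

plain = search(RECORDS, ["Hormone therapy"])
exploded = search(RECORDS, explode("Hormone therapy", NARROWER))
print(plain)     # only the record indexed with the term itself
print(exploded)  # also the record indexed with the narrower term
```

Without explode, the searcher has to know and type every narrower term by hand, which is one reason the missing feature matters for comprehensive searches.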