January 2006

A very common problem in citation searching is incorrectly cited references in journal articles, in most cases caused by careless authors and sometimes by careless journal editors. Librarians often meet such references in their daily work, and they can make retrieving an article difficult and time-consuming.

But it’s not just a problem when locating items. It’s an even worse problem for the citation databases that use these cited references to count the times cited for a journal article. That’s why you should use the Cited Reference Search when you want to find all cited references to a journal article in Web of Science, since it also lists the incorrectly cited variants.

We checked two incorrectly cited references in Web of Science and looked up the original citing article to see whether each was an author error or a database indexing error. Then we checked whether they had been corrected in Scopus. Here’s the first example, from a search on the author Aasa A in Web of Science:

The page number is incorrect in cited reference 2. It says page 644 but should be page 664. We checked the citing article as follows:

The same incorrect cited reference, as it appears in the reference list of the article indexed by Web of Science:

The original article itself has the incorrect page 644, which should be page 664, so this is an author error:

Looking up the same article in Scopus shows that the cited reference is correct there, so it must have been corrected by Scopus (or in the database from which Scopus collects the reference):

Looking up the author Elgh F via Cited Reference Search in Web of Science gives three incorrectly cited references with IN PRESS in front of the journal abbreviation:

We checked the two citing articles for one of the incorrectly cited references IN PRESS J VIROL MET:

Checking the cited references of the first citing journal article in Web of Science:

Then we checked the original article and found that the cited reference is written with the words “in press” after the journal title J. Virol. Methods. Web of Science should not have interpreted the in-press note as part of the journal title, and especially not placed it in front of the title.

So what about Scopus? It’s correct: “in press” follows the journal title as a separate element.
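The in-press problem above can be sketched in a few lines of Python. This is only an illustration of separating an in-press note from the journal abbreviation, wherever it appears in the string; the function name `split_in_press` is hypothetical, and nothing here describes how Web of Science or Scopus actually process references.

```python
import re

def split_in_press(journal_field):
    """Return (journal, in_press) for a cited-work string.

    Hypothetical helper: removes an "in press" note found anywhere in the
    field and reports whether one was present.
    """
    cleaned = re.sub(r"\bIN\s+PRESS\b", "", journal_field, flags=re.IGNORECASE)
    in_press = cleaned != journal_field
    # Collapse leftover whitespace and drop dangling separators.
    cleaned = " ".join(cleaned.split()).strip(" ,;")
    return cleaned, in_press

print(split_in_press("IN PRESS J VIROL MET"))
print(split_in_press("J. Virol. Methods, in press"))
```

With a step like this, the variant IN PRESS J VIROL MET would be filed under the same journal abbreviation as the correctly cited references.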

As you may also have seen, Scopus has 6 cites to the Aasa U article and 33 cites to the Elgh F article. That is comparable to Web of Science, with 4 cites (5 if you include the incorrect one) to the Aasa U article and 33 cites to the Elgh F article.
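To show how an incorrect page number splits a citation count, here is a minimal Python sketch. Only the page numbers (664 and 644) and the counts (4 and 1) come from the example above; the journal, volume and year are placeholders, and the matching rule (same author, journal, volume and year) is an assumption for illustration, not the method of any citation database.

```python
from collections import defaultdict

# Cited-reference variants as a citation index might list them.
# "J EXAMPLE", volume "0" and year "0000" are placeholders, not real data.
variants = [
    {"author": "AASA U", "journal": "J EXAMPLE", "volume": "0",
     "page": "664", "year": "0000", "times_cited": 4},
    {"author": "AASA U", "journal": "J EXAMPLE", "volume": "0",
     "page": "644", "year": "0000", "times_cited": 1},
]

def merge_variants(refs):
    """Sum times cited over variants that differ only in page number."""
    totals = defaultdict(int)
    for ref in refs:
        key = (ref["author"], ref["journal"], ref["volume"], ref["year"])
        totals[key] += ref["times_cited"]
    return dict(totals)

merged = merge_variants(variants)
print(merged)
```

Counted separately, the article shows 4 cites; merged, it shows 5. A rule this loose could also merge two different articles from the same volume, so the merged variants still need a manual check.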

Google Scholar went international on Jan 11th, when they added the Nordic languages Finnish, Swedish, Danish and Norwegian. Gary Price wrote on 11 Jan in Search Engine Watch Blog that just two languages are included, but when I check, all four are covered. See the following screenshots.

Searching kirjasto on scholar.google.fi:

Searching semantiska webben on scholar.google.se:

Searching uddannelse on scholar.google.dk:

Searching sykepleiere on scholar.google.no:

When you look at the screenshot for Swedish Google Scholar (scholar.google.se), you can see that the first hit is an article about the semantic web (swe. semantiska webben). I wrote the article for a computer magazine called Datormagazin. It’s not scientific in any respect and of course not peer-reviewed, but it has been cited, for example, by the master’s thesis (swe. magisterexamensuppsats) Ontologier i kunskapsorganisation by Irene Granström. Swedish master’s theses are not considered to be scientific. This is an example of the broad scope of indexing that Peter Jacso, for example, criticized in his evaluations of Google Scholar.

The relevance ordering of assessed and non-assessed research, lower-level student papers, preprints, popular science articles etc. is done by the ranking algorithm, but the breadth of Google Scholar compared to Scopus and Web of Science could be useful if users have the skills to assess the content themselves.

I also sent some questions to Anurag Acharya:

Lars: Could you give an example of Swedish publishers you work with? (In this case I wanted to know whether Google Scholar cooperates with Swedish publishers publishing works in Swedish.)

Anurag: We are not sharing a list of publishers at this time.

Lars: Did you work with any people fluent in Swedish to implement Swedish GS?

Anurag: No. Note however that scholarly articles are remarkably similar in structure across many languages and most of the issues are common.

Lars: How do you restrict a search to Swedish, Finnish, Danish or Norwegian documents?

Anurag: This is not possible at this time. We may add this in the future.

Lars: Now that the Swedish characters å, ä and ö are implemented, will you connect different spellings of author names to the same search? For example, would söderström also give hits on soederstroem? It’s not done with rantapää.

Anurag: We have implemented several cases of diacritical normalization. Would appreciate suggestions for others that we may have missed.
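The kind of spelling connection asked about above can be sketched in Python. This is a minimal illustration, not how Google Scholar does it: the transliteration table (å to aa, ä to ae, ö to oe) is a common convention that I am assuming here, and the function names are hypothetical.

```python
import unicodedata

# Assumed transliteration convention for the Swedish letters.
TRANSLITERATIONS = {"å": "aa", "ä": "ae", "ö": "oe"}

def strip_diacritics(name):
    """Drop combining marks: söderström -> soderstrom."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def transliterate(name):
    """Replace each Swedish letter with its two-letter form: ö -> oe."""
    return "".join(TRANSLITERATIONS.get(ch, ch) for ch in name)

def spelling_variants(name):
    """Return the spellings a search could treat as the same author name."""
    base = unicodedata.normalize("NFC", name).lower()
    return {base, strip_diacritics(base), transliterate(base)}

print(sorted(spelling_variants("Söderström")))
print(sorted(spelling_variants("Rantapää")))
```

A search that expands a query to all three variants would connect söderström, soderstrom and soederstroem, and in the same way rantapää, rantapaa and rantapaeae.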

Some further questions have not been answered. I will publish the answers here if I get them.

A surprised researcher at my university told my colleague a few days ago, after searching for her name in Google Scholar: I didn’t write that article! Her name is Berit Ardlin and her first-name initials are BI. Look at this screenshot from Google Scholar.

When you click the link you get the article, but with other authors. Check the reference in PubMed, for example. So how come? Google Scholar indexes the full text of articles (some from the proprietary web, some from the open web) up to a certain limit in KB. Full text is often visible in the search results because your search keywords exist in it. In this case BI Ardlin and the other authors M Braem, B Van Meerbeek, JE Dahl etc. should simply exist in the full text. But searching the full text of the article doesn’t give any hits on Ardlin. So where do these author names come from? From some other full text? Does anyone have a clue, or is it only Anurag Acharya at Google Scholar who has the answer?

At least the dots before the “false” author names and the dots after them (in front of the journal name) indicate that they come from some other text.
This is just one example of how the presentation of search results, a consequence of the full-text indexing ambitions, can sometimes be very confusing in Google Scholar.

In issue 3/4, volume 10, 2005 of Internet Reference Services Quarterly there are some articles about Google Scholar, which I have added to the page References to literature. The whole issue is co-published as the book Libraries and Google, edited by William Miller and Rita M. Pellen, Haworth Press, 2005.

With the polysearch engine maintained by Peter Jacso, professor at the University of Hawaii, it is possible to check whether Google Scholar finds more or fewer hits than the publisher’s native search engine. The publishers you can compare are Annual Reviews, Blackwell, Institute of Physics, Nature Publishing Group and Wiley InterScience. You can perform a full-text search or a title search. I would suggest that you use title search, because Google Scholar does not index the entire full text, only up to a certain limit in KB, as Anurag Acharya of Google Scholar told me in an interview in Copenhagen in September 2005.

As Jacso instructs, don’t use any search operators other than proximity (phrase) search with “ ”. I tried some searches, which gave me the following results:

Annual Reviews (Native) 2 hits
Annual Reviews (Google Scholar) 6 hits

“frontotemporal dementia”
Blackwell (Native) 24 hits
Blackwell (Google Scholar) 18 hits
Check a screenshot from the search.

“fuzzy logic”
Institute of Physics (Native) 8 hits
Institute of Physics (Google Scholar) 7 hits
Nature (Native) 45 hits
Nature (Google Scholar) 23 hits

“semantic web”
Wiley InterScience (Native) 8 hits
Wiley InterScience (Google Scholar) 13 hits
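The hit counts above are easier to compare as ratios. A minimal Python sketch using only the numbers listed above, keyed by publisher since not every search term is shown; the variable names are my own.

```python
# (native hits, Google Scholar hits) per publisher, as listed above.
hits = {
    "Annual Reviews": (2, 6),
    "Blackwell": (24, 18),
    "Institute of Physics": (8, 7),
    "Nature": (45, 23),
    "Wiley InterScience": (8, 13),
}

# Ratio above 1 means Google Scholar found more hits than the native engine.
ratios = {
    publisher: scholar / native
    for publisher, (native, scholar) in hits.items()
}

for publisher, ratio in ratios.items():
    verdict = "more" if ratio > 1 else "fewer"
    print(f"{publisher}: Google Scholar found {verdict} hits ({ratio:.2f}x native)")
```

In these few searches Google Scholar found more hits than the native engine for two of the five publishers and fewer for the other three. The sample is far too small to draw any general conclusion.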