Pros and cons of Google Scholar

As we ponder the ethics of Google’s tax minimisation tactics, it’s worth reflecting on how dependent the academic community has become on Google Scholar, now a major gateway through which to access academic publications.

“Pros and cons” can stand for producers and consumers (as well as “for and against”). After all, academic researchers are in the business of producing as well as consuming material available through specialised web search engines such as Google Scholar (GS). I don’t think I’m alone in thinking that GS really is changing research and writing practices.

Consumer-side

On the consumer side, GS is a citation index, enabling anyone online to access books and articles, and to see who subsequently cited each of those publications.

[Image: a weathered marble niche set into a brick wall, carved with books that have water coming out of them, and a stag’s head in the middle.]

So researchers can browse through threads of related publications. They can also see how often an article is cited, which provides clues to its significance in the field.

No doubt there are traps, particularly for students, but here’s a potentially positive game changer from the consumption side.

Leaky boundaries

You access GS through search words and don’t need to restrict your search to a particular subject area. So if you look up the word “melancholy” you’ll get references from literature, psychology, philosophy, art and politics, just on the first few pages. That’s got to improve communication between disciplines, shake up disciplinary differences, and may even reconfigure boundaries between disciplines. Something has to leak through.

I’m no longer surprised if a student essay on architecture references articles from media studies, psychology, engineering, and/or education. By most accounts good research is about making connections, and in the right hands GS operates as a kind of creative connection machine … more than a confusion machine.

Supply-side effects

Academic researchers are suppliers as well as consumers of online search content. Academic research and writing practices are changing in many ways, not least the way GS exposes the relationships between your research and that of others through citation data. Here are 10 supply-side effects.

  1. If you’ve published academic articles or books, then it’s interesting, and sobering, to see who’s cited your work, and you can follow up further leads or work on addressing critique. Often GS indicates zero citations for a particular publication. Prior to GS, most of us were blithely ignorant of such statistics. Now that we know, what do we do about it, if anything?
  2. Whether you like it or not, anyone with access to a browser can see citation data about anything you’ve published. Non-science peer review forums, research assessment and grant awarding bodies, and promotions committees reject the idea that citation counts indicate quality or significance. But at least at a personal level, those little citation numbers appearing under GS search items are hard to ignore. However reliable they are, what do they mean for the researcher?
  3. There are also summative “scores” such as the H-index to conjure with. If you have 10 publications each cited in at least 10 other publications then your H-index is 10. For “10” substitute the variable x. Whatever your maximum value of x, that’s your H-index. GS can calculate this for you, and publish it on your own GS author page (or user profile) if you want to create one, along with a corrected and edited list of just your publications. Researchers into research practices have shown that this figure can be manipulated. How do we use or resist such indices?
  4. After using GS for a while you can see the huge difference in publication and citation patterns across different disciplines. Books are popular in the humanities, journal articles in the sciences. Articles in science journals can have larger numbers of authors than in the humanities, there are more outputs per academic researcher, and the citation counts are generally higher.
  5. Does a publication that’s never cited have no influence? There’s a time lag between publication and citation. Perhaps publications can exert influence in ways other than by being cited, such as influencing students, or the influence can extend to non-academic professional spheres.
  6. We should be sceptical of citation counts as a measure of quality. Just because an article is cited frequently doesn’t mean it’s any good, and there’s always the so-called Matthew effect. Popular papers get cited because they are popular, amplifying the numerical difference between those at the top and those at the bottom of the pile.
  7. With a bit of analysis, GS exposes academic cliques who always, and only, cite each other.
  8. GS also exposes authors who repeatedly cite the same sources, the usual suspects, and ignore the intermediaries, ie other less well known authors who also critique and review those sources. I’ve observed this in humanities and cultural studies areas. So Roland Barthes, Susan Sontag, Martin Heidegger, and Luce Irigaray may be generously referenced, but not other scholars who have deployed, critiqued, developed, refuted or applied such material. Perhaps changes in search practices will bring this material to light. There are fewer excuses now for ignoring the wider community of scholars.
  9. In a way, GS provides a means of sniffing out the marginal, the un-cited, the otherwise referenced, and avoiding the popular and the obvious — hopefully opening new avenues for exploration. It also reveals topics that are “under researched.”
  10. GS enables scholars to present their own GS page that lists only their own publications, disambiguated, canonic, up-to-date, and with links to the pages of co-authors. Add this medium for personal academic profiling to other relevant social media tools such as Academia, Mendeley, LinkedIn, Facebook, Google+, and Twitter and that’s a powerful network for letting the world know what you are up to. On the down side, it’s an extra burden to maintain this relentless online presence, and it renders the research enterprise somewhat individualistic rather than group-based. The tools primarily exist so you can profile yourself, not the team. It also floods the info-sphere with yet more stuff. Is it all needed?
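
The H-index rule in point 3 is easy to check by hand, but a short sketch makes the "maximum value of x" step explicit. This is only an illustration of the definition given above, not a description of how GS itself computes the figure; the function name and the citation counts are made up for the example.

```python
def h_index(citations):
    """Return the largest x such that x publications have at least x citations each."""
    # Rank publications from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` publications all have at least `rank` citations.
            h = rank
        else:
            # Counts only get smaller from here, so no larger x can work.
            break
    return h


# Hypothetical citation counts for one author's six publications.
example = [25, 8, 5, 3, 3, 0]
print(h_index(example))  # prints 3
```

Note how the single paper with 25 citations contributes no more to the score than a paper with 3 would: the index rewards a spread of cited work rather than one popular publication.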


Famous New Yorker cartoon by Barney Tobey: “Too bad about old Ainsworth. Published and published, but perished all the same.”

Notes

  • Access to academic journals will probably change in other ways as open access policies come into play. See Finch, Janet. 2012. Accessibility, sustainability, excellence: How to expand access to research publications (Report of the Working Group on Expanding Access to Published Research Findings). PDF.
  • Citation indexes have been around for years, but Google Scholar looks like a regular web search engine. If you don’t already know — you can enter loose-fitting search terms, and something is bound to show up. You don’t have to fiddle around entering terms into author, title and subject fields, and you can even misspell words. The search results returned include a mini abstract of the found articles, and a number indicating the number of times that article is cited by other articles, along with links to those article descriptions.
  • Thanks to GS’s extensive and growing databases, and Google’s deals with publishers, you can read many of those articles online, and if you are operating from within a domain (via VPN) that subscribes to the right services, eg a university, then you probably have free access to even more articles. GS points you to books as well as articles, and even searches the full contents of many books. You may not be able to download the whole book, but you’ll at least see a short abstract showing a few sentences that include your search terms. Many books are partly available in preview mode through Amazon.
  • Unlike other citation systems, GS operates as a single access point to all these features, and if you take into account all the other resources of the web, including library search engines and document repositories, then you need hardly ever leave your desk, except to go out for a coffee, where you can do more of the same if the cafe has wifi.
  • There are limits of course. GS only refers to text-based outputs. It doesn’t process other highly useful research material such as blogs, private reports, exhibitions, creative works, designs, films, musical compositions, specialised archives, and experimental data, and you probably wouldn’t expect it to.
  • See the UK REF2014 guidance on how citation data is to be collected in assessing research in clinical medicine, biological sciences, chemistry, physics, computer science, economics, etc.
  • Scholars regard SciVerse Scopus, by Elsevier, as the more authoritative citation database. Not least, it discloses the journals that it has in its database. GS doesn’t. You need to be in a subscription service to even access Scopus’s basic search and metadata, and it’s less friendly than the immediacy provided by Google’s search engine, though perhaps it’s more powerful.
  • Also see blog posts: Design-led research, Secrets of writing for the web, Research is a grey area, Written any good books lately?, The quantification of the intellect, What has science got to do with it?, and Universities as interpretive communities.

Bibliography

  • Bartneck, Christoph, and Servaas Kokkelmans. 2011. Detecting h-index manipulation through self-citation analysis. Scientometrics, (87), 85-98.
  • Finch, Janet. 2012. Accessibility, sustainability, excellence: How to expand access to research publications (Report of the Working Group on Expanding Access to Published Research Findings). PDF.
  • Jacsó, Péter. 2005. Google Scholar: the pros and the cons. Online Information Review, (29) 2, 208-214.
  • Merton, Robert K. 1968. The Matthew effect in science: The reward and communication systems of science are considered. Science, (159) 3810, 56-63. PDF.
 

7 Comments

  1. bodyoftheory says:

    Thanks for another thoughtful post. I especially enjoyed your Point 8 on citation practices. I remember Don Ihde making a similar point (in Consequences of Phenomenology) in comparing citation habits even between continental and analytic philosophical traditions. He contrasts what he calls ‘vertical citation’ in the former (tracing a lineage of ideas back to the so-called ‘giants’) and the more scientific habits of the analytic philosophers who tend to cite ‘horizontally’ by referring to the work of their contemporaries. Although, in my experience, the spirit of common purpose this latter approach might suggest is often belied by the way the analytics seem to enjoy attacking each other in public. But overall I agree with your suggestion that by crossing subject boundaries GS supports – even encourages – interdisciplinary scholarship. For architects traditionally excited by just about everything, this could be both a blessing and a curse.

  2. Thanks Jonathan. It would be very interesting if architectural researchers adopted the practices of some other disciplines. My reading of the analytic philosophy literature is of a kind of precision, where one scholar will attempt to refute the arguments and conclusions of another, which indicates a kind of engagement you don’t see elsewhere in the humanities.

    The explosion in access, combined with the ever-increasing pool of publications available may indeed confound the ability of architectural scholars to focus.

    I like the vertical and horizontal metaphor.

  3. Zack says:

    Nice post – thanks!

    I was particularly interested in point 5. I have often wondered about the role that such search engines play in the formation of people’s impression and understanding of bodies of work. Also, when GS is (for many people – academics and students alike) the primary reference tool for finding academic literature, what inferences are people drawing about the influence and scope of people’s work based on the search results that they are presented with?

    Clearly, academics know to cast their nets wider than the results presented by GS but students often don’t and, as such, fall into the trap of only reading and thus citing the ‘usual suspects’. Obviously, it is certainly a good idea for people to get familiar with these writers/books/papers but it is often the case that they are the ‘usual suspects’ because they are seminal or perhaps contentious, for example. As such, some of the more recent, relevant and interesting critiques/reviews/analyses of such work may not be brought to the attention of the searcher due to the nature of GS’s search methods. I do worry that, for some students, Google’s authority as a search engine (or even THE search engine) may limit the extent to which they engage with literature, particularly recently published texts. On the other hand, the potential for stumbling upon related literature from different disciplines or sources that would not normally be encountered is exciting and certainly beneficial in the promotion of interdisciplinary research. I suppose we need to try to make students aware of such issues and how to go about making informed literature searches both within and out with their immediate field of study…

    1. Thanks Zack. Good points. Looking up “Google Scholar” in Google Scholar yields a number of articles where authors review the order in which the search engine presents results — as I recall based on numbers of citations, proximity of search terms in the text, and no doubt other factors and biases. I think the main criticism is that the algorithms are not transparent.

      Key word search is obviously an important skill requiring good facility in English (if that’s the search engine language) and an understanding of the nuanced meanings of words in varied contexts. On numerous occasions I’ve undertaken web searches with students who had claimed they couldn’t find any literature on their subject. It really is a case of knowing the best search terms and quickly and iteratively modifying search.
