(This is part of my Life after Google project which, as you’ll recall, allows me to use Google for the purposes of illustrating a point. So, don’t start ragging on me because I used Google.)
If you’ve taught any library instruction sessions over the past few years, you’ve probably had that helpful student who points out that “Google has everything!” I had That Guy this past Friday and he wouldn’t back down: library instruction is unnecessary because he can get everything he needs using Google and Google Scholar. “I don’t really need to use the library ’cause it’s all in Google anyway,” he said. Maybe you’ve had the same student in a recent class? Maybe you’ve had a faculty member or administrative-type express the same sentiment? Maybe you’ve given in to your anger and lashed out in a cardigan-bedecked fury, leaving behind a room of broken bodies covered in cat dander? Maybe not, but whatever the case, it sure is annoying, isn’t it?
So, how do we counter the popular belief that everything is in Google? Sure, we can talk about credibility, about the cost of subscriptions, about search engine optimization, about the difference between the Surface Web and the Deep Web…I’m sure you have approximately ninety bajillion responses to the Bill Mahers of the world. But you know what sticks? Numbers. If you really want to drive the point home that Google is only a moderately helpful research tool, why not quickly show your students that, far from being “everything”, a Google search returns fewer articles than a fairly standard library database? It goes like this…
When Friday’s student insisted that Google has “everything”, I decided to call him on his bluff. I looked him straight in the eye and coolly said, “Boy, I’m ’bout *this close* to smacking the taste out your mouth.” And, out loud, I said, “Want to put that to a test? What’s your topic?” “Alcoholism,” he replied. Now, this was the part of class before we talk about narrowing topics, so I indulged him in his overly broad topic. I pointed down the middle of the room and asked everyone on the left side of the room to go to Google and look up “alcoholism”. The students on the right were to go to the rather ordinary Academic OneFile database and do the same, limiting just to full-text articles. Here’s a screen capture from Google:
Notice, there are supposedly 5.07 million articles available. Wow. What does Academic OneFile have in full-text?
Academic OneFile has 5,272 academic journal articles, 3,531 magazine articles, 11,875 news articles, and 669 other sources at 8:13 p.m. on November 7, 2011. That’s a rather paltry grand total of just over 21,000 full-text articles. Crap. The Google kids are right: Google has everything! Needless to say, the students on the left felt vindicated…until I asked them to scroll to the bottom of the page and look at the next page of results. And then the next page. And the next page. On the smartboards in the front of the room I advanced through Google’s results ten at a time until we all got to this:
868 web pages. That’s it. Adding the omitted results brings it to an even 1,000. Now, about that 5.07 million? Maybe Google can reduce their figure by, oh, I don’t know, about 99.98%. Google may index more than five million web pages related to alcoholism, but the search results are capped.
It’s as simple as that. If your students argue that Google has everything, show them that a basic library database offers more than 20 times as many full-text articles as Google will actually display. Even a Subject Search for ‘Alcoholism’ yields more than 13,000 articles. Heck, the narrow subject of ‘Alcoholism, Genetic Aspects’ has 657 articles, nearly as many as the 703 results Google displays for ‘Alcoholism and Genetics’. I’m telling you, letting the students see these numbers for themselves can quickly sway them back towards the library. Add in the cherry that they won’t have to worry about whether the library source is acceptable as one of their minimum of 15 sources, and you’ve got a compelling argument that will win over even the most die-hard Google fan.
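If you’d like to check the arithmetic before trying this in front of a class, here is a minimal Python sketch. It uses only the figures quoted in this post; the variable and function names are my own illustrative choices, and the 1,000 ceiling is simply what I observed for this one search, not a documented Google limit.

```python
# Figures quoted in the post (searches run on November 7, 2011).
# All names here are illustrative, not part of any library or API.
GOOGLE_REPORTED = 5_070_000   # result count Google claimed for "alcoholism"
GOOGLE_VIEWABLE = 1_000       # 868 displayed, 1,000 with omitted results (observed, assumed cap)

ACADEMIC_ONEFILE_FULL_TEXT = {
    "academic journals": 5_272,
    "magazines": 3_531,
    "news": 11_875,
    "other": 669,
}


def database_total(counts):
    """Add up the full-text counts across all source types."""
    return sum(counts.values())


def percent_reduction(reported, viewable):
    """Percentage by which the reported count shrinks to the viewable count."""
    return (1 - viewable / reported) * 100


def times_as_many(database_count, viewable):
    """How many times larger the database count is than the viewable results."""
    return database_count / viewable


if __name__ == "__main__":
    total = database_total(ACADEMIC_ONEFILE_FULL_TEXT)
    print(f"Academic OneFile full-text articles: {total:,}")
    print(f"Reduction from reported to viewable: "
          f"{percent_reduction(GOOGLE_REPORTED, GOOGLE_VIEWABLE):.2f}%")
    print(f"Database vs. viewable Google results: "
          f"{times_as_many(total, GOOGLE_VIEWABLE):.1f}x")
```

Swap in the numbers from your own topic and database; the comparison only holds up if you count what students can actually page through, not the headline figure.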
That is, of course, assuming the Google fan is relatively inexperienced in academic research. With an experienced understanding of how to manipulate Google results, you can get some amazing things. Try playing A Google A Day if you don’t believe me. An experienced researcher knows how to tweak filters, pick the right keywords, and get freaky with the Boolean operators. The trick I’m suggesting isn’t for them; they already know that Google has a lot, but it doesn’t have everything. The trick I’m suggesting is for the novice researcher. It’s for library instruction classes, not one-on-ones with faculty and graduate students. It’s for students with broad, Freshman-level topics. It’s just a rhetorical trick designed to call into question the commonly held belief that you can find more in Google than in the library. And, as a rhetorical device, it introduces valuable questions. Why does Google cap their results? How useful is it to have millions of results? How does Google decide which 1,000 results to display? Sure, Google may have 50 billion pages indexed, and you may find websites on just about everything, but sometimes it’s nice to be able to show that, from a practical standpoint, the library has more.
[…] library and I am the “official” research source for most family and friends, so I found this post in the blog Sense & Reference by Lane Wilkinson to be a compliment to our first Chutes and […]
Not to mention Google’s (and others’) “invisible algorithmic editing of the web”: http://www.ted.com/talks/eli_pariser_beware_online_filter_bubbles.html
This is excellent. I can’t wait to use it the next time I’m dealing with a “That Guy”
During the last National Library Week, I gave a talk which stressed the point that the research that culminated in my recent book, while aided by some digital tools, could not have taken place without the existence of physical stacks in a real library: http://www.youtube.com/watch?v=LwV9_RLCBAA
Of course, it’s a scholarly work, consisting of translations from the 1930s French debates about Christian philosophy, with an extensive introduction and chronological bibliography — so the average undergrad might well say: so what? But, perhaps many of those “that guy”s are in fact interested in how we produce and extend knowledge, so perhaps that sort of testimonial could possess some persuasive force
[…] Google has all! (but the library has more!) | Sense & Reference […]
Very nice. I knew about the capped search results in Google, but it never occurred to me to use it this way. I am going to steal this example. I imagine, though, that someone might say: who cares if Google shows only 500 or 1,000 results? Nobody is going to look past the top few results, and all but the most obsessive researchers and librarians probably stop at 100 or so. I am curious about how the Google capped searches work. What happens if you drill down further by, say, selecting “This week”? It’s not going to just drill down from the top 703 (your example), is it?
Aaron, that’s a really good point that I should have made clearer. The vast majority of Google users tend to select from the first page of results and then consider the search completed. So, in practical terms, Google really only provides 10 results that are likely to be seen. Academic databases (and other search engines) suffer from the same “10 and done” search mentality, so I think it’s a universal problem we should be addressing: how does Google rank search results? How do other research tools rank results? As to filtering and drilling down in Google, that really is the best way to go about web searching, and I admit as much in class. My in-class example is only directed at novice researchers who probably aren’t at the level of constructing initially optimal search conditions. Students in my upper-level classes tend to realize the limitations of search engines and we do discuss more advanced search strategies to make the most of Google.
[…] https://senseandreference.wordpress.com/2011/11/10/google-has-everything/ […]
This is an excellent article, Lane. I have shared it with my library managers, school librarians that we serve, my library peers throughout Alberta, and also with The Alberta Library. Thank you for providing an article with such a clear example.
No problem Jim. I’m glad I could be of service.
I loved your description of vengeance upon “that guy.” Also, excellent points. I’m off to conduct my own Google experiment as we speak.
Let me know how the experiment turns out.
Really thought this post was great and really proves the point that Google is not the end all and be all of research. It barely scratches the surface of resources and offers limited results that may or may not be helpful.
This is a great idea but, sad to say, it doesn’t work for me, on my own search or on your example search. I’m searching google.com.au. I also tried Scholar but didn’t get to an end point in 30-some pages. I also get 27 million, not 5 million, for your alcoholism example. A while back I tried a similar experiment after seeing a TED talk about the personalisation of Google results (Eli Pariser: Beware online “filter bubbles”), comparing a friend’s results on the same search, and this didn’t work either. Is .au saving me?
Thanks MAT, this is fascinating! I searched Google Australia and, like you, retrieved 27 million results, though with only the first 771 available. Google Scholar seems to cut off at 100 pages for me. I’m off to figure out what’s going on!
Great activity! I am constantly frustrated by students who rely on the Top 10 Google search results and fail to pick up something called a book. I think we need more fun and hands-on sessions for students to become familiar with library resources, both online and offline.
Nice activity, however, with some possible flaws. Like you said, it’s OK for a chat with a novice or a student.
1. Google does not show you all the results, in the interest of performance. Would a user rather wait a couple of minutes to get back all the results, including those that they would never care about, or would a user prefer to see the closest answer to their question immediately?
2. If you refine the search, the algorithm does not refine only the first few pages, but the entire result set. So that’s clearly not a limitation.
3. Reiterating point 1, independent of performance, users stick with something based on trust. How often have you, as a librarian, used Google, and how often have the first few hits answered your question? Google actually does all the heavy lifting for you by employing its fantastic relevance algorithm.
4. And with Google now adding the ‘Circles’-based social trust factor to its results, people will rely on the results even more.
Let’s all just be aware that this is a totally losing battle …
Naresh, I agree with you in part. Yes, Google caps results in the interest of performance and there are meaningful ways of refining the initial search parameters. Of course, every time we refine our initial search syntax and create a more complex query, we are tacitly admitting that the Google algorithm is insufficient for our purposes. Put another way, the Google algorithm is more often in need of correction than we would otherwise like to admit.
You also ask whether I, as a librarian, trust the first few results that appear in Google. This question is a bit of a red herring. You see, the vast majority of searching in Google is for simple, known quantities. We search for websites whose URLs we’ve forgotten, telephone numbers for colleagues, articles in Wikipedia or the Stanford Encyclopedia of Philosophy, newspaper headlines that include a specific word, etc. On the other hand, Google is of marginal utility in academic research. What is the most frequently used definition of “information” in the field of information science? How do theories of knowledge affect library praxis? These sorts of questions cannot simply be Googled.
Finally, the addition of ‘Circles’ is little more than an implicit and fallacious appeal to popularity masquerading as improved search behavior. I don’t want to know what most people believe, I want to know what’s right.
So, no, I don’t see a losing battle. I don’t see a battle at all. Google is just one way to perform a particular type of research. (Incidentally, I haven’t used Google in over a month now. Yahoo and Blekko are more than adequate.)
Great experiment!
Something funny seems to happen.
Searching “corrosion” will result in 845 links.
Is there something special about 850 hits?
I don’t know that there’s anything special about 850. Though I haven’t done enough test searches to make any generalizations, it does seem that a lot of search results are first capped around 800-850 (and 1000 when you add omitted results).
[…] – Musings about librarianship. One of Aaron’s posts from 2011, which originally came from Lane Wilkinson’s Sense & Reference blog (2011), was titled 3 things to show at library sessions and made me realise that, apart from some of the […]