In the days before the internet, libraries were a much more important source of free information for many people. What they lent, when and where from was, until recently, recorded by the Public Lending Right (PLR) body in the UK.
I (vaguely) remember the data that the PLR used to collect being used by the media as a gauge of public interest in a particular topic, book or genre. However, as it was by its nature out of date by the time it was published, it was of limited use.
Google is our librarian now, and the data that it gathers tells us not only what people were interested in, but what in particular (a bit like being able to tell which specific paragraph of a book people were interested in, rather than just the fact they were interested in that book).
The data also tells us when specifically they were interested in that topic, where they were, if they were satisfied with the information they received and even what demographic they were likely to belong to.
Coupled with all of this extra information, we also know what they wanted last week, rather than having to wait six months. And, as the internet provides information on virtually any topic, we have a full-on data set that details virtually anything that anyone has ever been interested in knowing.
So what I’m trying to say is that the data collected by search engines and ISPs when you search for something online is a bit of a treasure chest. That little query string at the end of your URL may not mean much taken on its own (apart from to those closest to you), but when combined with many others, then aggregated, sliced, analysed and interpreted, it becomes one of the most powerful sources of potential knowledge and insight into human behaviour that we have today.
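To make the "query string" idea a little more concrete, here is a minimal sketch in Python of pulling the search phrase out of a handful of URLs and counting how often each phrase appears. The URLs, the `q` parameter name and the helper functions are all made up for illustration; real search engines and ISPs will structure and store their data differently.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def extract_query(url, param="q"):
    """Return the search phrase from a URL's query string, or None if absent.

    Assumes the phrase lives in a parameter called `q` (an assumption for
    this sketch; the parameter name varies between sites).
    """
    values = parse_qs(urlparse(url).query).get(param)
    return values[0].strip().lower() if values else None


def count_queries(urls, param="q"):
    """Count how often each search phrase appears across many URLs."""
    phrases = (extract_query(u, param) for u in urls)
    return Counter(p for p in phrases if p)


# Hypothetical URLs, invented for this example only.
sample_urls = [
    "https://search.example.com/search?q=flu+symptoms",
    "https://search.example.com/search?q=Flu+symptoms&hl=en",
    "https://search.example.com/search?q=jobseekers+allowance",
    "https://search.example.com/about",
]

counts = count_queries(sample_urls)
print(counts.most_common(2))
# [('flu symptoms', 2), ('jobseekers allowance', 1)]
```

One query on its own says very little. It is the counting across many of them, and then slicing by time or place, that starts to show a pattern.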
In my opinion, that knowledge is largely untapped due to a lack of awareness that it exists, the expense of getting more than a limited view of the data through third-party tools and, perhaps, the lack of high-profile use cases.
There are, however, some researchers and organisations that are making use of this data, mostly in health, finance and marketing, paving the way for others.
I’ve made a start in listing them here. If you know of any other examples, feel free to comment and I’ll add them into the list.
1. Predicting unemployment
In March 2013, four academics from Beijing’s Renmin and Tsinghua universities published a paper detailing how search engine data had outperformed traditional methods of predicting unemployment.
Similar results were achieved by German researchers from the University of Bonn in May 2009.
2. Knowing when people are abusing drugs
In November 2012, the journal Clinical Toxicology (Philadelphia) published a paper detailing how internet search data could be used to detect outbreaks of people abusing drugs known as “bath salts”.
3. Measuring public awareness of Erectile Dysfunction
BJU International, the journal of the British Association of Urological Surgeons, published a paper in December 2012 looking at public awareness of erectile dysfunction in Ireland following a series of public awareness campaigns.
4. Predicting outbreaks of Dengue fever
In August 2011, a paper in PLOS Neglected Tropical Diseases concluded that “Internet search terms predict incidence and periods of large incidence of dengue with high accuracy and may prove useful in areas with underdeveloped surveillance systems.”
5. Predicting outbreaks of the flu
6. Making loads of money from the stock market
Okay, so there’s a little bit of supposition in that, but there have been studies linking search data to stock market activity and, if anyone knows how to use data to make money, it’s got to be stockbrokers, right?
7. Helping computers understand humans
Microsoft looked at using search data to help machines understand human speech in this paper.
8. Predicting house prices
A study by researchers from MIT said: “We found evidence that queries submitted to Google’s Search Engine are correlated with both the volume of housing sales as well as a house price index.”
9. Knowing when we’re more likely to spend
10. Selling you things online
Google and other search engines have long made their search data available to advertisers, who use it to research what their website visitors are most likely to search for and so shape their ads, content and even website architecture accordingly.