Prof. Mike Thelwall’s interesting and engaging lecture saved the day despite the catastrophic IT disruption in AC108 during the first session of the Research module. Students could not log in to the wireless network, nor to the VLE Blackboard. All the materials (slides and documents) I had placed in the VLE were unviewable because no PDF viewer or Office suite was installed on any of the computers (and downloading the PDF and PowerPoint viewers took too long over the slow and flaky wireless network). The whole session fell behind schedule. Despite all this, Mike managed to engage everyone in the classroom with a brilliant introduction to a range of research tools for collecting and analysing Internet data.
He talked about how he used LexiURL to analyse the web environment of the Cake Wrecks blog and the BBC World Service. The audience was curious about what the sizes of the nodes and the distances between nodes meant in those social network graphs.
BlogPulse and Google Trends proved to be popular. Searching for “Michael Jackson” on BlogPulse, one could see a dramatic difference in the volume of discussion between the day before he passed away and the day after. These tools were very useful for time-oriented analysis and comparison. As Thelwall remarked, “only blogs can allow you to do this kind of retrospective analysis.” A participant also recognised the value of Twitter: “the data is quick and dirty but it’s the best you can get in real time.”
Google Trends is based on the frequency of search terms entered into Google, but unlike BlogPulse, users of Google Trends cannot click through to the searched content to see why people made those searches. Search terms are also language-specific: users need to know the exact words others used in their searches in order to get useful and valid data.
Thelwall introduced an advanced Google search operator, “site:”, which restricts results to a single domain. For example, one can search:

site:twitter.com “John and Edward”
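Such domain-restricted queries can also be built programmatically. A minimal sketch in Python, assuming one simply wants to construct the Google search URL for the query above (the URL format is Google’s standard search endpoint; this is an illustration, not part of the lecture):

```python
from urllib.parse import quote_plus

# Domain-restricted query using Google's "site:" operator,
# as in the lecture's "John and Edward" example.
query = 'site:twitter.com "John and Edward"'

# URL-encode the query and append it to the Google search endpoint.
url = "https://www.google.com/search?q=" + quote_plus(query)
print(url)
# → https://www.google.com/search?q=site%3Atwitter.com+%22John+and+Edward%22
```

Opening that URL in a browser returns only pages from twitter.com that contain the exact phrase.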
But Thelwall reminded everyone that Internet data and Internet-based research methods are not without limitations. As a researcher, one has to ask oneself how representative the data is, and how valid, authentic, and truthful it is.
Our six keen and enthusiastic students did so well in the workshop that the day ended on a high note.
Kate and B looked into “Nick Griffin on Question Time” on YouTube. They found it difficult to figure out whether commenters were debating or agreeing with each other; the commenters did not seem to be in dialogue with one another, they found. I thought that was interesting because it suggested that using video comments to research debate and discussion about current affairs might not be an effective method. A research project needs a well-defined research question, effective and robust research methods, good-quality data, and sharp, deep analysis.
Brian did a search on “Sol Campbell”, the footballer, on Google Trends. The results were weird: the language most used for searches was Norwegian, while the region where most searches were done was the UK. And the search volume index was not available before mid-2003, while the news reference volume was. He did another search on “beckham” to see if the first was an exception. This time it was even weirder: the language most used for searches was Indonesian, followed by Norwegian, Vietnamese, Swedish, Danish and then English, and both the search volume index and the news reference volume were available before mid-2003. But there was no way we could solve the puzzle, because Google Trends is a black-boxed tool: there is no way of seeing how these results came into being, or what lies behind them.
Again, I thought this was interesting because it raised a serious methodological issue about the quality and validity of Internet data, and about the importance of transparency in web-based research tools (or any research tools). If researchers cannot see how a tool processes data and generates results, they cannot tell where things went wrong. A transparent, well-documented data collection and data analysis process is key to systematic, scientific research.
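One practical way to get that transparency is to keep an audit log of every collection and analysis step alongside the results. A minimal sketch in Python (the step names and parameters below are hypothetical, chosen to echo Brian’s Google Trends puzzle; this is my illustration, not a tool from the lecture):

```python
import json
import time

def record_step(log, step, params, result_summary):
    # Append a timestamped record of one collection/analysis step,
    # so the whole pipeline can be audited and re-run later.
    log.append({
        "time": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "step": step,
        "params": params,
        "result": result_summary,
    })

audit_log = []
record_step(audit_log, "search", {"tool": "Google Trends", "term": "Sol Campbell"}, "weekly index, 2004-2009")
record_step(audit_log, "compare", {"term": "beckham"}, "weekly index, 2001-2009")

# The log is plain JSON, readable by anyone who wants to check the process.
print(json.dumps(audit_log, indent=2))
```

With a black-boxed tool we still cannot see inside it, but at least a log like this records exactly what we asked it and when, which is the part of the process we can make transparent.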
Unfortunately, Google is not alone in black-boxing things (ironically, Google helps us discover more information, but what does it hide from us?). Many of the research tools social scientists use nowadays are proprietary software and not transparent enough. When results go wrong, there is simply no way to figure out what went wrong.
Anyway, this post is getting too long. To close, I’d like to thank:
Mike for delivering a very successful and well-received lecture and workshop;
the six MA students for being keen, enthusiastic, supportive and understanding;
others who participated in Mike’s lecture and asked a lot of interesting questions.
Because of you, I’m encouraged. Oh, yes. Definitely. I’m determined to do much better next time.