Big Data and Data Visualization

This week we are exploring big data through Graham et al.’s Exploring Big Historical Data: The Historian’s Macroscope. The main theme of the text is the digital tools that today’s historians can use (or are already using). Among these tools is Zotero, which lets users save and export citations and can be a lifesaver for researchers. On page 6 of Exploring Big Historical Data, Zotero is cited as a tool for finding commonalities: “Using a plugin, a little program or component that adds something to a software program, for the open source reference and research management software Zotero, Fred Gibbs at George Mason University developed a means to look at specific cases (e.g. those pertaining to “poison”) and look for commonalities…Through comparing differences in documents (using Normalized Compression Distance, or the standard tools that compress files on your computer) one can get the database to suggest trials that are structurally similar to the one a user is currently viewing.” This is one example of the tools historians are using to conduct big data research and get a better view of ‘the big picture’ in historical events. Tools such as Zotero (which I picked because I also use it in my work) have made it possible to conduct big data research without the headache-inducing amount of resources it would have required before open source tools were available.
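
To make the Normalized Compression Distance idea from that passage a little more concrete, here is a minimal Python sketch. It is not Gibbs’s plugin or anything taken from the book: the function names and the three short trial summaries are hypothetical, made up for illustration, and I am assuming the common formulation of the measure, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size in bytes. The compressor here is Python’s built-in zlib module.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Number of bytes the data takes up after zlib compression."""
    return len(zlib.compress(data, 9))


def ncd(text_a: str, text_b: str) -> float:
    """Normalized Compression Distance between two texts.

    Lower values suggest the texts share more structure, because
    compressing them together saves more space.
    """
    a = text_a.encode("utf-8")
    b = text_b.encode("utf-8")
    size_a = compressed_size(a)
    size_b = compressed_size(b)
    size_ab = compressed_size(a + b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


# Hypothetical trial summaries, invented for this example only.
trial_a = ("The prisoner was indicted for the wilful murder of her husband "
           "by mixing a quantity of white arsenic, a deadly poison, into "
           "his tea, of which he drank and soon after died.")
trial_b = ("The prisoner was indicted for the wilful murder of her master "
           "by mixing a quantity of white arsenic, a deadly poison, into "
           "his broth, of which he ate and soon after died.")
trial_c = ("The prisoner was indicted for stealing two silver spoons, a "
           "linen sheet, and a pair of leather shoes from the dwelling "
           "house of a merchant near the river on the ninth of May.")

if __name__ == "__main__":
    print("poison trial vs. poison trial:", round(ncd(trial_a, trial_b), 3))
    print("poison trial vs. theft trial: ", round(ncd(trial_a, trial_c), 3))
```

The two poisoning summaries should come out closer to each other than either is to the theft summary, which is the sense in which a database could “suggest trials that are structurally similar.” On texts this short the numbers are noisy, so treat this as an illustration of the idea rather than a research method.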

As Graham et al. state, “There are three issues of critical importance to understanding big data as a historian: the open access and open source movements, copyright, and what we mean by textual analysis” (p.38). While this quote outlines the topics a historian needs to understand when pursuing big data, it also hints at the potential these tools hold for a historian who can reimagine data in ways that catch a person’s attention. Until recently, information was reported in fairly uniform ways: in itemized tables and lists, with accompanying reports arranged by topic (and generally in chronological order). Now it has become acceptable to display data in new ways that can spark understanding in a variety of observers. Data is being analyzed and displayed in word clouds, line graphs, and scatterplots (at times using color to contrast different topics and their frequency). These new means of data visualization allow historians to reach many more people and make the data easier for users to comprehend.

An excellent example of big data research and data visualization applied to history is the Viral Texts project. The project has several components, including the Love Letter Exhibit, the Fugitive Verses edition, and a visualization of the network of “Viral Text” sharing from 1836-1899. Among these, I want to focus on the network visualization: this interactive graphic lets users zoom in and out of the image to get a better view of the hundreds of nodes, and select a single node from the mass of connections to see its information and the nodes it connects to. I think interactive visualizations like this can communicate more information than a written report ever could. The interactive component is far better at keeping a user’s attention (especially if the user is not a history major and may have come across the website by coincidence).

The Digital Humanities is a rapidly expanding field, though I don’t think everyone knows what to do with it. Projects such as the Viral Texts project give insight into what historians and others within the humanities can do to integrate their work into the ever-growing tech world. By incorporating tools such as Zotero, Tableau, AntConc, and Voyant Tools when publishing research, historians can better claim a platform in this digital age.