Tuesday 8 September 2015

Digital Humanities: From Source Criticism to Tool Criticism *

The political history of the county of Holland in the first half of the sixteenth century is rather well documented. There was a lively correspondence between the president of the Council of Holland in The Hague, Gerrit van Assendelft, and the regent and stadtholder in Brussels. It is one of those coincidences of history that we have these sources at our disposal precisely because stadtholder Anton van Lalaing frequently resided in Brussels (and not in The Hague). This rich correspondence, however, does place the historian, like the younger version of myself ten years ago, in a difficult position. Looking at history only through the eyes of Van Assendelft distorts the image. His correspondence is biased by his personal views on friends, enemies and relatives, and by his personal interests. Other opinions and views are hardly available, and when they are, for example when Van Assendelft was accused of corruption, heresy and nepotism, it is often difficult for the historian to assess which source is 'right'. Every student of history is therefore trained in proper source criticism and taught to approach the sources related to his/her subject as objectively as possible. Of course there are few people who would claim they can approach a document completely objectively. Everyone is shaped by his/her own time, location and surroundings and develops sympathy or antipathy towards his/her subject.

So far nothing new. A good historian will always look at his/her sources critically and be aware that perspectives, including his/her own, are subject to change. What is less obvious, and what people seem to be aware of only to a small extent, is that tools for digital historical research are also far from objective. Just like a historian, a tool gathers data and uses it to provide a synthesis, answer or visualisation. Just like a historian, a tool is filled with preconceptions and assumptions that can heavily influence the results of research. [1] If a tool always opts for a certain probability, for example that everyone without an exact date of birth lived before the twentieth century, this can be a useful filter for one research question but can have large and unwanted repercussions for another. This realisation has consequences: every tool a historian uses should be criticised like a fellow historian, or even like a (sometimes very sloppy) co-author or student assistant. This means that all choices made when developing a tool should be made explicit, and that ideally the complex algorithms which form the core of a tool should be understandable to the person using it. There are very few historians, however, who have the necessary technical expertise, at least that of a bachelor in computer science, to truly understand the finer nuances of computer code.
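To make the date-of-birth example concrete, here is a minimal sketch in Python of how such an assumption can sit unnoticed inside a tool. Everything in it is invented for illustration: the `Person` record, the function `lived_before_1900`, the parameter `assume_unknown_is_early` and the three sample persons do not come from any real tool or dataset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    birth_year: Optional[int]  # None when the source gives no exact date


def lived_before_1900(people, assume_unknown_is_early=True):
    """Return the names of people the tool places before the twentieth century.

    The default of `assume_unknown_is_early` is the built-in preconception:
    anyone without a known birth year is silently counted as 'early'.
    """
    selected = []
    for person in people:
        if person.birth_year is None:
            if assume_unknown_is_early:
                selected.append(person.name)
        elif person.birth_year < 1900:
            selected.append(person.name)
    return selected


# Hypothetical sample records, not taken from any real source.
people = [
    Person("Person A", 1850),
    Person("Person B", None),
    Person("Person C", 1950),
]

print(lived_before_1900(people))
print(lived_before_1900(people, assume_unknown_is_early=False))
```

With the default setting Person B ends up in the result; with the assumption switched off Person B disappears. A historian who only sees the output cannot tell which of the two happened unless the choice is documented or the code can be read.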

The question then is what can be done to bridge the gap between the historian and technology. The simplest answer, of course, would be that the historian must also become a computer scientist [2], or the other way around. Even though there will hopefully be more such hybrid academics in the future than there are now, it is unlikely that we will have thousands of them any time soon. One of my history teachers at university once said: 'A historian needs to be an amateur in every field.' Maybe it is enough to become an amateur in computer science as well. Traditionally, historians become amateurs in the fields of law, ancient languages, geography, archival science, art history, psychology, codicology and sociology. Computer science could simply be added to this list. Just like the other fields of study, computer science is an aid to interpreting all of the available data correctly.

It is still an open question what level of amateurism in computer science is acceptable for using digital tools wisely. Since digital humanities is still an emerging field, this question has many answers. Historians have used methods from other fields to various degrees over the centuries. The historian of a hundred years ago could not have predicted that statistics would now be a widely accepted skill for analysing historical material and that Latin would be becoming obsolete in many curricula. I would say that the necessary and (for now) sufficient conditions for using digital tools properly are: 1) the availability of detailed documentation of the choices made by the computer scientist, and 2) an understanding of how a computer scientist works and why he/she had to make certain choices. In other words: to a certain extent we need to master the languages of the computer scientist passively, which is also the level at which historians grasp most other fields. I read medieval French, have a basic knowledge of the work of the sociologist Bourdieu, and know what the legal terms in medieval verdicts mean. I would, however, never be able to speak medieval French (or even decent modern French), lack the knowledge to criticise Bourdieu's work, and have no clue whether a medieval verdict is in line with how justice was generally applied at the time ... and I get away with it.

To grasp how a tool works, historians therefore do not necessarily need to be able to convert a text to linked data, but they should be able to grasp, to a basic extent, how this process works and what RDF triples are. This would entail a cultural change, in which tools are treated not merely as household appliances but as the products of another academic field that need to be approached critically before you can use them. Historians often stop at asking themselves how a tool can help them answer their questions, while the importance of knowing how a tool is built and how it can be approached eludes them. Without such knowledge there can be no decent tool criticism, which will become increasingly important alongside the familiar source criticism.
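For readers who have never seen one, an RDF triple is simply a statement of the form subject, predicate, object. The sketch below writes a few statements from the opening paragraph as triples, using plain Python tuples rather than a real RDF library. The `ex:` names and the helper `objects_of` are made up for this illustration and belong to no existing vocabulary or tool.

```python
# Each triple is (subject, predicate, object).
# The "ex:" identifiers are invented for this example only.
triples = [
    ("ex:GerritVanAssendelft", "ex:presidentOf", "ex:CouncilOfHolland"),
    ("ex:CouncilOfHolland", "ex:locatedIn", "ex:TheHague"),
    ("ex:AntonVanLalaing", "ex:heldOffice", "ex:Stadtholder"),
    ("ex:AntonVanLalaing", "ex:residedIn", "ex:Brussels"),
]


def objects_of(subject, predicate):
    """Return every object recorded for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects_of("ex:AntonVanLalaing", "ex:residedIn"))  # ['ex:Brussels']
```

Note what the conversion did to the source: 'resided in Brussels frequently' has become a flat 'resided in Brussels'. Whoever modelled the data chose to drop the word 'frequently', and that is exactly the kind of choice tool criticism should bring to light.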

* This is a (bad) translation and slight adaptation of my blog post of 24 June 2014

[1] See the important article by B. Rieder and T. Röhle, 'Digital methods: Five challenges', in: D. M. Berry ed., Understanding Digital Humanities (2012) 67–84.
[2] Throughout the text, computer scientist can also be read as computational linguist.