e-Science 2008 4th IEEE International Conference on e-Science

Workshops & Special Sessions

e-Humanities—An Emerging Discipline


  • Peter Wittenburg (Chair), MPI, Nijmegen, The Netherlands
  • Laurent Romary, MPDL, Berlin, Germany
  • Sheila Anderson, AHDS, London, UK
  • Peter Doorn, DANS, Den Haag, The Netherlands
  • Tamas Varadi, Academy of Science, Budapest, Hungary
  • Steven Krauwer, University Utrecht, The Netherlands


In the Humanities, the availability of new digital technology and increasing amounts of digitized data have triggered the development of several novel research methods. The capability of creating and using large digital collections of structured and unstructured resources and the emergence of powerful algorithms for processing the data from multiple perspectives are already affecting all Humanities disciplines. However, to reap the full benefit of e-Science approaches, a number of issues that are specific to the Humanities must be addressed. The aim of this workshop is to address them.

In the past, many resources have been made available in digital form. These include texts and multimedia documents, but also a wide range of meta-data, from annotations of documents, via lexicons and taxonomies, to grammatical descriptions of many natural languages. Since these resources have been created independently, in the absence of standards for character encoding, file formats, annotation systems, access rights and IPR, they do not interoperate. Yet the full benefits of e-Humanities can only be had if independently created resources can be combined. Therefore, substantial work remains to be done to reach a situation in which each scholar can peruse the combined resources with the same ease as if they formed one homogeneous resource.

So far, only a fraction of the existing documents that are of interest to the Humanities has been digitized. The same holds for knowledge sources such as lexicons and grammars. Thus, we are seeing, and will continue to see, projects aimed at digitizing additional resources. To avoid the need for expensive repair measures to enable interoperability after the completion of these projects, standards for all levels, from character encoding to the semantics of meta-data, must be developed. Standardization activities are under way, but they are far from complete.

The distributed character of the resources, in combination with the local expertise that is needed to keep them up-to-date, naturally leads to a Data Grid. The enormous amount of computation necessary for advanced automatic pattern detection and other machine learning techniques gives rise to the need for Grid Computing. Both aspects of Grid-based processing are likely to pose special requirements related to the type of data, the type of questions that scientists ask, and the access rights.

The specific questions addressed in the Humanities and the specific types of data that are of interest require the development of dedicated algorithms. Even if these algorithms can be adapted from related disciplines, there is still a large amount of work to be done before the toolbox for e-Humanities research is reasonably complete and before existing tools can easily be combined into workflow chains by humanities scholars who are not technical experts.

e-Humanities can only be successful if it is possible to provide computer tools that support scholars in their research, rather than force them to spend lots of time learning how to use new tools or, even worse, developing new tools. To prepare researchers for using the emerging e-Humanities tools, novel courses must be developed for undergraduate and graduate programs. However, even the best possible education cannot compensate for bad design of the tools. Therefore, the e-Humanities toolbox must come with an excellent user interface.

Date and Time

Wednesday, December 10, 10 a.m.–12:30 p.m., Reconvening 1:30–6 p.m.


  • 9 a.m.: Keynote talk
  • 10 a.m.: Introduction to the Workshop (P. Wittenburg)
  • 10:15 a.m.: No Claims for Universal Solutions (T. Blanke, A. Aschenbrenner, M. Küster, and C. Ludwig)
  • 11:15 a.m.: Coffee Break
  • 11:30 a.m.: Managing and Integrating very large Multimedia Archives (D. Broeder, E. Auer, M. Kemps-Snijders, H. Sloetjes, P. Wittenburg, and C. Zinn)
  • 12:30 p.m.: Lunch Break
  • 1:30 p.m.: The e-Linguistics Toolkit (S. Farrar and S. Moran)
  • 2:30 p.m.: Visualization of Dialect Data (E. Hinrichs and T. Zastrow)
  • 3:30 p.m.: Putting Data Categories in their Semantic Context (M. Kemps-Snijders, M. Windhouwer, and S. Wright)
  • 4:30 p.m.: eAQUA—Bringing modern Text Mining approaches to two thousand years old ancient texts (M. Buechler, G. Heyer, and S. Gründer)
  • 5:30 p.m.: Discussion and Conclusions
  • 6 p.m.: End workshop & start poster session

