Guiding Theme B1: Structured summaries of complex contents

Guiding theme B1 addresses structured summaries, which provide an overview of and access to documents covering a specific facet (e.g., a certain topic, person, or event). The motivation is that summarizing multiple documents in a brief running text inevitably means cutting interesting details of individual documents. Users of an automatic system providing structured summaries should be able to explore “Big Data”, i.e., a document collection, using a hierarchically ordered “table of contents” across all documents. They should receive detailed information on the specifics of each document (e.g., as a list of keywords or as a short text snippet). Erbs, Gurevych, and Zesch (2013) give an introduction to structured summaries for single documents.
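
To make the idea of a hierarchically ordered “table of contents” with per-document details more concrete, the following is a minimal Python sketch of one possible data structure. It is an illustration only and not a description of any AIPHES system: the names DocumentEntry, TopicNode, and render, as well as the sample collection at the bottom, are hypothetical and chosen purely for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocumentEntry:
    """Specifics of one document: a title plus keywords or a short snippet."""
    title: str
    keywords: List[str] = field(default_factory=list)
    snippet: str = ""


@dataclass
class TopicNode:
    """One heading in the hierarchical table of contents (hypothetical type)."""
    label: str
    documents: List[DocumentEntry] = field(default_factory=list)
    children: List["TopicNode"] = field(default_factory=list)


def render(node: TopicNode, depth: int = 0) -> List[str]:
    """Return the tree as indented text lines, one heading or document per line."""
    indent = "  " * depth
    lines = [f"{indent}{node.label}"]
    for doc in node.documents:
        # Prefer the keyword list; fall back to the snippet if there are no keywords.
        detail = ", ".join(doc.keywords) if doc.keywords else doc.snippet
        lines.append(f"{indent}  - {doc.title}: {detail}")
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    # Invented sample data, used only to show the shape of the output.
    collection = TopicNode(
        label="Sample collection",
        children=[
            TopicNode(
                label="Topic A",
                documents=[DocumentEntry("Document 1", keywords=["keyword 1", "keyword 2"])],
            ),
            TopicNode(
                label="Topic B",
                documents=[DocumentEntry("Document 2", snippet="A short text snippet.")],
            ),
        ],
    )
    print("\n".join(render(collection)))
```

In this sketch, each node is a heading that can hold both documents and sub-headings, so a reader can descend from a broad facet to the keywords or snippet of a single document. How such a tree would actually be induced from a heterogeneous collection is exactly the open research question of this guiding theme and is not addressed here.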

A Ph.D. project that primarily follows this guiding theme is expected to conduct innovative research on creating structured summaries. This research will utilize keyphrase extraction, measuring semantic relatedness, and/or graph-based word sense disambiguation. A major challenge is the generation of structured summaries and navigation trees for a heterogeneous document collection covering different domains, genres, or text types. The applicant will work closely with researchers studying discourse graphs (guiding theme A1: Multi-document co-reference resolution for heterogeneous sources) and ranking methods (guiding theme C2: Methods for contextual and constraint-based ranking). She or he will also collaborate with researchers working on the other two guiding themes of research area B: Natural Language Processing for multi-document summarization, in order to consolidate the summarization-focused research activities of AIPHES. The compilation of evaluation data and performance metrics should be coordinated with research area D: Criteria and methods for quality assessment of heterogeneous sources and dossiers.

Poster (in German)

Example thesis topics

  • Ranking methods for entities in structured summaries of heterogeneous sources
  • Generating topic trees as tables of contents for document collections
  • Intrinsic and extrinsic evaluation for structured multi-document summaries of heterogeneous documents



