Context

While areas such as the syntactic and semantic tagging of isolated words have been much studied, and remain under intensive research, work on the syntax and semantics of texts is still at an early stage of development in NLP circles. In this context, and within the framework of our TextCoop project, we focus on the syntax and semantics of procedural texts. In our perspective, procedural texts range from apparently simple cooking recipes to large maintenance manuals. They also include documents as diverse as teaching texts, medical notices, social behavior recommendations, directions for use, assembly notices, do-it-yourself notices, itinerary guides, advice texts, savoir-faire guides, etc. Procedural texts follow a number of structural criteria, whose realization may depend on the author's writing abilities, on the target user, and on traditions associated with a given domain. Procedural texts can be regulatory, programmatory, prescriptive or injunctive.

Overview

The first step of the project, which has now been implemented as a prototype, aims at annotating the different discourse parts of procedures, outlining the structure of titles (goals), instructions, instruction compounds, warnings, advice and a number of rhetorical forms that describe the different facets of explanations.
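
As an illustration of what annotating discourse parts can look like, here is a minimal sketch in Python. It is not the TextCoop implementation (the prototype is written in Perl and its rules are not given here): the cue words, the pattern list, and the function names `tag_line` and `annotate` are all assumptions made for this example. It treats the first non-empty line as the title (goal) and labels each remaining line by the first surface pattern that matches.

```python
import re

# Hypothetical surface patterns, for illustration only; these are NOT the
# TextCoop rules. Order matters: the first matching pattern wins.
PATTERNS = [
    ("warning", re.compile(r"^(warning|caution|never|do not|be careful)\b", re.I)),
    ("advice", re.compile(r"^(we recommend|it is (better|advisable)|preferably|ideally)\b", re.I)),
    ("instruction", re.compile(r"^(\d+[.)]\s*)?(add|cut|mix|open|press|remove|screw|turn)\b", re.I)),
]


def tag_line(text, is_first):
    """Return a discourse label for one line of a procedure."""
    if is_first:
        # Assumption: the first non-empty line states the goal of the procedure.
        return "title"
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "other"


def annotate(procedure):
    """Label every non-empty line of a procedure with a discourse category."""
    lines = [line.strip() for line in procedure.splitlines() if line.strip()]
    return [(tag_line(line, index == 0), line) for index, line in enumerate(lines)]


if __name__ == "__main__":
    sample = (
        "How to change a tyre\n"
        "1. Remove the hubcap.\n"
        "Never work under a car held only by a jack.\n"
        "Ideally, park on flat ground.\n"
    )
    for label, line in annotate(sample):
        print(f"<{label}> {line}")
```

A line-by-line matcher like this cannot capture instruction compounds or the rhetorical structure of explanations, which span several units; it only shows the general shape of pattern-based annotation.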

A prototype, fully implemented in Perl, has been built and evaluated. The second step of the project is the investigation and implementation of advanced uses of this environment, including:

  • Customization of the software for concrete applications (in industry or for the general public); in principle, few resources need to be integrated,
  • Development of patterns for various languages (English, Spanish, Asian languages),
  • Development of oral dialogue and multimedia facilities, e.g. for help desks,
  • Procedural text annotation and enrichment via the definition of additional patterns; this may include adding annotations to make tasks more precise, or analyzing argument structures (e.g. tools, durations of instructions) as well as temporal and conditional expressions, etc.
  • Development of relatively feasible tasks: identifying prerequisites from instructions (tools, consumables, etc.), identifying the number of required participants, the approximate duration of the task (and possible idle periods), etc. This is done mainly through lexical inference and a close analysis of instructions and their connectors.
  • Development of advanced functions, including coherence and cohesion detection, procedure fusion, procedure simplification, constructing larger procedures out of simpler ones, zooming in on difficult instructions, development of additional warnings and advice, analysis of the difficulty and risks of a task, etc. Most of these tools involve both linguistic and reasoning aspects and must take domain specificities into account.
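
The prerequisite-identification task in the list above (tools, approximate duration) can be sketched as follows. This is a rough illustration under stated assumptions, not the project's method: the tool lexicon, the duration pattern, and the function name `prerequisites` are hypothetical, and a plain lexicon lookup stands in for the lexical inference the project actually relies on.

```python
import re

# Hypothetical mini-lexicon; a real system would use richer lexical resources.
TOOL_LEXICON = {"screwdriver", "wrench", "hammer", "whisk", "oven", "jack"}

# Conversion factors to minutes for the few units this sketch recognizes.
UNIT_MINUTES = {"second": 1 / 60, "minute": 1, "hour": 60}
DURATION = re.compile(r"(\d+(?:\.\d+)?)\s*(second|minute|hour)s?\b", re.I)


def prerequisites(instructions):
    """Collect tools named in the instructions and sum explicit durations."""
    tools = set()
    total_minutes = 0.0
    for instruction in instructions:
        # Exact lexicon lookup on lowercased words (no lemmatization, so
        # plural forms such as "wrenches" are missed).
        words = re.findall(r"[a-z]+", instruction.lower())
        tools.update(word for word in words if word in TOOL_LEXICON)
        # Add up every explicit "<number> <unit>" expression.
        for amount, unit in DURATION.findall(instruction):
            total_minutes += float(amount) * UNIT_MINUTES[unit.lower()]
    return {"tools": sorted(tools), "minutes": total_minutes}


if __name__ == "__main__":
    steps = [
        "Loosen the bolts with a wrench.",
        "Raise the car with the jack.",
        "Let the engine cool for 10 minutes.",
    ]
    print(prerequisites(steps))
```

The sketch only sums durations that are stated explicitly; estimating implicit durations, idle periods, or the number of participants would require the analysis of connectors and the inference mentioned above.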

This project has many applications: e-farming (project with Thailand), document enrichment (project with EADS), procedure fusion, etc.

Besides processing procedural texts, we pursue foundational work on explanation structure, namely the kinds of strategies that authors deploy to help and convince users of procedural texts. The aim is also to identify explanation schemas and their linguistic realizations so that they can be used in language production.

Contributors

Main Publications

  • Isabelle Dautriche, Patrick Saint-Dizier. A Conceptual and Operational Model for Procedural Texts and its Use in Textual Integration. International Workshop on Computational Semantics (IWCS 2009), Tilburg, January 2009.
  • Lionel Fontan, Patrick Saint-Dizier. Analyzing the explanation structure of procedural texts: dealing with Advices and Warnings, in: International Symposium on Text Semantics (STEP 2008), Venice, Johan Bos (Ed.), Association for Computational Linguistics (ACL) and book, series in Computational Semantics, September 2008.
  • Lionel Fontan, Patrick Saint-Dizier. Constructing a Know-How Repository of Advices and Warnings from Procedural Texts, in: ACM International Conference on Document Engineering, São Paulo, Dick Bulterman, Luiz Soares (Eds.), ACM, September 2008.
  • Estelle Delpech, Patrick Saint-Dizier. Investigating the structure of procedural texts for answering How-to Questions, in: Language Resources and Evaluation Conference (LREC 2008), Marrakech, European Language Resources Association (ELRA), May 2008.