LIRICS Software: Automatic Quality Control


Automatic Quality Control

Neil Newbold, Lee Gillam

Document Content Management System - Readability Tools

The Document Content Management System is a flexible package of integrated readability tools designed to enable automatic quality control. The system uses supporting resources and components for the standards development process, including a Plain English thesaurus, lookup of ISO TC 37 terminology provided by a terminology management system (TMS) via ISO 16642, automatic terminology discovery using statistical and linguistic techniques, and readability metrics. The work was undertaken as part of the EU-funded project LIRICS, in collaboration with project partners from international research institutions including INRIA (France) and DFKI (Germany).
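As an illustration of the kind of readability metric mentioned above, the following minimal sketch computes the widely published Flesch Reading Ease score. This page does not state which metrics the system implements, so the choice of formula is an assumption, and the function names (`count_syllables`, `flesch_reading_ease`) and the vowel-group syllable heuristic are hypothetical and not taken from the LIRICS software.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent 'e' as non-syllabic ("made"), but keep "-le" ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate text that is easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    print(flesch_reading_ease("The cat sat on the mat."))
```

A score of this kind gives a single number per document, which makes it easy to compare a draft before and after editing; the syllable heuristic is approximate, so scores should be read as indicative only.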




Installation



Benefits of the Document Content Management System



Process

The readability components have been integrated with the University of Sheffield's GATE system, which offers a set of reusable processing resources for common NLP tasks. These resources are packaged together to form ANNIE, A Nearly-New Information Extraction system. Existing GATE plug-ins from ANNIE were used for the preliminary tasks, leading into the newly devised readability processing resources. The Readability Analyser can be run at two separate points in the pipeline to either incorporate or ignore terminology. Once the process has completed, it can be repeated to refine the quality of the text. The pipeline of processing resources is shown in the diagram below:



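The idea of running an analyser at two points, once incorporating terminology and once ignoring it, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the GATE implementation: the metric (average word length), the treatment of terms as single tokens, and all names (`tokenize`, `average_word_length`, `analyse`) are hypothetical.

```python
import re


def tokenize(text, terms=()):
    """Split text into word tokens; known multi-word terms become single tokens."""
    # Longest terms first so that longer terms win over their substrings.
    for term in sorted(terms, key=len, reverse=True):
        text = re.sub(
            re.escape(term),
            lambda m: m.group(0).replace(" ", "_"),
            text,
            flags=re.IGNORECASE,
        )
    return re.findall(r"[A-Za-z_']+", text)


def average_word_length(tokens, ignore=frozenset()):
    """Toy readability measure (an assumption): mean token length, skipping ignored tokens."""
    kept = [t for t in tokens if t.lower() not in ignore]
    return sum(len(t) for t in kept) / len(kept) if kept else 0.0


def analyse(text, terms):
    """Score the text twice: before terms are marked up, and after, with terms excluded."""
    term_tokens = frozenset(t.lower().replace(" ", "_") for t in terms)
    return {
        "incorporating_terminology": average_word_length(tokenize(text)),
        "ignoring_terminology": average_word_length(
            tokenize(text, terms), ignore=term_tokens
        ),
    }


if __name__ == "__main__":
    sample = "The terminology management system stores terms."
    print(analyse(sample, ["terminology management system"]))
```

Comparing the two scores indicates how much of a text's apparent difficulty comes from its specialist terminology rather than from its general wording.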


Readability Tools


For further information on Readability Tools for GATE, please contact:
Readability Tools Support: Neil Newbold
Fax: +44 1483 876051

 
