Report to the Interactive Systems Grantees' Workshop
These slides were prepared to summarize the presentations and discussions
at the Language Resources Workshop, held 8/16/97, for the attendees
at the NSF ISGW, in a 15-minute presentation given on 8/18/97.
Along with the workshop background material and the Language Resources Primer referenced below, these are raw materials for the final workshop report that Mark Liberman and Ron Cole will write.
Your comments are welcome.

Language Resources Workshop
Skamania Lodge 8/16/97

Workshop goals:
What are "language resources"?
The relevance of evaluation in Language Engineering is increasingly recognized. This involves assessment of the state-of-the-art for a given technology, measuring the progress achieved within a program, comparing different approaches to a given problem and choosing the best solution, knowing its advantages and drawbacks, assessment of the availability of technologies for a given application, and finally product benchmarking. It accompanies research and development in Human Language Technologies, and has driven important advances in the recent past in various aspects of both written and spoken language processing. Although the evaluation paradigm has been studied and used in large national and international programs, including the US ARPA HLT program, EU Language Engineering projects, the Francophone Aupelf-Uref program and others, particularly in the localization industry (LISA and LRC), it is still subject to substantial unresolved basic research problems.

The good news
Generally available language resources in 1986:
How to satisfy them most efficiently and effectively?

What we did on Saturday
Morning presentations:
Some recurrent themes
1. Improved hardware/software leads to
3. Intellectual property rights will be a continuing problem, because
5. Still too many barriers to resource sharing among researchers!
Resources for Future Needs
A brainstorming exercise.
Three general areas:
For each goal, what key resources would be needed to support a research effort?
Saturday's goal was a preliminary set of examples,
I. Multimodal machine translation
A. Sample task definitions
1. text, speech, OCR in multiple languages >> text in English
2. computer-mediated multilingual communication
3. talking head in >> talking head out
B. Resources needed
Additional multilingual speech and text data (e.g. large VOA corpus)
Multilingual lexical resources, including lists of fixed expressions, proper names, and idioms
Annotated multilingual text (sense and structure)
Multilingual speech synthesis corpora
Resources for high-quality generation

II. Multimodal understanding
A. Sample task definitions
1. computer as moderator/facilitator of keyboard-based collaboration (email exchanges, IRC, and similar things)
2. meeting summarization based on video and audio monitoring
B. Resources needed
Coding schemes for useful annotation
Part of "American national multimedia corpus"?

III. Universal access
A. Sample task definitions
1. phone-based access to on-line information
Obviously useful technology, but controversial as key to universal access
2. speech and language technology applied in primary education
To teach language skills or in support of teaching other topics
B. Resources needed
Better corpora of human-human information access
Speech-language-interaction data from school children