Dec15-content "Current Consensus" (Dec 15, 2003)
IAMTC
Document created Dec 15 or before, summarizing IL0, IL1, IL2
Current Consensus
We distinguish a "theoretical" level from an "actual" level: the theoretical level covers ways of thinking about how to annotate, while the actual level is what we ask annotators to do.
CONSENSUS SO FAR ON THEORETICAL LEVEL
- There are three levels of annotation, which we will call IL0, IL1, IL2.
- IL0 is deep syntax as defined, for example, in Owen's annotation manual.
- IL1 is something like PropBank: a language-specific representation which disambiguates further. Specifically, we will have a lexicon of verb meanings that all annotators must consult. Once a verb meaning is determined, the annotator consults the frame for that meaning to determine the labels for the arguments. In this approach, the identity of the label set is secondary, and can be changed easily *after* annotation.
- IL2 is something (more) language-independent that allows inferencing.
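The IL1 lookup described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Frame` structure, the sense identifier `sell.01`, the neutral slot keys, and both label sets are hypothetical and are not taken from PropBank or from any project lexicon. The point it shows is that annotations store a verb meaning plus neutral slot keys, so the label set can be swapped after annotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One verb meaning and its argument slots (hypothetical structure)."""
    sense_id: str
    gloss: str
    slots: tuple


# Hypothetical toy lexicon that all annotators would consult.
LEXICON = {
    "sell": [Frame("sell.01", "exchange goods for payment", ("arg0", "arg1", "arg2"))],
}

# Two hypothetical label sets, kept apart from the annotations themselves.
LABELS_A = {("sell.01", "arg0"): "Seller",
            ("sell.01", "arg1"): "Goods",
            ("sell.01", "arg2"): "Buyer"}
LABELS_B = {("sell.01", "arg0"): "Agent",
            ("sell.01", "arg1"): "Theme",
            ("sell.01", "arg2"): "Recipient"}


def annotate(verb, sense_id, fillers):
    """Look up the frame for the chosen verb meaning and fill its slots in order."""
    frame = next(f for f in LEXICON[verb] if f.sense_id == sense_id)
    if len(fillers) != len(frame.slots):
        raise ValueError("number of fillers does not match the frame")
    return {"sense": sense_id, "args": dict(zip(frame.slots, fillers))}


def render(annotation, label_set):
    """Apply a label set to a stored annotation without changing the annotation."""
    sense = annotation["sense"]
    return {label_set[(sense, slot)]: filler
            for slot, filler in annotation["args"].items()}
```

For example, `annotate("sell", "sell.01", ["X", "Y", "Z"])` can be rendered with either `LABELS_A` or `LABELS_B`; the stored annotation is the same in both cases.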
ACTUAL TODOS FOLLOWING FROM THEORETICAL CONSENSUS
Assuming that we want to annotate along the lines of the theoretical summary above, this is what we would need to do to prepare the annotations. In addition to being in charge of one language, each site is in charge of one annotation issue.
- IL0: each site prepares a syntax manual for its language. Syntax site checks for compatibility and sanity.
- IL1: we choose a label set. Lexicon site starts with the PropBank English lexicon (=frameset) and augments/modifies it. As Source Language (SL) syntactic annotations come in, SL lexicons are devised by analogy with the English lexicon. We need a mechanism so that annotators can flag new verb meanings and send them to the lexicon tsar.
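One possible shape for the flagging mechanism mentioned in the IL1 item is a simple queue of pending requests. This is a minimal sketch, not an agreed design: the record fields and the names `SenseFlag` and `FlagQueue` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SenseFlag:
    """A request to add a verb meaning (hypothetical record layout)."""
    verb: str
    sentence: str
    annotator: str
    note: str = ""


@dataclass
class FlagQueue:
    """Pending flags awaiting review by the lexicon tsar."""
    pending: list = field(default_factory=list)

    def flag(self, verb, sentence, annotator, note=""):
        """Record that an annotator found a verb use not covered by the lexicon."""
        entry = SenseFlag(verb, sentence, annotator, note)
        self.pending.append(entry)
        return entry

    def by_verb(self):
        """Group pending flags by verb so duplicates can be reviewed together."""
        grouped = {}
        for entry in self.pending:
            grouped.setdefault(entry.verb, []).append(entry)
        return grouped
```

Grouping by verb lets the lexicon tsar see when several annotators have flagged the same verb before deciding whether a new meaning is needed.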
ISSUES
- Christiane Fellbaum argues vigorously against annotating with WordNet senses: experiments show that it cannot be done with sufficient inter-annotator agreement.
OPEN ISSUES
- IL0: Owen will have a look at an annotation manual by Alon L, and report on how and what should be incorporated from that (or maybe we should just switch to it).
- IL1: verb meanings will be defined by the lexicon, but these should be linked to some ontology. We also need to annotate nouns with respect to some ontology. So this ontology (or these ontologies) must be chosen.
- IL2: lots of work to do. Steve raises the possibility of using different ontologies. Here is a partial list of things that IL2 might normalize, but IL1 would not:
- Non-literal use of language (X started its business vs X opened its doors to customers)
- Conversives (X sold Y to Z vs Z bought Y from X)
- What do we actually annotate for March 04?
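To illustrate the conversives item in the IL2 list above, the following sketch maps two IL1-style annotations onto one shared representation. Everything here is assumed for illustration: the event name `transfer_for_payment`, the role names, and the slot keys are hypothetical, and no ontology has been chosen.

```python
# Hypothetical table: each verb points to a shared event type and states
# which IL1 argument slot fills which role of that event.
CONVERSIVES = {
    "sell": ("transfer_for_payment",
             {"arg0": "giver", "arg1": "goods", "arg2": "receiver"}),
    "buy": ("transfer_for_payment",
            {"arg0": "receiver", "arg1": "goods", "arg2": "giver"}),
}


def normalize(verb, args):
    """Map a verb-specific argument structure onto the shared event roles."""
    event, role_map = CONVERSIVES[verb]
    normalized = {"event": event}
    for slot, filler in args.items():
        normalized[role_map[slot]] = filler
    return normalized


# "X sold Y to Z" and "Z bought Y from X" as IL1-style argument structures.
sold = normalize("sell", {"arg0": "X", "arg1": "Y", "arg2": "Z"})
bought = normalize("buy", {"arg0": "Z", "arg1": "Y", "arg2": "X"})
```

Under these assumptions `sold` and `bought` compare equal, which is the kind of normalization IL2 would provide and IL1 would not.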
Version 2, Tue 06 Apr 2004 14:26:27 - created Tue 06 Apr 2004 14:25:56