
02-16-04 Notes

Here are some of the questions, issues, and (semi)answers from the phone call:

1. There is a good deal of confusion about assigning theta roles. We are looking forward to additional information about assigning these roles, and will try to get this on-line as soon as possible. It was mentioned that for our purposes right now, we only assign theta roles to the dependents of verbs. Adjectives and nouns may also have reasonable theta roles, but right now we are only doing this for verbs.

A particularly good example of the confusion was the coding of the infinitive phrase after a verb like "want": what is the theta role of "to go to the store" in "I want to go to the store"? The choices made ranged all over the map, from PROPOSITION to PREDICATE to GOAL to THEME.

Another question was about prepositional phrases / clauses, like "starting at 10 am". We decided that TIME was the appropriate role, and thus a clause might have more than one TIME role. The next question was what role "10 am" plays with respect to "starting". Again the answer seemed to be TIME, though that was less clear.

2. What to do about really vague senses. There are some Mikrokosmos senses that are very vague. This is in part due to a problem with CRL's algorithm that assigns concept nodes to lexical items. We suggested that if a sense is far too general for the concept, it should not be chosen. We may have to make some special arrangements for these particular Mikrokosmos nodes, which show up, for example, with adjectives.

3. It was emphasized that annotators ought to make a conscious decision to code that no mK or wn sense is appropriate: annotators need to check "dummy concept (wn)" or "dummy concept (mK)" when there is no appropriate sense of the right kind.

4. It was noted by several annotators that with the multiple-choice approach it was more difficult to maintain consistency over the same word appearing multiple times in a text. When only a single choice was allowed, it was easier to remember that choice and use it again. A device whereby it would be easy to copy annotations or to view previous annotations would be helpful.
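A minimal sketch of what such a device might look like, assuming annotations are recorded per word occurrence. The class name, method names, and the choice to match on the lowercased surface form are all hypothetical; nothing here reflects the actual annotation tool.

```python
from collections import defaultdict


class AnnotationHistory:
    """Hypothetical helper: remembers the senses chosen for each word in a text."""

    def __init__(self):
        # Maps a lowercased word to a list of (position, senses) pairs,
        # in the order they were recorded.
        self._by_word = defaultdict(list)

    def record(self, word, position, senses):
        """Store the senses chosen for one occurrence of a word."""
        self._by_word[word.lower()].append((position, tuple(senses)))

    def previous(self, word):
        """Return every earlier annotation of this word, for viewing."""
        return list(self._by_word.get(word.lower(), []))

    def copy_latest(self, word):
        """Return the most recent senses for this word, or None if unseen."""
        earlier = self._by_word.get(word.lower())
        return list(earlier[-1][1]) if earlier else None


history = AnnotationHistory()
history.record("Number", 12, ["phone number"])
suggestion = history.copy_latest("number")  # offered to the annotator for reuse
```

Matching on the surface form is only an assumption made for brevity; a real device would more likely key on whatever unit the annotation interface already uses.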

5. A question was raised about annotating compound nouns. The example was the phrase "phone number", which requires (as our current instructions go) two annotations, one for "phone" and one for "number". The issue was that there is a sense in Omega for "number/phone number". Some annotators annotated both "phone" and "number" with this sense. Others annotated "phone" as "telephone" and "number" as "phone number". To most, this seemed the preferable way to code. This also raised the question of whether other, vaguer but possibly applicable senses should be coded. Clearly the "phone number" sense is exactly right in the context, but other senses might be appropriate, like "set of digits" etc., particularly since we weren't coding "phone number" directly. We suggested that if there is one _clearly correct_ sense, choose that one alone, even if others are possibly applicable. The idea behind the multiple choice was to not make annotators decide between two equally applicable senses. If there's one that's clearly right and others that would be acceptable, go with the clearly right one.
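The suggestion above can be written out as a small decision rule. This is only a sketch: the `Fit` ratings, the `select_senses` function, and the fallback to a dummy concept when nothing applies (taken from item 3) are hypothetical names used for illustration, and the handling of more than one clearly correct sense is an assumption, since the call did not cover that case.

```python
from enum import Enum


class Fit(Enum):
    """Hypothetical annotator judgment of how well a sense fits the context."""
    CLEARLY_CORRECT = "clearly correct"
    APPLICABLE = "applicable"
    NOT_APPLICABLE = "not applicable"


DUMMY = "dummy concept"


def select_senses(judgments):
    """Pick the senses to code, given a mapping of sense label -> Fit."""
    clearly = [s for s, fit in judgments.items() if fit is Fit.CLEARLY_CORRECT]
    if clearly:
        # One clearly correct sense: code it alone, ignoring vaguer candidates.
        # (If several are rated clearly correct, all are kept: an assumption.)
        return clearly
    applicable = [s for s, fit in judgments.items() if fit is Fit.APPLICABLE]
    if applicable:
        # No clear winner: multiple choice avoids forcing a decision
        # between equally applicable senses.
        return applicable
    # Nothing fits: make the conscious "no appropriate sense" choice.
    return [DUMMY]


chosen = select_senses({
    "phone number": Fit.CLEARLY_CORRECT,
    "set of digits": Fit.APPLICABLE,
})
```

For "number" in "phone number", the rule keeps only the "phone number" sense and drops "set of digits", which matches the guidance given on the call.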

6. Another situation that arose was cases where the annotators were simply unable to determine what was being talked about at all. This may be due to translation problems. For example, there was a reference to "discrimination number" in one article. No one could quite figure out what that was, and so annotating for sense was difficult. One annotator found synonyms at "discrimination". Others coded "dummy concept". The suggested answer was that in such cases, if you really don't understand what's going on, code "dummy concept" and then make a comment about the reason for your choice.

In general, comments may be made on any lexical item. If you click on "Edit Role Assignments" (even for words that you are not assigning roles to, like adjectives, nouns, and adverbs), there is a line for Comments.

7. Another question was raised about English "light verbs" (this is a term used in Persian grammar), like "make" or "do" or "have", where there is not much _meaning_ to the verb, but much of the meaning derives from the remainder of the sentence. "doing lunch" is a lot different from "doing hair" or "doing the fox trot", and "making a choice" is a lot different from "making lunch". So there are lots of senses for these verbs, and it is not always clear which idiomatic sense you are faced with. We did not have a way out of this predicament at the moment, and may want to give some consideration to a solution.


Version 1, Mon 16 Feb 2004 17:30:05 [Helmreich]