03-29-04 Notes
annotator phone call

Here is a quick run-down of topics covered in the phone call:

1) Nizar reported on his check of whether annotators had finished their files and had coded all the words that were supposed to be coded. He suggested that each annotator look at the file that records those results.

2) Some problems with the tool were raised:

a) some red lines should be green -- this seemed to happen mostly when more than one copy of a sense was chosen. The tool treated this as a difference when the choices were really identical.

b) some green lines should be red -- this seemed to happen when the two coders' choices overlapped but were not identical. Scott at CMU will find a few examples and forward them to Nizar.

c) there seemed to be some difference in behavior of the tool under IE and Mozilla.
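The behavior expected in (a) and (b) can be sketched as a set comparison. This is only an illustration of the intended rule, not the tool's actual code; the function name and the sense identifiers are hypothetical.

```python
def agreement_color(senses_a, senses_b):
    """Return 'green' if two coders chose the same senses, otherwise 'red'.

    senses_a, senses_b: lists of sense identifiers (hypothetical strings).
    """
    # Converting to sets ignores duplicate copies of a sense, as in case (a).
    set_a, set_b = set(senses_a), set(senses_b)
    # Only exact identity counts as agreement; mere overlap is red, as in case (b).
    return "green" if set_a == set_b else "red"


# Case (a): a duplicated sense should not count as a difference.
print(agreement_color(["sense-1", "sense-1"], ["sense-1"]))
# Case (b): overlap without identity should count as a difference.
print(agreement_color(["sense-1", "sense-2"], ["sense-1"]))
```

Under this rule, the order in which senses were chosen would not matter either.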

3) It was reiterated that coders should complete all their annotation work before beginning reconciliation work. However, completed files should be posted to the wiki, without removing the initial incomplete files. This is for our evaluation purposes.

4) Questions were raised about the reconciliation process -- some were issues that had been raised before. Others were new:

a) How important is it to keep the part of speech of the word being annotated identical to the POS label of the senses chosen? Answer: keep the identity if at all possible. However, there can be mistakes in the parse tree of the input text, or there may be a morphological shift that is not recognized in Omega -- for example, almost any adjective can become a noun: rich --> the rich. This may not be reflected in Omega.

Suggestion: it would be nice if the reconciliation tool indicated both the POS of the text word and the POS assigned to each sense. (This latter information is probably not in Tred.) This would help make sure that the identity (or non-identity) of the POSs was known.

b) How important is it to stick with the automatic query to Omega generated by Tiamat? Answer: As above, try to work with the initial list provided by Tiamat. Only if no sense is adequate in the original list should you try to generate a different query. Of course, if the original word is not found, you will have to do this. Ways of doing this are suggested in the manual.

c) Coding of English "light" verbs -- "do", "have", "make" -- was often difficult, due to the large number of quite different senses.

d) There is no need for the annotators' final votes to be identical -- annotators may still disagree for a number of reasons.

e) You may not like any of the choices available at reconciliation. You may vote "No" on all senses if that is the case.

Suggestion: it would be nice to have a "comment" section where a new choice could be made, or other comments.
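The POS display suggested under 4a could amount to something like the following sketch, which lists the candidate senses whose POS differs from that of the text word. The function name, the POS labels, and the sense identifiers are all hypothetical; neither the reconciliation tool nor Omega is known to expose data in this form.

```python
def pos_mismatches(word_pos, sense_pos_by_id):
    """Return ids of candidate senses whose POS differs from the text word's POS.

    word_pos: POS label of the word in the text (hypothetical label set).
    sense_pos_by_id: dict mapping sense id -> POS label of that sense.
    """
    return [sense_id for sense_id, pos in sense_pos_by_id.items() if pos != word_pos]


# "the rich": the text word is parsed as a noun, but one candidate sense is adjectival.
candidates = {"rich-adj-1": "adjective", "rich-noun-1": "noun"}
print(pos_mismatches("noun", candidates))
```

A flagged sense would not be forbidden, since a mismatch can be legitimate, as in the "rich" example; it would only be made visible to the annotator.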


Version 1, Wed 31 Mar 2004 03:29:38 [Helmreich]