- The first text took 1 1/2 to 2 times as long to annotate as the last text.
- It would help out a lot if, when we annotated a certain word (such as "rate" from interest rate) within a text, it would annotate all such words within the text. That would save a lot of time and assure consistency within my own annotations.
- Having the texts up the day before we were to start was a great help (when it happened).
- My biggest concern, by far, is choosing the roles. Further explanations, better examples, or not doing them at all would be wonderful.
- I think while there is value in selecting any words that would fit, it would be better to have some limit. Either we select the ONE word we think is best (if we had to choose), or the top 2, or rank the words 1-3 in order of what we would choose, or something like that.
Tim Hackman, Univ. of MD:
- I agree that the theta roles are the most confusing part. It seems I only used a few out of the whole list (agent, theme, modifier, and occasionally perceived or possessed). The role definitions and examples were helpful sometimes, but in practice many situations weren't covered by them.
- Another theta role concern--it seems there are lots of unconnected words which are identified as needing roles (i.e., as connected to the word I'm annotating). Many of these are prepositions, connective words, and sometimes even punctuation.
- Using both MK and WN definitions is also confusing at times, as they seem to be concerned with very different things. WN definitions are very context-specific, while MK seems to deal largely with top-level concepts. For example, the most specific MK concept for "representative" is "HUMAN," while WN has half a dozen or more possible definitions.
- It is sometimes difficult to differentiate between WN definitions, especially when there are multiple identical concepts where only the theta role examples differ. ("Read," for example, has 6 labeled read<talk.) In this case, are all 6 correct, or do we just use the one with the closest theta role match?
- The use of <pro> is confusing. Should we assign them theta roles based on the word they are representing? Or should they not get roles at all? (For example, should the <pro> be given a subject role (agent, etc.) in a sentence without an explicit subject?)
- More guidance on dummy concepts was also needed, maybe a standard format and a preferred online dictionary?
Maria Pilar Milagros Garcia, CRL:
- I also agree that the theta roles are the most confusing part. The definitions and examples given were helpful most of the time, but it would have been better if all of the omega entries were given theta roles. Is there a reason why verbs such as "continue" in the last text should not have theta roles assigned to all the omega definitions? Who decides which entries come with theta roles assigned and which ones don't?
- It would be really helpful if, when we annotate a word that appears again later in a text, we could save our choices or, even better, if all occurrences of the word were automatically annotated once we have annotated it once. As Jeff said, that would indeed assure consistency within our own annotations. Is there a possibility that such a feature could be included in a revised version of Tiamat?
- I was also concerned about some of the words needing roles. Some words that, in my opinion, needed to be assigned a role were not even underlined and, surprisingly, punctuation and connectors were. I apologize for not being able to quote specific examples, though I recall having those concerns especially in the last text.
- Sometimes, the translations themselves posed problems. I only recall two problematic examples: "discrimination numbers" in the Korean text and "drought" vs "famine" or "2%" vs "5%" in the Hindi text. In the first case, the Korean translation, I had problems trying to understand what "discrimination" meant. In the latter example, the two sets of translations were different from each other, so I inferred that one of the translations was wrong. Who decides which translations to use? Does someone make sure that the translations are close to the original text before we actually annotate them?
Jad, UMD:
- I have a concern about the 'slow connection' version of Tiamat. Does it return the same values (terms) as the 'fast connection' version? I think this would make a big difference and threaten inter-annotator reliability.
- On the same topic, I am concerned that when using the 'slow connection' version of Tiamat, I get more WN words and definitions returned than MK ones. In addition, most of the time the MK words have {no definition available} or something of that sort. This resulted in me choosing fewer MKs and more WNs.
- When we use DummyConcept, are there specific criteria for what information to enter?
- I still think it would be more efficient and practical to be able to view the detailed information of the already annotated words in Tiamat. This would help me be more consistent in annotating the same words or meanings.
- I also agree with the others that theta roles are the most difficult, confusing, and time-consuming.
Version 8, Wed 31 Mar 2004 10:11:13 [Jad] - created Mon 01 Mar 2004 13:06:07 [Jeff Pomeroy]