
Sunday Notes

NP Exercise

what does the NP exercise show?

today, this year -- one or two concepts?

Japanese: one word for "today", for "this week", for "this year"?

2 ways: consistent: units + deictic unit, or always one

inconsistent: concept for today, then composition

ontology: has general mechanism, and some "shortcuts" such as "today"

bigram info (ISI): what MI cutoff?
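
To make the cutoff question concrete, here is a minimal sketch of scoring adjacent word pairs by pointwise mutual information, PMI(w1, w2) = log2(P(w1 w2) / (P(w1) P(w2))), and keeping pairs at or above a threshold. The function names, the toy corpus, and the cutoff value of 2.0 are assumptions made for illustration; the notes do not say how the ISI bigram data was computed.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return {(w1, w2): PMI in bits} for every adjacent word pair in tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_unigrams
        p_w2 = unigrams[w2] / n_unigrams
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


def candidates_above(scores, cutoff):
    """Return the word pairs whose PMI is at least cutoff, sorted alphabetically."""
    return sorted(pair for pair, score in scores.items() if score >= cutoff)


# Toy corpus, invented for this sketch.
tokens = ("the crown prince met the owner of the real estate firm "
          "near the real estate office this year").split()

# The cutoff is a hypothetical value; choosing it is the open question.
for pair in candidates_above(pmi_scores(tokens), cutoff=2.0):
    print(pair)
```

On small corpora PMI tends to overrate rare pairs, so in practice a cutoff is usually combined with a minimum frequency count.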

EH: middle ground: metaphorical meaning of one N -> non-compositional

Japanese: some cases of 1 word J, 2 words E (pattern A plus bad harvest)

Arabic:

Problem for Arabic: only synt issues, but prepositions must go in IL2

"e-commerce" type examples, become 2 words

Crown Prince <-> wli (local saint, successor) 3ld (Age) non-compositional in A --> crucial case

issue: compositional in language A as foreign term in language B ("joint venture" in Spanish)

issue: mistranslations

Hindi:

nothing interesting, but maybe effect of genre; issue of mistranslations

Korean:

service quality example: mistranslation, or inference, or synonyms?

Spanish:

real estate -- inmobiliaria

French:

bonne partie de l'année 1993

conjecture

Teruko summary

need way of inputting single concepts for certain languages where E has only two concepts

Omega: Lessons Learned from NP Exercise

Omega becomes merge of all languages, IL1 of all languages

Ed: should me move in Omega go IL2?

prototypical example: "this year"

2 questions:

1) IL1 annotation -- what to do when concept is missing?

2) IL2 annotation -- how is it annotated?

ad 1 -- we must allow annotators to add concepts in any case; allow them to relate concepts to other concepts; links introduced by annotator: "IS-A" and "unspecified"

(note: special issue about "this" vs open-class items)
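
The proposal under "ad 1" could be sketched as a small data structure: annotators may add a concept and relate it to existing concepts, but only with the two link types named above. The class names, the `added_by_annotator` flag, and the example concepts THIS-YEAR and YEAR are assumptions made for illustration, not part of Omega.

```python
from dataclasses import dataclass, field

# The only link types annotators may introduce, per the notes.
ALLOWED_LINKS = {"IS-A", "unspecified"}


@dataclass
class Concept:
    name: str
    links: list = field(default_factory=list)  # (link_type, target_name) pairs
    added_by_annotator: bool = False


class Ontology:
    """Hypothetical stand-in for the ontology; not the real Omega interface."""

    def __init__(self):
        self.concepts = {}

    def add_concept(self, name, added_by_annotator=False):
        # Adding an existing concept is a no-op that returns the stored node.
        if name not in self.concepts:
            self.concepts[name] = Concept(name, added_by_annotator=added_by_annotator)
        return self.concepts[name]

    def relate(self, source, link_type, target):
        if link_type not in ALLOWED_LINKS:
            raise ValueError(f"link type not open to annotators: {link_type}")
        if source not in self.concepts or target not in self.concepts:
            raise KeyError("both concepts must exist before they are linked")
        self.concepts[source].links.append((link_type, target))


# Illustrative use: an annotator adds a missing concept and links it.
onto = Ontology()
onto.add_concept("YEAR")
onto.add_concept("THIS-YEAR", added_by_annotator=True)
onto.relate("THIS-YEAR", "unspecified", "YEAR")
```

Flagging annotator-added nodes keeps them easy to find later if they need review.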

ad 2 -- topic for Monday

Ontology: General Issues

EH:

Background

1) Ontobank -- moving along well, identifying senses that can be tagged with high IAA, start with PropBank senses; nouns (at NYU): same thing; aim: thousands of words by 2005

Ontobank: driven by lexica; IAMTC: driven by texts

other difference: of course, multilingual

Idea: as Omega changes, automatically update IAMTC annotations

Cobuild: based on corpus studies

We can use Ontobank experience, but we bring in multilingual experience

Omega aims for 60,000 concepts

ISI: Nick White, doing Omega updating, based on Penn PropBank data, stat data, IAMTC feedback

Publications

Workshops at ACL

- Graeme and Sergei on text meaning (Eduard presents on panel)

- informal meetings with other projects

- multiword expression: no one going

- SENSEVAL: no one going!

we need to differentiate ourselves and make something unique. how? concentrate on IL2 issues

multilingual gives us perspective on:

- what is in ontology?

- multiword expressions

- syntactic divergences/paraphrases

- microtheories for different phenomena: scalars, time, epistemic statuses (modality, aspect, ...), etc.

Omega: How to Use with Foreign Language

2 issues

Multilingual WordNets

Should we use them?

Experience: David F used Spanish WN for some workshop

EuroWordNet: 8 languages (inc Eng) and interlingual index

-- Digression: Microtheory: number

options:

- we can create separate WNs and have interlingual index

- create one thing, but keep things different

- create one thing, semanticize

Foreign Language IL1 Annotation Process and Omega Extension proposal:

Do both steps:

A) use multilingual WN, lexicon, or other sense inventory: if present, choose; if not present, do nothing

B) check if in Omega: if present, choose; if not, create bug spec for Omega czar
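
Steps A and B can be sketched as one function. This is only an illustration under assumptions: the dictionary-based lookups, the `choose` callback standing in for the annotator, the bug-spec fields, and the sense and concept identifiers in the example are all hypothetical, and the sketch treats the two steps as independent (the questions below about combining them are left open).

```python
def annotate_il1(word, foreign_senses, omega_candidates, choose, bug_specs):
    """Run step A (foreign sense inventory), then step B (Omega).

    foreign_senses / omega_candidates: {word: [candidate, ...]} lookups.
    choose: callback that picks one candidate from a non-empty list.
    bug_specs: list that collects requests for the Omega czar.
    """
    # Step A: if the sense inventory has entries, choose one; else do nothing.
    senses = foreign_senses.get(word, [])
    foreign_choice = choose(senses) if senses else None

    # Step B: if Omega has candidates, choose one; else file a bug spec.
    concepts = omega_candidates.get(word, [])
    if concepts:
        omega_choice = choose(concepts)
    else:
        omega_choice = None
        bug_specs.append({
            "word": word,
            "foreign_sense": foreign_choice,
            "request": "new Omega node",
        })

    return {"word": word, "foreign_sense": foreign_choice, "omega_concept": omega_choice}


# Illustrative run with invented identifiers; the annotator picks the first option.
specs = []
result = annotate_il1(
    "inmobiliaria",
    foreign_senses={"inmobiliaria": ["es-sense-1"]},
    omega_candidates={},
    choose=lambda options: options[0],
    bug_specs=specs,
)
print(result)
print(specs)
```

Keeping the bug specs in a separate queue matches the proposal that the annotator only writes the spec and the Omega czar enters the node later.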

Implementation question: should list of candidate senses be represented in two steps or in one?

Other issue: should choice of foreign concept entail limitation of Omega concepts if links are present?

Point: using a bilingual dictionary can serve as sense inventory for step A; is this the preferred way? an empirical question, need to determine on the basis of available resources for each language

ISSUE OF CONTENTION: how and whether annotators insert into Omega

proposal: annotator creates spec of new node in Omega, Omega czar enters the new node (later)

crucial question: are foreign WNs indexed into E WN?

Possible objections:

- semanticist: this is just info about a bunch of languages

- linguists: this is kind of interesting, but things are mushed together in unpredictable ways

- computational people: how do we use this?

Need to address these objections; main response: we have annotated text!

Question: How do annotators know that something is not in Omega? How far do you have to search?

Answer: annotators don't search, tool gives them a list of options, and annotators choose

IL1 and IL2 use the same (evolving) ontology

J: foot-leg no thumb: "thumb" and "foot-leg"

E: big toe: "big" and "toe"

Ambiguity/vagueness test: John lives near a bank [fin-inst] and Mary does too [river]

multiword expressions: "hot dog"?

At IL0? IL1? ***********UNRESOLVED************

"hot dog" as one node in IL0

pro:

- segmentation issue: cannot insert words between parts

- morphological restrictions on dependents

- make it easier on annotators

- why not?

contra:

- practical reasons: interface with as many parsers as possible

- we lose info that may be useful: different MWE lists in two languages with identical string sequences

- morphology in "kick the bucket" -- other languages can have same effect in Adj-N combos

- just a notational difference

- morphological restrictions on dependents -- independent of whether it is a MWE or not, irrelevant argument

Arc Labels

Lori presentation

3 groups of cases:

- traditional valency alternations for one verb

- FrameNet-type cases

- "random" variation (scripts etc)

4 groups of approaches:

- PropBank approach

- LCS/localist approach

- FrameNet-type

- "random" variation (scripts etc)

But: need to distinguish annotation procedure from scientific content

Annotation approaches:

- use lexicon

- do not use lexicon

need to decide:

* will corpus be used to study sem role -> synt role mappings?

* model interaction with tense and aspect for inferences?

Possible Solution

OMEGA CZAR ACTIVITY

1) Import a chosen set of FrameNet frames into Omega

2) Omega czar chooses names for all verb meanings that fall under frames

3) Omega czar chooses LCS-style labels for all other motion metaphor concepts

4) For all remaining concepts, use "dresser"/"dressee" style labels

ANNOTATOR ACTIVITY

1) Choose concept

2) If a concept does not have a set of argument labels, annotator does nothing and notifies Omega czar

3) If there is a set of argument labels, map arguments & adjuncts to labels
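
The annotator steps above could be sketched as follows. The function name, the shape of the label table, and the czar queue are assumptions made for illustration; only the "dresser"/"dressee" labels and the concept BUY come from these notes.

```python
def annotate_arguments(concept, chosen_labels, label_sets, czar_queue):
    """Apply annotator steps 2 and 3 to an already chosen concept.

    chosen_labels: {argument_text: label} as picked by the annotator.
    label_sets: {concept: [allowed labels]} maintained by the Omega czar.
    czar_queue: list of concepts reported as lacking argument labels.
    """
    allowed = label_sets.get(concept)
    if not allowed:
        # Step 2: no label set, so annotate nothing and notify the czar.
        czar_queue.append(concept)
        return {}

    # Step 3: accept the mapping only if every label belongs to the set.
    unknown = sorted(set(chosen_labels.values()) - set(allowed))
    if unknown:
        raise ValueError(f"labels not defined for {concept}: {unknown}")
    return dict(chosen_labels)


# Illustrative run; DRESS has labels, BUY does not yet.
label_sets = {"DRESS": ["dresser", "dressee"]}
queue = []
print(annotate_arguments("DRESS", {"Mary": "dresser", "the child": "dressee"},
                         label_sets, queue))
print(annotate_arguments("BUY", {"John": "buyer"}, label_sets, queue))
print(queue)
```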

Q: does concept BUY still exist once we have concept COMMERCIAL-TRANSACTION? At IL2, should the annotator choose a new concept and new arc labels?

We should look at some more examples.


Version 2, Tue 13 Jul 2004 19:00:01 - created Mon 12 Jul 2004 09:48:14
