Predictability Effects on Optional Word Omission
T. Florian Jaeger
Stanford University
Ongoing joint work with Roger Levy, Thomas Wasow, and Dave Orr
ABSTRACT
The predictability of a word has been shown to correlate positively with its phonetic reduction (Bell et al., 2003). Surprisingly, little work has been done on the correlation between predictability and word omission (i.e., variation in which an entire word can be omitted without leading to ungrammaticality; for an exception, see, e.g., Resnik, 1996, on English argument drop). I present a corpus study of predictability effects on one such phenomenon, relativizer omission in non-subject-extracted relative clauses (NSRCs):
(1) I mean the thing [(that) you spray __, you know, out in the field].
Separate sets of multi-factor binary logistic regression analyses were performed on 3,700 NSRCs from speech (Switchboard) and 2,400 NSRCs from written data (Wall Street Journal). For all NSRCs, we calculated the predictability of the NSRC given the head noun it modified (e.g. "thing" in (1)) and the predictability of the NSRC subject (e.g. "you" in (1)) given the head noun. In both spoken and written data, an NSRC's predictability was a significant predictor of relativizer omission. An NSRC subject's predictability was a significant factor, too. We then tested whether the effects would remain significant after controlling for other factors known to affect relativizer omission (taken from Race and MacDonald, 2003; Jaeger et al., 2005). Such factors fall into two groups: (a) effects that are accounted for by existing processing theories and (b) effects that so far have only been described but are not explained by any existing theory. We found that both of the above-mentioned predictability measures affect relativizer omission above and beyond effects of type (a). Furthermore, we found that predictability seems to subsume some, but by far not all, of the effects of type (b), thereby offering a uniform predictability-based explanation for some of the effects unaccounted for by existing models of word omission.
So far, most of our analyses are based on simple bigrams (i.e. P(NSRC | head noun) and P(SBJNSRC | head noun); by head noun, I mean the noun modified by the NSRC, e.g. "thing" in (1)), but I will also present preliminary data from maximum entropy classifiers that we used to estimate NSRCs' predictability given the grammatical function of the modified NP, as well as the NP's determiner, adjective, and head noun.
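As a rough illustration of the bigram measure described above, P(NSRC | head noun) can be estimated as a relative frequency: the number of times a head noun is modified by an NSRC divided by the number of times the head noun occurs. The sketch below is only one plausible way to compute such an estimate, not the procedure used in the study; the function name, the input format, and the toy observations are all assumptions made for illustration.

```python
from collections import Counter


def conditional_probability(observations):
    """Estimate P(NSRC | head noun) by relative frequency.

    `observations` is an iterable of (head_noun, has_nsrc) pairs, one per
    occurrence of a noun; this input format is a hypothetical choice.
    """
    totals = Counter()  # how often each head noun occurs
    hits = Counter()    # how often it is modified by an NSRC
    for head_noun, has_nsrc in observations:
        totals[head_noun] += 1
        if has_nsrc:
            hits[head_noun] += 1
    return {noun: hits[noun] / totals[noun] for noun in totals}


# Invented toy data, not drawn from Switchboard or the Wall Street Journal.
toy = [
    ("thing", True), ("thing", True), ("thing", False),
    ("field", False), ("field", True),
    ("way", True),
]
p_nsrc = conditional_probability(toy)
```

The same counting scheme would apply to P(SBJNSRC | head noun) if the second element of each pair instead recorded whether a particular NSRC subject followed the head noun. Unsmoothed relative frequencies are unreliable for rare head nouns, which is one reason a richer estimator might be preferred.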
I will also briefly discuss how the findings relate to existing accounts of sentence production and word omission (Ferreira & Dell, 2000), other word or phrase omission phenomena (Resnik, 1996), as well as to models of word reduction (Jurafsky et al., 2001; Bell et al., 2002, 2003).
=====================================
Florian (that's me) is a 4th-year Ph.D. student in Linguistics at Stanford University. My background lies in Computer Science, Linguistics, and sentence processing. For most of the last year, I have been working on diverse aspects of optional word omission: phonological, syntactic, and processing constraints; social effects; and the questions of whether word omission is subject to audience design and whether its distribution is determined by production complexity. My thesis will focus on the relation between the phonetic reduction of words and the omission of entire words: to what extent is the latter just an extreme case of the former (or is that not at all the case)? Before that and during this year, I have also worked (together with others) on wh-phrase ordering in English, predictability effects on comprehension, phonological and phonetic correlates of focus-marking (distinguishing between semantic and pragmatic theories of focus), prosodic phrasing, as well as on syntactic analyses of information structural constraints on Bulgarian object clitic doubling (aha!). For more information on these projects and others, please visit www.stanford.edu/~tiflo/, or let's meet and talk.
For the colloquium series schedule, see http://www.umiacs.umd.edu/research/CLIP/colloq/. If you are interested in meeting with the speaker, please contact Jimmy Lin <http://www.glue.umd.edu/~jimmylin/>.