The role of semantics, pre-emption and skew in linguistic distributions: the case of the un-construction.
Blog Article
We use the Google Ngram database, a corpus of 5,195,769 digitized books containing ~4% of all books ever published, to test three ideas that are hypothesized to account for linguistic generalizations: verbal semantics, pre-emption and skew. Using 828,813 tokens of un-forms as a test case for these mechanisms, we found that verbal semantics was a good predictor of the frequency of un-forms in the English language over the past 200 years, both in terms of how their frequency changed over time and in terms of their rank frequency. We did not find strong evidence for direct competition between un-forms and their top pre-emptors. However, the skew of the un-construction's competitors was inversely correlated with the acceptability of the un-form.
We suggest a cognitive explanation for this: the more skewed the set of relevant pre-emptors is, the more easily it is retrieved from memory. This suggests that it is not just the frequency of pre-emptive forms that must be taken into account when trying to explain usage patterns, but their skew as well.
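
To make the notion of skew concrete, here is a minimal sketch of one way it could be quantified for the set of competitors of a single un-form. The abstract does not say which skew measure was used, so the Fisher-Pearson moment coefficient below is an assumption, and the function name `sample_skewness` and the token counts are hypothetical values chosen only for illustration.

```python
def sample_skewness(counts):
    """Fisher-Pearson moment coefficient of skewness (g1) of a list of counts.

    Assumption: this is one common skew measure, not necessarily the one
    used in the study.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(counts) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((c - mean) ** 2 for c in counts) / n
    m3 = sum((c - mean) ** 3 for c in counts) / n
    if m2 == 0:
        # All counts are identical: no asymmetry to measure.
        return 0.0
    return m3 / m2 ** 1.5


# Hypothetical token counts for the competitors of two un-forms.
dominated = [9000, 300, 200, 100]   # one competitor carries most of the usage
even = [2400, 2400, 2400, 2400]     # usage spread equally across competitors

print(sample_skewness(dominated))
print(sample_skewness(even))
```

Under this measure, a competitor set dominated by a single high-frequency form scores high, while an evenly spread set scores zero. Both sets above contain the same total number of tokens, which illustrates why skew carries information that raw frequency alone does not.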