Word Predictability is Based on Context - and/or Frequency

Authors

Rodolfo Delmonte¹ and Nicolò Busetto², ¹Ca' Foscari University, Venice (Italy), ²Accenture TTS Computational Linguistics

Abstract

In this paper we present an experiment carried out with BERT on a small number of Italian sentences drawn from two domains, newspapers and poetry, which represent two levels of increasing difficulty for the masked-word prediction task we intended to test. The experiment is built on the hypothesis that predictability becomes progressively harder at the three levels of linguistic complexity we monitor: the lexical, syntactic and semantic level. To test this hypothesis we alternate canonical and non-canonical versions of the same sentence before processing them with the same DL model. The results show that DL models are highly sensitive to the presence of non-canonical structures and to local non-literal compositional effects on meaning. However, DL models are also very sensitive to word frequency, preferentially predicting function words over content words and collocates over infrequent word phrases. To measure differences in performance we created a linguistically based "predictability parameter" which is highly correlated with a cosine-based classification but produces better distinctions between classes.
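As a purely illustrative sketch (not the authors' code), the cosine-based classification mentioned above compares the embedding of the model's predicted word with that of the gold masked word: the closer the cosine similarity is to 1, the better the prediction. The vectors below are hypothetical toy values chosen only to show the computation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 4-dimensional embeddings, for illustration only:
gold = [0.2, 0.7, 0.1, 0.4]                # embedding of the masked (gold) word
pred_canonical = [0.25, 0.65, 0.1, 0.35]   # top prediction for the canonical sentence
pred_noncanonical = [0.8, 0.1, 0.5, 0.0]   # top prediction for the non-canonical version

# A higher score for the canonical version would mirror the paper's finding
# that non-canonical word order degrades predictability.
print(cosine(gold, pred_canonical))
print(cosine(gold, pred_noncanonical))
```

In practice the embeddings would come from a BERT-style model rather than being hand-specified, and the paper's "predictability parameter" adds linguistically motivated distinctions on top of this raw similarity score.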

Keywords

Deep Learning Models, BERT, Masked Word Task, Word Embeddings, Canonical vs Non-canonical Sentence Structures, Frequency Ranking, Dictionary of Wordforms, Linguistic Similarity Measures, Predictability Parameter.

Volume 12, Number 18