Recurrent neural networks (RNNs) have achieved impressive results in a variety of linguistic processing tasks, suggesting that they can induce non-trivial properties of language. We investigate here to what extent RNNs learn to track abstract hierarchical syntactic structure. We test whether RNNs trained with a generic language modeling objective in four languages (Italian, English, Hebrew, Russian) can predict long-distance number agreement in various constructions. We include in our evaluation nonsensical sentences where RNNs cannot rely on semantic or lexical cues ("The colorless green ideas I ate with the chair sleep furiously"), and, for Italian, we compare model performance to human intuitions. Our language-model-trained RNNs make reliable predictions about long-distance agreement, and do not lag much behind human performance. We thus bring support to the hypothesis that RNNs are not just shallow-pattern extractors, but they also acquire deeper grammatical competence.