Today when many practitioners run basic NLP on the entire web and large-volume traffic, faster methods are paramount to saving time and energy costs. Recent advances in GPU hardware have led to the emergence of bi-directional LSTMs as a standard method for obtaining per-token vector representations serving as input to labeling tasks such as NER (often followed by prediction in a linear-chain CRF). Though expressive and accurate, these models fail to fully exploit GPU parallelism, limiting their computational efficiency. This paper proposes a faster alternative to Bi-LSTMs for NER: Iterated Dilated Convolutional Neural Networks (ID-CNNs), which have better capacity than traditional CNNs for large context and structured prediction. Unlike LSTMs whose sequential processing on sentences of length N requires O(N) time even in the face of parallelism, ID-CNNs permit fixed-depth convolutions to run in parallel across entire documents. We describe a distinct combination of network structure, parameter sharing and training procedures that enable dramatic 14-20x test-time speedups while retaining accuracy comparable to the Bi-LSTM-CRF. Moreover, ID-CNNs trained to aggregate context from the entire document are even more accurate while maintaining 8x faster test time speeds.
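The dilated convolutions underlying ID-CNNs can be sketched as follows: by doubling the dilation rate at each layer, a fixed-depth stack covers a receptive field that grows exponentially with depth, and every output position can be computed in parallel. This is a minimal NumPy illustration of the mechanism only; the kernel width, dilation schedule, and dimensions are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """One dilated 1-D convolution layer with 'same' padding and ReLU.

    x: (seq_len, d) per-token input vectors
    w: (k, d, d) kernel of width k (k assumed odd)
    dilation: gap between the k sampled positions
    """
    seq_len, d = x.shape
    k = w.shape[0]
    pad = dilation * (k // 2)
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    # Each output position t depends only on k dilated taps of the
    # input, so all positions could be computed in parallel on a GPU.
    for t in range(seq_len):
        for j in range(k):
            out[t] += xp[t + j * dilation] @ w[j]
    return np.maximum(out, 0.0)

# A fixed-depth stack with dilations 1, 2, 4: with kernel width 3,
# each token's receptive field spans 15 tokens after only 3 layers.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 8))          # 20 tokens, 8-dim vectors
weights = [0.1 * rng.normal(size=(3, 8, 8)) for _ in range(3)]
h = x
for layer, w in enumerate(weights):
    h = dilated_conv1d(h, w, dilation=2 ** layer)
print(h.shape)  # (20, 8): one vector per token, ready for tag prediction
```

The key contrast with a Bi-LSTM is that no step here depends on the previous token's output, so the depth (and hence latency) stays constant regardless of sequence length.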