Large language models have shown impressive performance on many natural language processing (NLP) tasks in a zero-shot setting. We ask whether these models exhibit commonsense, a key component of NLP applications, by evaluating them on four commonsense benchmarks. We find that the impressive zero-shot performance of large language models is largely due to dataset bias in our benchmarks. We also show that zero-shot performance is sensitive to hyperparameters and to the similarity of the benchmark to the pretraining dataset. Moreover, we observe no substantial improvement when the models are evaluated in a few-shot setting. Finally, in contrast to previous work, we find that leveraging explicit commonsense knowledge does not yield a significant improvement.
After just a few hundred training updates, a standard probabilistic model for language generation has likely not yet learnt many semantic or syntactic rules of natural language, which inherently makes it difficult to estimate the right probability distribution over next tokens. Yet around this point, these models have identified a simple, loss-minimising behaviour: to output the unigram distribution of the target training corpus. The use of such a crude heuristic raises the question: Rather than wasting precious compute resources and model capacity for learning this strategy at early training stages, can we initialise our models with this behaviour? Here, we show that we can effectively endow our model with a separate module that reflects unigram frequency statistics as prior knowledge. Standard neural language generation architectures offer a natural opportunity for implementing this idea: by initialising the bias term in a model's final linear layer with the log-unigram distribution. Experiments in neural machine translation demonstrate that this simple technique: (i) improves learning efficiency; (ii) achieves better overall performance; and (iii) appears to disentangle strong frequency effects, encouraging the model to specialise in non-frequency-related aspects of language.
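The trick lends itself to a compact implementation. Below is a minimal PyTorch sketch, not the authors' code: the helper name, vocabulary size, and token counts are invented for illustration, and add-one smoothing is assumed so that unseen tokens receive a finite log-probability.

```python
import torch
import torch.nn as nn

def init_output_bias_with_log_unigram(linear: nn.Linear,
                                      token_counts: torch.Tensor,
                                      smoothing: float = 1.0) -> None:
    """Initialise the bias of the final projection layer with the
    log-unigram distribution of the target training corpus.

    token_counts: raw corpus counts per vocabulary id, shape (vocab_size,).
    smoothing: add-k smoothing so unseen tokens get a finite log-probability.
    (Hypothetical helper; names and defaults are illustrative.)
    """
    counts = token_counts.float() + smoothing
    log_unigram = torch.log(counts / counts.sum())  # log p(token)
    with torch.no_grad():
        linear.bias.copy_(log_unigram)

# Toy usage with a made-up vocabulary and counts.
vocab_size, hidden = 8, 16
output_layer = nn.Linear(hidden, vocab_size)  # final layer before softmax
counts = torch.tensor([500, 120, 60, 30, 10, 5, 2, 0])
init_output_bias_with_log_unigram(output_layer, counts)

# With a zero input (i.e. the network body contributing nothing yet),
# the softmax output equals the smoothed corpus unigram distribution.
probs = torch.softmax(output_layer(torch.zeros(hidden)), dim=-1)
```

Because the logits reduce to the bias when the rest of the network contributes nothing, the model starts out predicting the unigram distribution, which is precisely the early-training behaviour the abstract describes.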
Transfer fees for professional athletes have become astronomical, because signing players with great future value is crucial to a club's survival. We present a case study, based on FIFA data analysis, of the factors that influence the transfer fees of the world's top footballers. To predict each player's market value, we propose an improved LightGBM model whose hyperparameters are optimized with the tree-structured Parzen estimator (TPE) algorithm. We identify the salient features with the SHapley Additive exPlanations (SHAP) algorithm. The proposed method is compared with baseline regression models (linear regression, lasso, elastic net, kernel ridge regression) and with gradient-boosting models without hyperparameter optimization. Compared with the regression baselines, GBDT, and the unoptimized LightGBM model, the optimized LightGBM model is on average about 3.8, 1.4, and 1.8 times more accurate, respectively. Our model also offers interpretability, identifying the attributes that football clubs should consider when recruiting players in the future.
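As a rough illustration of the pipeline, the sketch below pairs Optuna's TPESampler (one common TPE implementation; the paper's exact tooling and search ranges are not given here) with a LightGBM regressor and SHAP feature attributions, on synthetic stand-in data rather than the FIFA dataset.

```python
import lightgbm as lgb
import numpy as np
import optuna
import shap
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score, train_test_split

# Stand-in data; the paper uses FIFA player attributes and market values.
X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def objective(trial: optuna.Trial) -> float:
    # Illustrative search space; the paper's exact ranges may differ.
    params = {
        "num_leaves": trial.suggest_int("num_leaves", 8, 256),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
    }
    model = lgb.LGBMRegressor(**params, random_state=0)
    # Negative MSE from cross-validation; Optuna maximizes this objective.
    return cross_val_score(model, X_train, y_train,
                           scoring="neg_mean_squared_error", cv=3).mean()

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)

best = lgb.LGBMRegressor(**study.best_params, random_state=0).fit(X_train, y_train)

# SHAP values attribute each prediction to individual features.
explainer = shap.TreeExplainer(best)
shap_values = explainer.shap_values(X_test)
print(np.abs(shap_values).mean(axis=0))  # mean |SHAP| as global feature importance
```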
We introduce Transformer Grammars (TGs), a novel class of Transformer language models that combine (i) the expressive power, scalability, and strong performance of Transformers and (ii) recursive syntactic compositions, which here are implemented through a special attention mask and deterministic transformation of the linearized tree. We find that TGs outperform various strong baselines on sentence-level language modeling perplexity, as well as on multiple syntax-sensitive language modeling evaluation metrics. Additionally, we find that the recursive syntactic composition bottleneck which represents each sentence as a single vector harms perplexity on document-level language modeling, providing evidence that a different kind of memory mechanism -- one that is independent of composed syntactic representations -- plays an important role in current successful models of long text.
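To give a flavour of how a linearized tree can induce such an attention mask, here is a deliberately simplified NumPy sketch. It is not the paper's scheme: among other things, TGs duplicate closing nonterminals so that composition and onward generation use separate positions, which this toy version omits, and the token conventions are invented.

```python
import numpy as np

def tg_style_mask(tokens):
    """Build a simplified composition-style attention mask over a
    linearized tree such as ["(S", "(NP", "the", "dog", "NP)", ...].

    mask[i, j] = True means position i may attend to position j.
    A loose sketch of the idea, not the paper's exact masking scheme.
    """
    n = len(tokens)
    mask = np.zeros((n, n), dtype=bool)
    closed = np.zeros(n, dtype=bool)  # True once inside a composed subtree
    stack = []                        # positions of open nonterminals
    for i, tok in enumerate(tokens):
        if tok.endswith(")") and stack:
            # COMPOSE: the closing token attends to the opening token and
            # the subtree's direct children (terminals or composed vectors).
            o = stack.pop()
            mask[i, o:i + 1] = ~closed[o:i + 1]
            # Hide the subtree internals from all later positions, so the
            # subtree is represented to the future by this single position.
            closed[o:i] = True
        else:
            # Causal attention that skips already-composed internals.
            mask[i, : i + 1] = ~closed[: i + 1]
            if tok.startswith("("):
                stack.append(i)
    return mask

tokens = ["(S", "(NP", "the", "dog", "NP)", "(VP", "barks", "VP)", "S)"]
print(tg_style_mask(tokens).astype(int))
```

In this toy version, once "NP)" composes its subtree, later tokens such as "barks" can attend to the composed position but not to "the" or "dog" individually, which is the single-vector bottleneck the abstract refers to.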
translated by 谷歌翻译
We propose unwinding the rotations experienced by the user of an immersive telepresence robot to improve comfort and reduce VR sickness. By immersive telepresence we refer to a setup in which a 360° camera on top of a mobile robot streams video and audio into a head-mounted display worn by a remote user, enabling the user to feel present at the robot's location, look around by turning their head, and communicate with people near the robot. By unwinding the rotations of the camera frame, the user's viewpoint does not change when the robot rotates; the user can change the viewpoint only by physically rotating in their local setting. Since visual rotation without corresponding vestibular stimulation is a major source of VR sickness, the user's physical rotation is expected to reduce it. We implemented the unwinding of rotations for a simulated robot traversing a virtual environment and conducted a user study (N=34) comparing unwound rotations with a baseline in which the viewpoint turns whenever the robot turns. Our results show that users found unwound rotations preferable and more comfortable, and that they reduced users' VR sickness levels. We further present findings on the users' path integration capabilities and viewing directions, along with subjective observations about the robot's speed and its distances to simulated people and objects.
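A minimal sketch of the frame arithmetic behind unwinding, using SciPy rotations; the function name and frame conventions are illustrative assumptions, not the authors' implementation.

```python
from scipy.spatial.transform import Rotation as R

def unwound_view(robot_orientation: R, hmd_orientation: R) -> R:
    """Compose the rotation applied to the 360-degree video sphere so that
    robot rotations are 'unwound' and do not change the user's viewpoint.

    robot_orientation: robot base orientation in world coordinates.
    hmd_orientation: the user's head orientation in their local frame.
    Pre-multiplying by the robot's inverse orientation cancels its turns,
    so only physical head rotation moves the viewpoint.
    (Hypothetical helper; frame conventions are assumptions.)
    """
    return robot_orientation.inv() * hmd_orientation

# Example: the robot has turned 90 degrees left while the user's head is
# still. The unwound view keeps looking in the user's original direction.
robot = R.from_euler("z", 90, degrees=True)
head = R.identity()
print(unwound_view(robot, head).as_euler("zyx", degrees=True))  # yaw -90 cancels the turn
```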