Boolean query construction is often critical for medical systematic review literature search. To create an effective Boolean query, systematic review researchers typically spend weeks coming up with effective query terms and combinations. One challenge in creating an effective systematic review Boolean query is the selection of effective MeSH terms to include in the query. In our previous work, we created neural MeSH term suggestion methods and compared them to state-of-the-art MeSH term suggestion methods. We found neural MeSH term suggestion methods to be highly effective. In this demonstration, we build upon our previous work by creating (1) a Web-based MeSH term suggestion prototype system that allows users to obtain suggestions from a number of underlying methods and (2) a Python library that implements our own and others' MeSH term suggestion methods and that is aimed at researchers who want to further investigate, create or deploy such methods. We describe the architecture of the Web-based system and how to use it for the MeSH term suggestion task. For the Python library, we describe how it can be used to advance further research and experimentation, and we validate the results of the methods contained in the library on standard datasets. Our Web-based prototype system is available at http://ielab-mesh-suggest.uqcloud.net, while our Python library is at https://github.com/ielab/meshsuggestlib.
translated by Google Translate
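High-quality medical systematic reviews require comprehensive literature searches to ensure that their recommendations and outcomes are sufficiently reliable. Indeed, searching for relevant medical literature is a key phase in constructing systematic reviews and often involves domain experts (medical researchers) and search experts (information specialists) in developing the search queries. Queries in this context are highly complex, are based on Boolean logic, include free-text terms and index terms from standardised terminologies (e.g., the Medical Subject Headings (MeSH) thesaurus), and are difficult to construct. The use of MeSH terms, in particular, has been shown to improve the quality of the search results. However, identifying the correct MeSH terms to include in a query is difficult: information specialists are often unfamiliar with the MeSH database and unsure about the appropriateness of MeSH terms for a query. Naturally, the full value of the MeSH terminology is often not fully exploited. This article investigates methods to suggest MeSH terms based on an initial Boolean query that includes only free-text terms. In this context, we devise lexical methods and methods based on pre-trained language models. These methods promise to automatically identify highly effective MeSH terms for inclusion in a systematic review query. Our study contributes an empirical evaluation of several MeSH term suggestion methods. We further contribute an extensive analysis of the MeSH term suggestions of each method and of how these suggestions affect the effectiveness of Boolean queries.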
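To make the query structure described above concrete, the following is a minimal sketch of how free-text terms and MeSH terms could be combined into one Boolean query string. It assumes PubMed-style field tags (`[tiab]` for title/abstract, `[MeSH Terms]` for index terms); the helper names `build_clause` and `build_query` and the example concepts are hypothetical and are not taken from any of the systems listed here.

```python
def build_clause(free_text_terms, mesh_terms):
    """OR together the synonyms for one concept.

    The [tiab] and [MeSH Terms] field tags follow PubMed's search syntax.
    """
    parts = [f'"{term}"[tiab]' for term in free_text_terms]
    parts += [f'"{term}"[MeSH Terms]' for term in mesh_terms]
    return "(" + " OR ".join(parts) + ")"


def build_query(concepts):
    """AND together one clause per concept."""
    return " AND ".join(build_clause(ft, mesh) for ft, mesh in concepts)


# Hypothetical example concepts: (free-text terms, MeSH terms) per concept.
concepts = [
    (["heart attack"], ["Myocardial Infarction"]),
    (["aspirin"], []),
]
print(build_query(concepts))
```

A MeSH term suggestion method would, under this framing, fill in the second element of each pair given only the first.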
Medical systematic reviews typically require assessing all the documents retrieved by a search. The reason is two-fold: the task aims for "total recall"; and documents retrieved using Boolean search are an unordered set, and thus it is unclear how an assessor could examine only a subset. Screening prioritisation is the process of ranking the (unordered) set of retrieved documents, allowing assessors to begin the downstream processes of systematic review creation earlier, leading to earlier completion of the review, or even to avoid screening the documents ranked least relevant. Screening prioritisation requires highly effective ranking methods. Pre-trained language models are state-of-the-art on many IR tasks but have yet to be applied to systematic review screening prioritisation. In this paper, we apply several pre-trained language models to the systematic review document ranking task, both directly and fine-tuned. An empirical analysis compares the effectiveness of neural methods with that of traditional methods for this task. We also investigate different types of document representations for neural methods and their impact on ranking performance. Our results show that BERT-based rankers outperform the current state-of-the-art screening prioritisation methods. However, BERT rankers and existing methods can actually be complementary, and thus further improvements may be achieved if they are used in conjunction.
When designing a new API for a large project, developers need to make smart design choices so that their code base can grow sustainably. To ensure that new API components are well designed, developers can learn from existing API components. However, the lack of a standardized method for comparing API designs makes this learning process time-consuming and difficult. To address this gap we developed API-Spector, to the best of our knowledge one of the first API-to-API specification recommendation engines. API-Spector retrieves relevant specification components written in OpenAPI (a widely adopted language used to describe web APIs). API-Spector presents several significant contributions, including: (1) novel methods of processing and extracting key information from OpenAPI specifications, (2) innovative feature extraction techniques that are optimized for the highly technical API specification domain, and (3) a novel log-linear probabilistic model that combines multiple signals to retrieve relevant and high-quality OpenAPI specification components given a query specification. We evaluate API-Spector in both quantitative and qualitative tasks and achieve an overall recall@1 of 91.7% and F1 of 56.2%, which surpass baseline performance by 15.4% in recall@1 and 3.2% in F1. Overall, API-Spector will allow developers to retrieve relevant OpenAPI specification components from a public or internal database in the early stages of the API development cycle, so that they can learn from existing established examples and potentially identify redundancies in their work. It provides the guidance developers need to accelerate the development process and contribute thoughtfully designed APIs that promote code maintainability and quality.
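As an illustration of what a log-linear model that combines multiple signals can look like in general, here is a minimal sketch. It is not API-Spector's actual model: the feature names (`text_sim`, `quality`), the weights, and the candidate ids are all hypothetical. Each candidate gets a weighted sum of feature values, and a softmax over the candidates turns those sums into a probability distribution.

```python
import math


def log_linear_scores(candidates, weights):
    """Return P(candidate | query) under a generic log-linear model.

    candidates: dict of candidate id -> dict of feature name -> value
    weights:    dict of feature name -> weight (hypothetical values)
    """
    logits = {
        cid: sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for cid, feats in candidates.items()
    }
    max_logit = max(logits.values())  # subtract the max for numerical stability
    exp_scores = {cid: math.exp(z - max_logit) for cid, z in logits.items()}
    total = sum(exp_scores.values())
    return {cid: score / total for cid, score in exp_scores.items()}


def rank(candidates, weights):
    """Order candidate ids from most to least probable."""
    probs = log_linear_scores(candidates, weights)
    return sorted(probs, key=probs.get, reverse=True)


# Hypothetical signals for three candidate specification components.
weights = {"text_sim": 2.0, "quality": 1.0}
candidates = {
    "spec_a": {"text_sim": 0.9, "quality": 0.2},
    "spec_b": {"text_sim": 0.4, "quality": 0.9},
    "spec_c": {"text_sim": 0.1, "quality": 0.1},
}
print(rank(candidates, weights))
```

In a real system the weights would be learned from relevance judgments rather than set by hand.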
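Using computational notebooks (e.g., Jupyter Notebook), data scientists rationalize their exploratory data analysis (EDA) based on their prior experience and external knowledge such as online examples. For novices or data scientists who lack specific knowledge about the dataset or problem at hand, effectively obtaining and understanding external information is critical to carrying out EDA. This paper presents EDAssistant, a JupyterLab extension that supports EDA with in-situ search of example notebooks and recommendation of useful APIs, powered by a novel interactive visualization of search results. The code search and recommendation are enabled by state-of-the-art machine learning models trained on a large corpus of EDA notebooks collected online. A user study is conducted to investigate both EDAssistant and data scientists' current practice (i.e., using external search engines). The results demonstrate the effectiveness and usefulness of EDAssistant, and participants appreciated its smooth and in-context support of EDA. We also report several design implications regarding code recommendation tools.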
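The vast number of scientific publications presents a growing challenge: finding those that are relevant to a given research question and making informed decisions on their basis. This becomes extremely difficult without the use of automated tools. One possible area for improvement is the automatic classification of publication abstracts according to their topic. This work introduces a novel, knowledge-base-oriented publication classifier. The proposed method focuses on achieving scalability and easy adaptability to other domains. Classification speed and accuracy are shown to be satisfactory in the very demanding field of food safety. Further development and evaluation of the method are needed, as the proposed approach shows great potential.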
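When medical researchers conduct a systematic review (SR), screening studies is the most time-consuming process: researchers read several thousand medical articles and manually label them as relevant or irrelevant. Screening prioritization (i.e., document ranking) is an approach that assists researchers by providing document rankings in which relevant documents are ranked higher than irrelevant ones. Seed-driven document ranking (SDR) uses a known relevant document (i.e., a seed) as a query and generates such rankings. Previous work on SDR tried to identify different term weights in a query document and used them in a retrieval model to compute ranking scores. Alternatively, we formulate the SDR task as finding documents similar to a query document and produce rankings based on similarity scores. We propose a document matching measure named Mirror Matching, which calculates matching scores between medical abstract texts by incorporating common writing patterns, such as background, method, result, and conclusion. We conduct experiments on the CLEF 2019 eHealth Task 2 TAR dataset, and the empirical results show that this simple approach achieves higher performance than traditional and neural retrieval models on average precision and precision-focused metrics.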
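Over the past decades, researchers have put a great deal of effort into investigating ranking techniques, used either to rank query results retrieved during information retrieval or to rank recommended products in recommender systems. In this project, we aim to investigate searching, ranking, and recommendation techniques to help realize a university academic search platform. Unlike the usual information retrieval scenarios, where plenty of ground-truth ranking data is available, in our case we have only limited ground-truth knowledge about the academic ranking. For instance, for some search queries we know only a few researchers who are highly relevant and should therefore be ranked at the top, and for some other search queries we do not know at all which researchers should be ranked at the top. The limited amount of ground-truth data makes some conventional ranking techniques and evaluation metrics infeasible, which is a major challenge we faced during this project. The project can greatly enhance the user's academic search experience and helps to realize an academic search platform that covers researchers, publications, and fields of study, which will benefit not only university faculty but also students' research experience.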
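With the growing demand for intelligent systems able to operate in different user contexts (e.g., users on the move), the correct interpretation of the user need by such systems has become crucial for giving a consistent answer to the user query. The most effective techniques used to address this task come from the fields of natural language processing and semantic expansion of terms. Such systems aim to estimate the actual meaning of input queries by addressing the concepts behind the words expressed in the user's question. The aim of this paper is to demonstrate which semantic relations have the greatest impact in retrieval systems based on semantic expansion, and to identify the best trade-off between accuracy and noise introduction when combining such relations. The evaluation is carried out by building a simple natural language processing system capable of querying any taxonomy-driven domain, making use of a combination of different semantic expansions as knowledge resources. The proposed evaluation employs a wide and varied taxonomy as a use case, exploiting its labels as the basis for the expansions. To build the knowledge resources, several corpora have been produced and integrated into the NLP infrastructure with the purpose of estimating the pseudo-queries corresponding to the taxonomy labels, which are considered as the possible intents.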
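Knowing how to construct text-based Search Queries (SQs) for use in Search Engines (SEs) such as Google or Wikipedia has become a fundamental skill. Although a large amount of data is available through such SEs, most structured datasets live outside their scope. Visualization tools help with this limitation, but no such tool comes close to the sheer amount of information available through general-purpose SEs. To fill this gap, this paper presents Q4EDA, a novel framework that converts visual selection queries executed by users on visual representations of time series into valid and stable SQs that can be used in general-purpose SEs, along with suggestions of related information. The usefulness of Q4EDA is presented and validated by users through an application that links a replica of Gapminder's line chart with an SE populated with Wikipedia documents, showing how Q4EDA supports and enhances the exploratory analysis of United Nations world indicators. Despite some limitations, Q4EDA is unique in its proposal and represents a real advance towards solutions for querying textual information based on user interactions with visual representations.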
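Machine Learning on Source Code (MLOnCode) is a popular research field that has been driven by the availability of large-scale code repositories and the development of powerful probabilistic and deep learning models for mining source code. Code-to-code recommendation is a task in MLOnCode that aims to recommend relevant, diverse and concise code snippets that usefully extend the code currently being written by a developer in their development environment (IDE). Code-to-code recommendation engines hold the promise of increasing developer productivity by reducing switching away from the IDE and increasing code reuse. Existing code-to-code recommendation engines do not scale gracefully to large codebases, exhibiting a linear growth in query time as the code repository increases in size. In addition, existing code-to-code recommendation engines fail to account for the global statistics of code repositories in the ranking function, such as the distribution of code snippet lengths, leading to sub-optimal retrieval results. We address both of these weaknesses with Senatus, a new code-to-code recommendation engine. At the core of Senatus is De-Skew LSH, a new locality-sensitive hashing (LSH) algorithm that indexes the data for fast (sub-linear time) retrieval while also counteracting the skewness in the snippet length distribution using novel abstract-syntax-tree-based feature scoring and selection algorithms. We evaluate Senatus via automatic evaluation and with an expert developer user study and find its recommendations to be of higher quality than those of competing baselines, while achieving faster search. For example, on the CodeSearchNet dataset we show that Senatus improves performance by 6.7% F1 and is 16x faster in query time compared to Facebook Aroma on the task of code-to-code recommendation.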
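For readers unfamiliar with locality-sensitive hashing, the following is a minimal sketch of plain MinHash LSH with banding, the textbook technique that makes sub-linear candidate lookup possible. It is not De-Skew LSH and does not include any length-skew correction or AST-based feature selection; the class name, the parameter values (`NUM_HASHES`, `BANDS`), and the whitespace-token features are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

NUM_HASHES = 32                # length of each MinHash signature (assumed value)
BANDS = 8                      # number of bands the signature is split into
ROWS = NUM_HASHES // BANDS     # signature rows per band


def _hash(token, seed):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.md5(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(tokens):
    """One minimum hash value per seed over the set of tokens."""
    token_set = set(tokens)
    return tuple(min(_hash(t, seed) for t in token_set) for seed in range(NUM_HASHES))


class MinHashLSHIndex:
    """Snippets whose signatures agree on at least one band share a bucket."""

    def __init__(self):
        self.buckets = defaultdict(set)

    def _band_keys(self, signature):
        for b in range(BANDS):
            yield (b, signature[b * ROWS:(b + 1) * ROWS])

    def add(self, snippet_id, tokens):
        for key in self._band_keys(minhash_signature(tokens)):
            self.buckets[key].add(snippet_id)

    def query(self, tokens):
        """Return candidate ids by bucket lookup, without scanning the corpus."""
        candidates = set()
        for key in self._band_keys(minhash_signature(tokens)):
            candidates |= self.buckets.get(key, set())
        return candidates


index = MinHashLSHIndex()
index.add("snippet_a", "def add x y return sum".split())
index.add("snippet_b", "while queue pop node visit".split())
print(index.query("def add x y return sum".split()))
```

A query touches only `BANDS` buckets regardless of corpus size, which is where the sub-linear lookup comes from; candidates would normally be re-ranked with an exact similarity measure afterwards.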
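The objective of automated Question Answering (QA) systems is to provide answers to user queries in a time-efficient manner. The answers are usually found either in databases (or knowledge bases) or in a collection of documents commonly referred to as the corpus. Over the past few decades there has been a proliferation of acquired knowledge, and consequently an exponential growth in new scientific articles in the field of biomedicine. It has therefore become difficult to keep track of all the information in the domain, even for domain experts. With the improvements in commercial search engines, users can type in their queries and get a small set of the most relevant documents and, in some cases, relevant snippets from those documents. However, it may still be tedious and time-consuming to look for the required information or answers manually. This has necessitated the development of efficient QA systems that aim to provide exact and precise answers to natural language questions posed by users in the biomedical domain. In this paper, we introduce the basic methodologies used for developing general-domain QA systems, followed by a thorough investigation of different aspects of biomedical QA systems, including benchmark datasets and several proposed approaches that use structured databases and text collections. We also examine the limitations of current systems and explore potential avenues for further advancement.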
Logic Mill is a scalable and openly accessible software system that identifies semantically similar documents within either one domain-specific corpus or multi-domain corpora. It uses advanced Natural Language Processing (NLP) techniques to generate numerical representations of documents. Currently it leverages a large pre-trained language model to generate these document representations. The system focuses on scientific publications and patent documents and contains more than 200 million documents. It is easily accessible via a simple Application Programming Interface (API) or via a web interface. Moreover, it is continuously being updated and can be extended to text corpora from other domains. We see this system as a general-purpose tool for future research applications in the social sciences and other domains.
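Collecting API examples, usages, and mentions relevant to a specific API method from discussions on venues such as Stack Overflow is not a trivial problem. It requires effort to correctly recognize whether a discussion refers to the API method that developers or tools are searching for. The content of a thread, which consists of both text paragraphs describing the involvement of the API method in the discussion and code snippets containing the API invocation, may refer to the given API method. Leveraging this observation, we develop FACOS, a context-specific algorithm that captures the semantic and syntactic information of the paragraphs and code snippets in a discussion. FACOS combines a syntactic word-based score with a score from a predictive model fine-tuned from CodeBERT. FACOS beats the state-of-the-art approach by 13.9% in terms of F1-score.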
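In recent years, due to the high availability of electronic documents on the Web, plagiarism has become a serious challenge, especially among scholars. Various plagiarism detection systems have been developed to prevent text re-use and to confront plagiarism. Although detecting duplicate text in academic manuscripts is fairly easy, finding patterns of text re-use that have been semantically changed is of great importance. Another important issue is dealing with less-resourced languages, for which little text is available for training purposes and the tools for NLP applications perform poorly. In this paper, we introduce Hamtajoo, a Persian plagiarism detection system for academic manuscripts. Moreover, we describe the overall structure of the system along with the algorithms used in each stage. To evaluate the performance of the proposed system, we used a plagiarism detection corpus that complies with the PAN standards.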
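Recently, several dense retrieval (DR) models have demonstrated performance competitive with the term-based retrieval that is ubiquitous in search systems. In contrast to term-based matching, DR projects queries and documents into a dense vector space and retrieves results via (approximate) nearest neighbor search. Deploying a new system, such as DR, inevitably involves trade-offs in aspects of its performance. Established retrieval systems are usually well understood in terms of effectiveness and costs, such as query latency, indexing throughput, or storage requirements. In this work, we propose a framework with a set of criteria that go beyond simple effectiveness measures to thoroughly compare two retrieval systems, with the explicit goal of assessing the readiness of one system to replace the other. This includes careful trade-off considerations between effectiveness and various cost factors. Furthermore, we describe guardrail criteria, since even a system that is better on average may have systematic failures on a minority of queries. The guardrails check for failures on certain query characteristics and for novel failure types that are only possible in dense retrieval systems. We demonstrate our decision framework on a Web ranking scenario. In that scenario, state-of-the-art DR models have surprisingly strong results, not only in average performance but also in passing an extensive set of guardrail tests, showing robustness on different query characteristics, lexical matching, generalization, and number of regressions. It is impossible to predict whether DR will become ubiquitous in the future, but one way this could happen is through repeated applications of decision processes such as the one presented here.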
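The retrieval step that dense retrieval relies on can be sketched in a few lines. This is a minimal illustration that assumes the query and documents have already been encoded as vectors; the two-dimensional vectors and document ids below are made up, and the search is exact brute force rather than the approximate nearest neighbor search a deployed system would typically use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]


# Hypothetical, hand-written embeddings; a real system would use a learned encoder.
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.7, 0.7], "d3": [0.0, 1.0]}
print(nearest_neighbours([0.9, 0.1], doc_vecs, k=2))
```

The brute-force scan costs time linear in the number of documents, which is one of the cost factors (query latency, storage) that a comparison against term-based retrieval would need to weigh.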
The number of scientific publications continues to rise exponentially, especially in Computer Science (CS). However, current solutions to analyze those publications restrict access behind a paywall, offer no features for visual analysis, limit access to their data, only focus on niches or sub-fields, and/or are not flexible and modular enough to be transferred to other datasets. In this thesis, we conduct a scientometric analysis to uncover the implicit patterns hidden in CS metadata and to determine the state of CS research. Specifically, we investigate trends of the quantity, impact, and topics for authors, venues, document types (conferences vs. journals), and fields of study (compared to, e.g., medicine). To achieve this we introduce the CS-Insights system, an interactive web application to analyze CS publications with various dashboards, filters, and visualizations. The data underlying this system is the DBLP Discovery Dataset (D3), which contains metadata from 5 million CS publications. Both D3 and CS-Insights are open-access, and CS-Insights can be easily adapted to other datasets in the future. The most interesting findings of our scientometric analysis include that i) there has been a stark increase in publications, authors, and venues in the last two decades, ii) many authors only recently joined the field, iii) the most cited authors and venues focus on computer vision and pattern recognition, while the most productive prefer engineering-related topics, iv) the preference of researchers to publish in conferences over journals dwindles, v) on average, journal articles receive twice as many citations compared to conference papers, but the contrast is much smaller for the most cited conferences and journals, and vi) journals also get more citations in all other investigated fields of study, while only CS and engineering publish more in conferences than journals.
The text-to-image model Stable Diffusion has recently become very popular. Only weeks after its open source release, millions are experimenting with image generation. This is due to its ease of use, since all it takes is a brief description of the desired image to "prompt" the generative model. Rarely do the images generated for a new prompt immediately meet the user's expectations. Usually, an iterative refinement of the prompt ("prompt engineering") is necessary for satisfying images. As a new perspective, we recast image prompt engineering as interactive image retrieval - on an "infinite index". Thereby, a prompt corresponds to a query and prompt engineering to query refinement. Selected image-prompt pairs allow direct relevance feedback, as the model can modify an image for the refined prompt. This is a form of one-sided interactive retrieval, where the initiative is on the user side, whereas the server side remains stateless. In light of an extensive literature review, we develop these parallels in detail and apply the findings to a case study of a creative search task on such a model. We note that the uncertainty in searching an infinite index is virtually never-ending. We also discuss future research opportunities related to retrieval models specialized for generative models and interactive generative image retrieval. The application of IR technology, such as query reformulation and relevance feedback, will contribute to improved workflows when using generative models, while the notion of an infinite index raises new challenges in IR research.
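Two key assumptions shape the usual view of ranked retrieval: (1) that the searcher can choose words for their query that might appear in the documents they wish to see, and (2) that ranking the retrieved documents will suffice because the searcher will be able to recognize those they wished to find. When the documents to be searched are in a language not known by the searcher, neither assumption holds. In such cases, Cross-Language Information Retrieval (CLIR) is needed. This chapter reviews the state of the art in cross-language information retrieval and outlines some open research questions.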
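Social media have the potential to provide timely information about emergency situations and sudden events. However, finding relevant information among the millions of posts published every day can be difficult, and developing a data analysis project usually requires time and technical skills. This study presents an approach that provides flexible support for analyzing social media, particularly during emergencies. Different use cases in which social media analysis can be adopted are introduced, and the challenges of retrieving information from large sets of posts are discussed. The focus is on analyzing the images and text contained in social media posts, and on a set of automatic data processing tools for filtering and classifying content, using a human-in-the-loop approach to support the data analyst. Such support includes feedback and suggestions for configuring the automated tools, as well as crowdsourcing to gather input from citizens. The results are validated by discussing three case studies developed within the Crowd4SDG H2020 European project.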