智能论文笔记

Developing a Knowledge Graph Framework for Pharmacokinetic Natural Product-Drug Interactions

Sanya B. Taneja , Tiffany J. Callahan , Mary F. Paine , Sandra L. Kane-Gill , Halil Kilicoglu , Marcin P. Joachimiak , Richard D. Boyce

分类：人工智能

2022-09-24

当植物天然产物与药物共容纳时，就会发生药代动力学天然产物 - 药物相互作用（NPDIS）。了解NPDI的机制是防止不良事件的关键。我们构建了一个知识图框架NP-KG，作为迈向药代动力学NPDIS的计算发现的一步。 NP-KG是一个具有生物医学本体论，链接数据和科学文献的全文，由表型知识翻译框架和语义关系提取系统，SEMREP和集成网络和动态推理组成的构建的科学文献的全文。通过路径搜索和元路径发现对药代动力学绿茶和kratom-prug相互作用的案例研究评估NP-KG，以确定与地面真实数据相比的一致性和矛盾信息。完全集成的NP-KG由745,512个节点和7,249,576个边缘组成。 NP-KG的评估导致了一致（绿茶的38.98％，kratom的50％），矛盾（绿茶的15.25％，21.43％，Kratom的21.43％），同等和矛盾的（15.25％）（21.43％，21.43％，21.43％ kratom）信息。几种声称的NPDI的潜在药代动力学机制，包括绿茶 - 茶氧化烯，绿茶 - 纳多洛尔，Kratom-Midazolam，Kratom-Quetiapine和Kratom-Venlafaxine相互作用，与已出版的文献一致。 NP-KG是第一个将生物医学本体论与专注于天然产品的科学文献的全文相结合的公斤。我们证明了NP-KG在鉴定涉及酶，转运蛋白和药物的药代动力学相互作用的应用。我们设想NP-KG将有助于改善人机合作，以指导研究人员将来对药代动力学NPDIS进行研究。 NP-KG框架可在https://doi.org/10.5281/zenodo.6814507和https://github.com/sanyabt/np-kg上公开获得。

translated by 谷歌翻译

GraPE: fast and scalable Graph Processing and Embedding

Luca Cappelletti , Tommaso Fontana , Elena Casiraghi , Vida Ravanmehr , Tiffany J. Callahan , Marcin P. Joachimiak , Christopher J. Mungall , Peter N. Robinson , Justin Reese , Giorgio Valentini

分类：机器学习

2021-10-12

图表学习方法为解决图形所代表的复杂的现实世界问题打开了新的可能性。但是，这些应用程序中使用的许多图包括数百万节点和数十亿个边缘，并且超出了当前方法和软件实现的功能。我们提供葡萄，这是一种用于图形处理和表示学习的软件资源，能够通过使用专业和智能数据结构，算法和快速并行实现来通过大图扩展。与最先进的软件资源相比，葡萄显示出经验空间和时间复杂性的数量级的改善，以及边缘预测和节点标签预测性能的实质和统计学上的显着改善。此外，葡萄提供了来自文献和其他来源的80,000多种图，标准化界面允许直接整合第三方库，61个节点嵌入方法，25个推理模型和3个模块化管道，以允许公平且可重复的方法比较以及用于图形处理和嵌入的库。

translated by 谷歌翻译

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

Kaustubh D. Dhole , Varun Gangal , Sebastian Gehrmann , Aadesh Gupta , Zhenhao Li , Saad Mahamood , Abinaya Mahendiran , Simon Mille , Ashish Srivastava , Samson Tan

分类：自然语言处理 | 人工智能 | 机器学习

2021-12-06

数据增强是自然语言处理（NLP）模型的鲁棒性评估的重要组成部分，以及增强他们培训的数据的多样性。在本文中，我们呈现NL-Cogmenter，这是一种新的参与式Python的自然语言增强框架，它支持创建两个转换（对数据的修改）和过滤器（根据特定功能的数据拆分）。我们描述了框架和初始的117个变换和23个过滤器，用于各种自然语言任务。我们通过使用其几个转换来分析流行自然语言模型的鲁棒性来证明NL-Upmenter的功效。基础架构，Datacards和稳健性分析结果在NL-Augmenter存储库上公开可用（\ url {https://github.com/gem-benchmark/nl-augmenter}）。

translated by 谷歌翻译

Protein-Ligand Complex Generator & Drug Screening via Tiered Tensor Transform

Jonathan P. Mailoa , Zhaofeng Ye , Jiezhong Qiu , Chang-Yu Hsieh , Shengyu Zhang

分类：神经与进化计算

2023-01-03

Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.

translated by 谷歌翻译

Posterior Collapse and Latent Variable Non-identifiability

Yixin Wang , David M. Blei , John P. Cunningham

分类： (统计)机器学习 | 机器学习

2023-01-02

Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.

translated by 谷歌翻译

Pseudo-Inverted Bottleneck Convolution for DARTS Search Space

Arash Ahmadian , Yue Fei , Louis S. P. Liu , Konstantinos N. Plataniotis , Mahdi S. Hosseini

分类：机器学习

2022-12-31

Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, there has been little work done on adapting the action space based on state-of-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block intending to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and outperforms a DARTS network with similar size significantly, at layer counts as small as 2. Furthermore, with less layers, not only does it achieve higher accuracy with lower GMACs and parameter count, GradCAM comparisons show that our network is able to better detect distinctive features of target objects compared to DARTS.

translated by 谷歌翻译

Physics-informed Neural Networks approach to solve the Blasius function

Greeshma Krishna , Malavika S Nair , Pramod P Nair , Anil Lal S

分类：机器学习

2022-12-31

Deep learning techniques with neural networks have been used effectively in computational fluid dynamics (CFD) to obtain solutions to nonlinear differential equations. This paper presents a physics-informed neural network (PINN) approach to solve the Blasius function. This method eliminates the process of changing the non-linear differential equation to an initial value problem. Also, it tackles the convergence issue arising in the conventional series solution. It is seen that this method produces results that are at par with the numerical and conventional methods. The solution is extended to the negative axis to show that PINNs capture the singularity of the function at $\eta=-5.69$

translated by 谷歌翻译

Comparative Analysis of Clustering Techniques for Personalized Food Kit Distribution

Jude Francis , Rowan K Baby , Jacob Abraham , Ajmal P. S

分类：机器学习 | (统计)机器学习

2022-12-30

The Government of Kerala had increased the frequency of supply of free food kits owing to the pandemic, however, these items were static and not indicative of the personal preferences of the consumers. This paper conducts a comparative analysis of various clustering techniques on a scaled-down version of a real-world dataset obtained through a conjoint analysis-based survey. Clustering carried out by centroid-based methods such as k means is analyzed and the results are plotted along with SVD, and finally, a conclusion is reached as to which among the two is better. Once the clusters have been formulated, commodities are also decided upon for each cluster. Also, clustering is further enhanced by reassignment, based on a specific cluster loss threshold. Thus, the most efficacious clustering technique for designing a food kit tailored to the needs of individuals is finally obtained.

translated by 谷歌翻译

Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search

Wenqing Zheng , S P Sharan , Zhiwen Fan , Kevin Wang , Yihan Xi , Zhangyang Wang

分类：机器学习 | 人工智能

2022-12-30

Learning efficient and interpretable policies has been a challenging task in reinforcement learning (RL), particularly in the visual RL setting with complex scenes. While neural networks have achieved competitive performance, the resulting policies are often over-parameterized black boxes that are difficult to interpret and deploy efficiently. More recent symbolic RL frameworks have shown that high-level domain-specific programming logic can be designed to handle both policy learning and symbolic planning. However, these approaches rely on coded primitives with little feature learning, and when applied to high-dimensional visual scenes, they can suffer from scalability issues and perform poorly when images have complex object interactions. To address these challenges, we propose \textit{Differentiable Symbolic Expression Search} (DiffSES), a novel symbolic learning approach that discovers discrete symbolic policies using partially differentiable optimization. By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions, while also incorporating the strengths of neural networks for feature learning and optimization. Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more and scalable than state-of-the-art symbolic RL methods, with a reduced amount of symbolic prior knowledge.

translated by 谷歌翻译

Domain-specific transfer learning in the automated scoring of tumor-stroma ratio from histopathological images of colorectal cancer

Liisa Petäinen , Juha P. Väyrynen , Pekka Ruusuvuori , Ilkka Pölönen , Sami Äyrämö , Teijo Kuopio

分类：计算机视觉 | 机器学习

2022-12-30

Tumor-stroma ratio (TSR) is a prognostic factor for many types of solid tumors. In this study, we propose a method for automated estimation of TSR from histopathological images of colorectal cancer. The method is based on convolutional neural networks which were trained to classify colorectal cancer tissue in hematoxylin-eosin stained samples into three classes: stroma, tumor and other. The models were trained using a data set that consists of 1343 whole slide images. Three different training setups were applied with a transfer learning approach using domain-specific data i.e. an external colorectal cancer histopathological data set. The three most accurate models were chosen as a classifier, TSR values were predicted and the results were compared to a visual TSR estimation made by a pathologist. The results suggest that classification accuracy does not improve when domain-specific data are used in the pre-training of the convolutional neural network models in the task at hand. Classification accuracy for stroma, tumor and other reached 96.1$\%$ on an independent test set. Among the three classes the best model gained the highest accuracy (99.3$\%$) for class tumor. When TSR was predicted with the best model, the correlation between the predicted values and values estimated by an experienced pathologist was 0.57. Further research is needed to study associations between computationally predicted TSR values and other clinicopathological factors of colorectal cancer and the overall survival of the patients.

translated by 谷歌翻译