智能论文笔记

Biomarker Gene Identification for Breast Cancer Classification

Sheetal Rajpal , Ankit Rajpal , Manoj Agarwal , Naveen Kumar

分类：机器学习

2021-11-10

背景：乳腺癌被出现为妇女中最普遍的癌症之一，导致高死亡率。由于乳腺癌的异质性质，需要鉴定与乳腺癌亚型相关的差异表达基因，以便及时诊断和治疗。目的：鉴定为其签名的四个乳腺癌亚型中每种患有的小基因，本文提出了一种基因签名识别的新算法。方法：本作本作采用可解释的AI方法来研究用于使用TCGA乳腺癌RNA序列数据鉴定生物标志物的亚型神经网络对亚型分类进行的预测。结果：所提出的算法导致了一组43个差异表达基因签名的发现。我们使用神经网络分类器实现了0.91的竞争性平均10倍。此外，基因设定分析显示了若干相关途径，例如ERBB2和P53信号传导途径的GRB7事件。使用Pearson相关矩阵，我们注意到亚型特异性基因在每个亚型内相关。结论：提出的技术使我们能够找到一套简洁和临床相关的基因签名集。

translated by 谷歌翻译

Deep Learning Based Model for Breast Cancer Subtype Classification

Sheetal Rajpal , Virendra Kumar , Manoj Agarwal , Naveen Kumar

分类：机器学习

2021-11-06

乳腺癌长期以来一直是女性死亡率的着名原因。现在，由于能够记录基因表达数据的RNA测序工具的可用性，现在可以进行诊断，治疗和预后。分子亚型与设计设计有关的临床策略和预后密切相关，本文侧重于使用基因表达数据进行乳腺癌分类为四个亚型，即基础，HER2，亮度和叶。在第1阶段，我们建议了一个基于深度学习的模型，它使用AutoEncoder来减少维度。通过使用AutoEncoder，特征集的大小从20,530个基因表达值减少到500。这种编码的表示被传递给第二阶段的深神经网络，用于将患者分为四个分子癌的四种分子亚型。通过部署阶段1和2的组合网络，我们能够在TCGA乳腺癌数据集上获得0.907的平均10倍测试精度。在整个10个不同的运行过程中，所提出的框架相当强劲，如Boxplot用于分类准确性所示。与文献中报告的相关工作相比，我们取得了竞争的结果。总之，所提出的两级深度学习的模型能够准确地分类四个乳腺癌亚型，突出了自动化的能力推导了紧凑的表现和神经网络分类器正确标记乳腺癌患者的能力。

translated by 谷歌翻译

SAFEMYRIDES: Application of Decentralized Control Edge-Computing to Ridesharing Monitoring Services

Samaa Elnagar , Manoj A. Thomas , Kweku-Muata Osei-Bryson

分类：人工智能

2023-01-02

Edge computing is changing the face of many industries and services. Common edge computing models offload computing which is prone to security risks and privacy violation. However, advances in deep learning enabled Internet of Things (IoTs) to take decisions and run cognitive tasks locally. This research introduces a decentralized-control edge model where most computation and decisions are moved to the IoT level. The model aims at decreasing communication to the edge which in return enhances efficiency and decreases latency. The model also avoids data transfer which raises security and privacy risks. To examine the model, we developed SAFEMYRIDES, a scene-aware ridesharing monitoring system where smart phones are detecting violations at the runtime. Current real-time monitoring systems are costly and require continuous network connectivity. The system uses optimized deep learning that run locally on IoTs to detect violations in ridesharing and record violation incidences. The system would enhance safety and security in ridesharing without violating privacy.

translated by 谷歌翻译

What is Cognitive Computing? An Architecture and State of The Art

Samaa Elnagar , Manoj A. Thomas , Kweku-Muata Osei-Bryson

分类：人工智能 | 神经与进化计算

2023-01-02

Cognitive Computing (COC) aims to build highly cognitive machines with low computational resources that respond in real-time. However, scholarly literature shows varying research areas and various interpretations of COC. This calls for a cohesive architecture that delineates the nature of COC. We argue that if Herbert Simon considered the design science is the science of artificial, cognitive systems are the products of cognitive science or 'the newest science of the artificial'. Therefore, building a conceptual basis for COC is an essential step into prospective cognitive computing-based systems. This paper proposes an architecture of COC through analyzing the literature on COC using a myriad of statistical analysis methods. Then, we compare the statistical analysis results with previous qualitative analysis results to confirm our findings. The study also comprehensively surveys the recent research on COC to identify the state of the art and connect the advances in varied research disciplines in COC. The study found that there are three underlaying computing paradigms, Von-Neuman, Neuromorphic Engineering and Quantum Computing, that comprehensively complement the structure of cognitive computation. The research discuss possible applications and open research directions under the COC umbrella.

translated by 谷歌翻译

Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting

Benjamin Wilson , William Qi , Tanmay Agarwal , John Lambert , Jagjeet Singh , Siddhesh Khandelwal , Bowen Pan , Ratnesh Kumar , Andrew Hartnett , Jhony Kaesemodel Pontes

分类：计算机视觉 | 人工智能 | 机器学习 | 机器人

2023-01-02

We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.

translated by 谷歌翻译

Novel Reinforcement Learning Algorithm for Suppressing Synchronization in Closed Loop Deep Brain Stimulators

Harsh Agarwal , Heena Rathore

分类：机器学习 | 人工智能

2022-12-25

Parkinson's disease is marked by altered and increased firing characteristics of pathological oscillations in the brain. In other words, it causes abnormal synchronous oscillations and suppression during neurological processing. In order to examine and regulate the synchronization and pathological oscillations in motor circuits, deep brain stimulators (DBS) are used. Although machine learning methods have been applied for the investigation of suppression, these models require large amounts of training data and computational power, both of which pose challenges to resource-constrained DBS. This research proposes a novel reinforcement learning (RL) framework for suppressing the synchronization in neuronal activity during episodes of neurological disorders with less power consumption. The proposed RL algorithm comprises an ensemble of a temporal representation of stimuli and a twin-delayed deep deterministic (TD3) policy gradient algorithm. We quantify the stability of the proposed framework to noise and reduced synchrony using RL for three pathological signaling regimes: regular, chaotic, and bursting, and further eliminate the undesirable oscillations. Furthermore, metrics such as evaluation rewards, energy supplied to the ensemble, and the mean point of convergence were used and compared to other RL algorithms, specifically the Advantage actor critic (A2C), the Actor critic with Kronecker-featured trust region (ACKTR), and the Proximal policy optimization (PPO).

translated by 谷歌翻译

A Seven-Layer Model for Standardising AI Fairness Assessment

Avinash Agarwal , Harsh Agarwal

分类：人工智能 | 机器学习

2022-12-21

Problem statement: Standardisation of AI fairness rules and benchmarks is challenging because AI fairness and other ethical requirements depend on multiple factors such as context, use case, type of the AI system, and so on. In this paper, we elaborate that the AI system is prone to biases at every stage of its lifecycle, from inception to its usage, and that all stages require due attention for mitigating AI bias. We need a standardised approach to handle AI fairness at every stage. Gap analysis: While AI fairness is a hot research topic, a holistic strategy for AI fairness is generally missing. Most researchers focus only on a few facets of AI model-building. Peer review shows excessive focus on biases in the datasets, fairness metrics, and algorithmic bias. In the process, other aspects affecting AI fairness get ignored. The solution proposed: We propose a comprehensive approach in the form of a novel seven-layer model, inspired by the Open System Interconnection (OSI) model, to standardise AI fairness handling. Despite the differences in the various aspects, most AI systems have similar model-building stages. The proposed model splits the AI system lifecycle into seven abstraction layers, each corresponding to a well-defined AI model-building or usage stage. We also provide checklists for each layer and deliberate on potential sources of bias in each layer and their mitigation methodologies. This work will facilitate layer-wise standardisation of AI fairness rules and benchmarking parameters.

translated by 谷歌翻译

AnnoBERT: Effectively Representing Multiple Annotators' Label Choices to Improve Hate Speech Detection

Wenjie Yin , Vibhor Agarwal , Aiqi Jiang , Arkaitz Zubiaga , Nishanth Sastry

分类：自然语言处理

2022-12-20

Supervised approaches generally rely on majority-based labels. However, it is hard to achieve high agreement among annotators in subjective tasks such as hate speech detection. Existing neural network models principally regard labels as categorical variables, while ignoring the semantic information in diverse label texts. In this paper, we propose AnnoBERT, a first-of-its-kind architecture integrating annotator characteristics and label text with a transformer-based model to detect hate speech, with unique representations based on each annotator's characteristics via Collaborative Topic Regression (CTR) and integrate label text to enrich textual representations. During training, the model associates annotators with their label choices given a piece of text; during evaluation, when label information is not available, the model predicts the aggregated label given by the participating annotators by utilising the learnt association. The proposed approach displayed an advantage in detecting hate speech, especially in the minority class and edge cases with annotator disagreement. Improvement in the overall performance is the largest when the dataset is more label-imbalanced, suggesting its practical value in identifying real-world hate speech, as the volume of hate speech in-the-wild is extremely small on social media, when compared with normal (non-hate) speech. Through ablation studies, we show the relative contributions of annotator embeddings and label text to the model performance, and tested a range of alternative annotator embeddings and label text combinations.

translated by 谷歌翻译

Unsupervised Dense Retrieval Deserves Better Positive Pairs: Scalable Augmentation with Query Extraction and Generation

Rui Meng , Ye Liu , Semih Yavuz , Divyansh Agarwal , Lifu Tu , Ning Yu , Jianguo Zhang , Meghana Bhat , Yingbo Zhou

分类：自然语言处理

2022-12-17

Dense retrievers have made significant strides in obtaining state-of-the-art results on text retrieval and open-domain question answering (ODQA). Yet most of these achievements were made possible with the help of large annotated datasets, unsupervised learning for dense retrieval models remains an open problem. In this work, we explore two categories of methods for creating pseudo query-document pairs, named query extraction (QExt) and transferred query generation (TQGen), to augment the retriever training in an annotation-free and scalable manner. Specifically, QExt extracts pseudo queries by document structures or selecting salient random spans, and TQGen utilizes generation models trained for other NLP tasks (e.g., summarization) to produce pseudo queries. Extensive experiments show that dense retrievers trained with individual augmentation methods can perform comparably well with multiple strong baselines, and combining them leads to further improvements, achieving state-of-the-art performance of unsupervised dense retrieval on both BEIR and ODQA datasets.

translated by 谷歌翻译

Reinforcement Learning for Agile Active Target Sensing with a UAV

Harsh Goel , Laura Jarin Lipschitz , Saurav Agarwal , Sandeep Manjanna , Vijay Kumar

分类：机器人 | 人工智能

2022-12-16

Active target sensing is the task of discovering and classifying an unknown number of targets in an environment and is critical in search-and-rescue missions. This paper develops a deep reinforcement learning approach to plan informative trajectories that increase the likelihood for an uncrewed aerial vehicle (UAV) to discover missing targets. Our approach efficiently (1) explores the environment to discover new targets, (2) exploits its current belief of the target states and incorporates inaccurate sensor models for high-fidelity classification, and (3) generates dynamically feasible trajectories for an agile UAV by employing a motion primitive library. Extensive simulations on randomly generated environments show that our approach is more efficient in discovering and classifying targets than several other baselines. A unique characteristic of our approach, in contrast to heuristic informative path planning approaches, is that it is robust to varying amounts of deviations of the prior belief from the true target distribution, thereby alleviating the challenge of designing heuristics specific to the application conditions.

translated by 谷歌翻译