智能论文笔记

Data-Efficient Classification of Radio Galaxies

Ashwin Samudre , Lijo George , Mahak Bansal , Yogesh Wadadekar

分类：机器学习

2020-11-26

无线电星系的连续排放通常可以分为不同的形态学类，如FRI，Frii，弯曲或紧凑。在本文中，我们根据使用深度学习方法使用小规模数据集的深度学习方法来探讨基于形态的无线电星系分类的任务（$ \ SIM 2000 $ Samples）。我们基于双网络应用了几次射击学习技术，并使用预先培训的DENSENET模型进行了先进技术的传输学习技术，如循环学习率和歧视性学习迅速训练模型。我们使用最佳表演模型实现了超过92 \％的分类准确性，其中最大的混乱来源是弯曲和周五型星系。我们的结果表明，专注于一个小但策划数据集随着使用最佳实践来训练神经网络可能会导致良好的结果。自动分类技术对于即将到来的下一代无线电望远镜的调查至关重要，这预计将在不久的将来检测数十万个新的无线电星系。

translated by 谷歌翻译

Image-Based Sorghum Head Counting When You Only Look Once

Lawrence Mosley , Hieu Pham , Yogesh Bansal , Eric Hare

分类：计算机视觉 | 机器学习

2020-09-24

数字农业的现代趋势已经转向人工智能，以进行农作物质量评估和产量估计。在这项工作中，我们记录了如何使用参数调谐的单弹对象检测算法来识别和计算来自空中无人机图像的高粱头。我们的方法涉及一项新颖的探索性分析，该分析确定了高粱图像的关键结构元素，并激发了参数调节的锚盒的选择，这些锚盒对性能产生了重大贡献。这些见解导致了一个深度学习模型的发展，该模型胜过基线模型，并达到了样本外平均平均精度为0.95。

translated by 谷歌翻译

Dubbing in Practice: A Large Scale Study of Human Localization With Insights for Automatic Dubbing

William Brannon , Yogesh Virkar , Brian Thompson

分类：自然语言处理

2022-12-23

We investigate how humans perform the task of dubbing video content from one language into another, leveraging a novel corpus of 319.57 hours of video from 54 professionally produced titles. This is the first such large-scale study we are aware of. The results challenge a number of assumptions commonly made in both qualitative literature on human dubbing and machine-learning literature on automatic dubbing, arguing for the importance of vocal naturalness and translation quality over commonly emphasized isometric (character length) and lip-sync constraints, and for a more qualified view of the importance of isochronic (timing) constraints. We also find substantial influence of the source-side audio on human dubs through channels other than the words of the translation, pointing to the need for research on ways to preserve speech characteristics, as well as semantic transfer such as emphasis/emotion, in automatic dubbing systems.

translated by 谷歌翻译

Comparison and Evaluation of Methods for a Predict+Optimize Problem in Renewable Energy

Christoph Bergmeir , Frits de Nijs , Abishek Sriramulu , Mahdi Abolghasemi , Richard Bean , John Betts , Quang Bui , Nam Trong Dinh , Nils Einecke , Rasul Esmaeilbeigi

分类：人工智能

2022-12-21

Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively few research has been done in this area. This paper presents the findings of the ``IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim to foster and facilitate research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that lead to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.

translated by 谷歌翻译

Rumour detection using graph neural network and oversampling in benchmark Twitter dataset

Shaswat Patel , Prince Bansal , Preeti Kaur

分类：自然语言处理 | 机器学习

2022-12-20

Recently, online social media has become a primary source for new information and misinformation or rumours. In the absence of an automatic rumour detection system the propagation of rumours has increased manifold leading to serious societal damages. In this work, we propose a novel method for building automatic rumour detection system by focusing on oversampling to alleviating the fundamental challenges of class imbalance in rumour detection task. Our oversampling method relies on contextualised data augmentation to generate synthetic samples for underrepresented classes in the dataset. The key idea exploits selection of tweets in a thread for augmentation which can be achieved by introducing a non-random selection criteria to focus the augmentation process on relevant tweets. Furthermore, we propose two graph neural networks(GNN) to model non-linear conversations on a thread. To enhance the tweet representations in our method we employed a custom feature selection technique based on state-of-the-art BERTweet model. Experiments of three publicly available datasets confirm that 1) our GNN models outperform the the current state-of-the-art classifiers by more than 20%(F1-score); 2) our oversampling technique increases the model performance by more than 9%;(F1-score) 3) focusing on relevant tweets for data augmentation via non-random selection criteria can further improve the results; and 4) our method has superior capabilities to detect rumours at very early stage.

translated by 谷歌翻译

Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale

Hritik Bansal , Karthik Gopalakrishnan , Saket Dingliwal , Sravan Bodapati , Katrin Kirchhoff , Dan Roth

分类：自然语言处理 | 人工智能

2022-12-18

Language models have been shown to perform better with an increase in scale on a wide variety of tasks via the in-context learning paradigm. In this paper, we investigate the hypothesis that the ability of a large language model to in-context learn-perform a task is not uniformly spread across all of its underlying components. Using a 66 billion parameter language model (OPT-66B) across a diverse set of 14 downstream tasks, we find this is indeed the case: $\sim$70% of attention heads and $\sim$20% of feed forward networks can be removed with minimal decline in task performance. We find substantial overlap in the set of attention heads (un)important for in-context learning across tasks and number of in-context examples. We also address our hypothesis through a task-agnostic lens, finding that a small set of attention heads in OPT-66B score highly on their ability to perform primitive induction operations associated with in-context learning, namely, prefix matching and copying. These induction heads overlap with task-specific important heads, suggesting that induction heads are among the heads capable of more sophisticated behaviors associated with in-context learning. Overall, our study provides several insights that indicate large language models may be under-trained to perform in-context learning and opens up questions on how to pre-train language models to more effectively perform in-context learning.

translated by 谷歌翻译

Atrous Space Bender U-Net (ASBU-Net/LogiNet)

Anurag Bansal , Oleg Ostap , Miguel Maestre Trueba , Kristopher Perry

分类：计算机视觉

2022-12-16

$ $With recent advances in CNNs, exceptional improvements have been made in semantic segmentation of high resolution images in terms of accuracy and latency. However, challenges still remain in detecting objects in crowded scenes, large scale variations, partial occlusion, and distortions, while still maintaining mobility and latency. We introduce a fast and efficient convolutional neural network, ASBU-Net, for semantic segmentation of high resolution images that addresses these problems and uses no novelty layers for ease of quantization and embedded hardware support. ASBU-Net is based on a new feature extraction module, atrous space bender layer (ASBL), which is efficient in terms of computation and memory. The ASB layers form a building block that is used to make ASBNet. Since this network does not use any special layers it can be easily implemented, quantized and deployed on FPGAs and other hardware with limited memory. We present experiments on resource and accuracy trade-offs and show strong performance compared to other popular models.

translated by 谷歌翻译

MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation

Swarnadeep Saha , Xinyan Velocity Yu , Mohit Bansal , Ramakanth Pasunuru , Asli Celikyilmaz

分类：自然语言处理 | 人工智能 | 机器学习

2022-12-16

Prompting large language models has enabled significant recent progress in multi-step reasoning over text. However, when applied to text generation from semi-structured data (e.g., graphs or tables), these methods typically suffer from low semantic coverage, hallucination, and logical inconsistency. We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning. MURMUR is a best-first search method that generates reasoning paths using: (1) neural and symbolic modules with specific linguistic and logical skills, (2) a grammar whose production rules define valid compositions of modules, and (3) value functions that assess the quality of each reasoning step. We conduct experiments on two diverse data-to-text generation tasks like WebNLG and LogicNLG. These tasks differ in their data representations (graphs and tables) and span multiple linguistic and logical skills. MURMUR obtains significant improvements over recent few-shot baselines like direct prompting and chain-of-thought prompting, while also achieving comparable performance to fine-tuned GPT-2 on out-of-domain data. Moreover, human evaluation shows that MURMUR generates highly faithful and correct reasoning paths that lead to 26% more logically consistent summaries on LogicNLG, compared to direct prompting.

translated by 谷歌翻译

Vision Transformers are Parameter-Efficient Audio-Visual Learners

Yan-Bo Lin , Yi-Lin Sung , Jie Lei , Mohit Bansal , Gedas Bertasius

分类：计算机视觉 | 自然语言处理 | 机器学习

2022-12-15

Vision transformers (ViTs) have achieved impressive results on various computer vision tasks in the last several years. In this work, we study the capability of frozen ViTs, pretrained only on visual data, to generalize to audio-visual data without finetuning any of its original parameters. To do so, we propose a latent audio-visual hybrid (LAVISH) adapter that adapts pretrained ViTs to audio-visual tasks by injecting a small number of trainable parameters into every layer of a frozen ViT. To efficiently fuse visual and audio cues, our LAVISH adapter uses a small set of latent tokens, which form an attention bottleneck, thus, eliminating the quadratic cost of standard cross-attention. Compared to the existing modality-specific audio-visual methods, our approach achieves competitive or even better performance on various audio-visual tasks while using fewer tunable parameters and without relying on costly audio pretraining or external audio encoders. Our code is available at https://genjib.github.io/project_page/LAVISH/

translated by 谷歌翻译

MAntRA: A framework for model agnostic reliability analysis

Yogesh Chandrakant Mathpati , Kalpesh Sanjay More , Tapas Tripura , Rajdip Nayek , Souvik Chakraborty

分类：机器学习 | (统计)机器学习

2022-12-13

We propose a novel model agnostic data-driven reliability analysis framework for time-dependent reliability analysis. The proposed approach -- referred to as MAntRA -- combines interpretable machine learning, Bayesian statistics, and identifying stochastic dynamic equation to evaluate reliability of stochastically-excited dynamical systems for which the governing physics is \textit{apriori} unknown. A two-stage approach is adopted: in the first stage, an efficient variational Bayesian equation discovery algorithm is developed to determine the governing physics of an underlying stochastic differential equation (SDE) from measured output data. The developed algorithm is efficient and accounts for epistemic uncertainty due to limited and noisy data, and aleatoric uncertainty because of environmental effect and external excitation. In the second stage, the discovered SDE is solved using a stochastic integration scheme and the probability failure is computed. The efficacy of the proposed approach is illustrated on three numerical examples. The results obtained indicate the possible application of the proposed approach for reliability analysis of in-situ and heritage structures from on-site measurements.

translated by 谷歌翻译