Smart Paper Notes

Intelligent Acoustic Module for Autonomous Vehicles using Fast Gated Recurrent approach

Raghav Rawat, Shreyash Gupta, Shreyas Mohapatra, Sujata Priyambada Mishra, Sreesankar Rajagopal

Category: Machine Learning

2021-12-06

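This paper presents a model for acoustic single-tone and multi-tone classification on resource-constrained edge devices. The proposed model is a state-of-the-art Fast, Accurate, Stable and Tiny Gated Recurrent Neural Network. By using fewer parameters more efficiently and employing a noise reduction algorithm, the model improves performance metrics and reduces model size compared with previously proposed methods. The model is implemented as an acoustic AI module focused on sound identification, localization, and deployment in AI systems such as those of autonomous cars. Furthermore, the inclusion of localization techniques has the potential to add a new dimension to the multi-tone classifiers present in autonomous vehicles, as demand for them grows in urban cities and developing countries in the future.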

translated by Google Translate

Multimodal Wildland Fire Smoke Detection

Siddhant Baldota, Shreyas Anantha Ramaprasad, Jaspreet Kaur Bhamra, Shane Luna, Ravi Ramachandra, Eugene Zen, Harrison Kim, Daniel Crawl, Ismael Perez, Ilkay Altintas

Category: Computer Vision

2022-12-29

Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the United States. These factors have in turn led to increases in the frequency, extent, and severity of wildfires in recent years. Given the danger posed by wildland fires to people, property, wildlife, and the environment, there is an urgency to provide tools for effective wildfire management. Early detection of wildfires is essential to minimizing potentially catastrophic destruction. In this paper, we present our work on integrating multiple data sources in SmokeyNet, a deep learning model using spatio-temporal information to detect smoke from wildland fires. Camera image data is integrated with weather sensor measurements and processed by SmokeyNet to create a multimodal wildland fire smoke detection system. We present our results comparing performance in terms of both accuracy and time-to-detection for multimodal data vs. a single data source. With a time-to-detection of only a few minutes, SmokeyNet can serve as an automated early notification system, providing a useful tool in the fight against destructive wildfires.


Nostradamus: Weathering Worth

Alapan Chaudhuri, Zeeshan Ahmed, Ashwin Rao, Shivansh Subramanian, Shreyas Pradhan, Abhishek Mittal

Category: Machine Learning

2022-12-08

Nostradamus, inspired by the French astrologer and reputed seer, is a detailed study exploring relations between environmental factors and changes in the stock market. In this paper, we analyze associative correlation and causation between environmental elements and stock prices based on the US financial market, global climate trends, and daily weather records to demonstrate significant relationships between climate and stock price fluctuation. Our analysis covers short and long-term rises and dips in company stock performances. Lastly, we take four natural disasters as a case study to observe their effect on the emotional state of people and their influence on the stock market.


In-Hand 3D Object Scanning from an RGB Sequence

Shreyas Hampali, Tomas Hodan, Luan Tran, Lingni Ma, Cem Keskin, Vincent Lepetit

Category: Computer Vision

2022-11-28

We propose a method for in-hand 3D scanning of an unknown object from a sequence of color images. We cast the problem as reconstructing the object surface from un-posed multi-view images and rely on a neural implicit surface representation that captures both the geometry and the appearance of the object. By contrast with most NeRF-based methods, we do not assume that the camera-object relative poses are known and instead simultaneously optimize both the object shape and the pose trajectory. As global optimization over all the shape and pose parameters is prone to fail without coarse-level initialization of the poses, we propose an incremental approach which starts by splitting the sequence into carefully selected overlapping segments within which the optimization is likely to succeed. We incrementally reconstruct the object shape and track the object poses independently within each segment, and later merge all the segments by aligning poses estimated at the overlapping frames. Finally, we perform a global optimization over all the aligned segments to achieve full reconstruction. We experimentally show that the proposed method is able to reconstruct the shape and color of both textured and challenging texture-less objects, outperforms classical methods that rely only on appearance features, and its performance is close to recent methods that assume known camera poses.


Automated MRI Field of View Prescription from Region of Interest Prediction by Intra-stack Attention Neural Network

Ke Lei, Ali B. Syed, Xucheng Zhu, John M. Pauly, Shreyas S. Vasanawala

Category: Computer Vision

2022-11-09

Manual prescription of the field of view (FOV) by MRI technologists is variable and prolongs the scanning process. Often, the FOV is too large or crops critical anatomy. We propose a deep-learning framework, trained by radiologists' supervision, for automating FOV prescription. An intra-stack shared feature extraction network and an attention network are used to process a stack of 2D image inputs to generate output scalars defining the location of a rectangular region of interest (ROI). The attention mechanism is used to make the model focus on the small number of informative slices in a stack. Then the smallest FOV that makes the neural network predicted ROI free of aliasing is calculated by an algebraic operation derived from MR sampling theory. We retrospectively collected 595 cases between February 2018 and February 2022. The framework's performance is examined quantitatively with intersection over union (IoU) and pixel error on position, and qualitatively with a reader study. We use the t-test for comparing quantitative results from all models and a radiologist. The proposed model achieves an average IoU of 0.867 and average ROI position error of 9.06 out of 512 pixels on 80 test cases, significantly better (P<0.05) than two baseline models and not significantly different from a radiologist (P>0.12). Finally, the FOV given by the proposed framework achieves an acceptance rate of 92% from an experienced radiologist.
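The intersection over union (IoU) metric used above to score a predicted rectangular ROI against a reference can be illustrated with a minimal sketch. The function name `rect_iou` and the `(x_min, y_min, x_max, y_max)` corner convention are assumptions made for this example; the abstract does not state how the ROI scalars are encoded.

```python
def rect_iou(a, b):
    """Intersection over union of two axis-aligned rectangles.

    Each rectangle is (x_min, y_min, x_max, y_max); this encoding is an
    assumption for illustration, not taken from the paper.
    """
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Overlap extent along each axis, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    # Union = sum of both areas minus the doubly counted overlap.
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (100, 120, 400, 420)   # hypothetical ROI in pixels
    reference = (110, 130, 410, 430)   # hypothetical radiologist ROI
    print(rect_iou(predicted, reference))
```

An IoU of 1 means the two rectangles coincide and 0 means they do not overlap, so the reported average of 0.867 indicates substantial but not perfect agreement.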


The (In)Effectiveness of Intermediate Task Training For Domain Adaptation and Cross-Lingual Transfer Learning

Sovesh Mohapatra, Somesh Mohapatra

Category: Natural Language Processing

2022-10-03

Transfer learning from large language models (LLMs) has emerged as a powerful technique to enable knowledge-based fine-tuning for a number of tasks and the adaptation of models to different domains and even languages. However, it remains an open question if and when transfer learning will work, i.e., whether it leads to positive or negative transfer. In this paper, we analyze knowledge transfer across three natural language processing (NLP) tasks (text classification, sentiment analysis, and sentence similarity) using three LLMs (BERT, RoBERTa, and XLNet). We compare their performance when fine-tuned on target datasets for domain and cross-lingual adaptation tasks, with and without intermediate task training on a larger dataset. Our experiments showed that fine-tuning without intermediate task training can lead to better performance for most tasks, while more generalized tasks might necessitate a preceding intermediate task training step. We hope that this work will act as a guide on transfer learning for NLP practitioners.


360FusionNeRF: Panoramic Neural Radiance Fields with Joint Guidance

Shreyas Kulkarni, Peng Yin, Sebastian Scherer

Category: Computer Vision

2022-09-28

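We present a method to synthesize novel views from a single $360^\circ$ panorama image based on the neural radiance field (NeRF). Prior studies in a similar setting rely on the neighborhood interpolation capability of multi-layer perceptrons to complete missing regions caused by occlusion, which leads to artifacts in their predictions. We propose 360FusionNeRF, a semi-supervised learning framework in which we introduce geometric supervision and semantic consistency to guide the progressive training process. First, the input image is re-projected to $360^\circ$ images, and auxiliary depth maps are extracted at other camera positions. The depth supervision, in addition to the NeRF color guidance, improves the geometry of the synthesized views. Additionally, we introduce a semantic consistency loss that encourages realistic renderings of novel views. We extract these semantic features using a pre-trained visual encoder such as CLIP, a Vision Transformer trained on hundreds of millions of diverse 2D photographs mined from the web with natural language supervision. Experiments indicate that our proposed method can produce plausible completions of unobserved regions while preserving the features of the scene. When trained across various scenes, 360FusionNeRF consistently achieves state-of-the-art performance when transferring to the synthetic Structured3D dataset (PSNR ~5%, SSIM ~3%, LPIPS ~13%), [...] (SSIM ~3%, LPIPS ~9%), and the Replica360 dataset (PSNR ~8%, SSIM ~2%, LPIPS ~18%).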


Set-Valued Shadow Matching Using Zonotopes for 3-D Map-Aided GNSS Localization

Sriramya Bhamidipati, Shreyas Kousik, Grace Gao

Category: Robotics

2022-09-28

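Unlike many urban localization methods that return point-valued estimates, a set-valued representation enables robustness by ensuring that a continuum of possible positions obeys safety constraints. One strategy with the potential for set-valued estimation is GNSS-based shadow matching (SM), in which a three-dimensional (3-D) map is used to compute GNSS shadows (regions where line-of-sight is blocked). However, SM requires a point-valued grid for computational tractability, and its accuracy is limited by the grid resolution. We propose zonotope shadow matching (ZSM) for set-valued 3-D map-aided GNSS localization. ZSM represents buildings and GNSS shadows using constrained zonotopes, a convex polytope representation that enables propagating set-valued estimates using fast vector concatenation operations. Starting from a coarse set-valued position, ZSM refines the estimate depending on whether the receiver is inside or outside each shadow, as judged by the received carrier-to-noise density. We demonstrate our algorithm's performance using simulated experiments on a simple 3-D example map and on a dense 3-D map of San Francisco.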
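To illustrate why zonotope-style set representations reduce set operations to concatenation, here is a minimal sketch of the Minkowski sum of two plain zonotopes. It is not the ZSM algorithm and it omits the equality constraints that distinguish constrained zonotopes; the `(center, generators)` tuple layout and the function names are assumptions for this example.

```python
def minkowski_sum(z1, z2):
    """Minkowski sum of two zonotopes.

    A zonotope is stored as (center, generators), describing the set
    {center + sum_i b_i * generators[i] : -1 <= b_i <= 1}.
    The sum adds the centers and concatenates the generator lists.
    """
    c1, g1 = z1
    c2, g2 = z2
    center = [a + b for a, b in zip(c1, c2)]
    return center, g1 + g2  # list concatenation, no geometry computed


def interval_hull(z):
    """Tightest axis-aligned box containing the zonotope."""
    center, gens = z
    # Along each axis the extent is the sum of absolute generator entries.
    radius = [sum(abs(g[i]) for g in gens) for i in range(len(center))]
    return [(c - r, c + r) for c, r in zip(center, radius)]


if __name__ == "__main__":
    unit_box = ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    segment = ([2.0, 1.0], [[1.0, 1.0]])  # hypothetical uncertainty set
    total = minkowski_sum(unit_box, segment)
    print(total)
    print(interval_hull(total))
```

The cost of the sum is linear in the number of generators, in contrast with grid-based methods whose cost and accuracy both depend on the grid resolution.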


Sentiment is all you need to win US Presidential elections

Sovesh Mohapatra, Somesh Mohapatra

Category: Natural Language Processing

2022-09-27

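Election speeches play an integral role in communicating the vision and mission of the candidates. From lofty promises to mud-slinging, candidates account for it all. However, there remains an open question about what exactly wins over the voters. In this work, we use state-of-the-art natural language processing methods to study the speeches and sentiments of the Republican candidate, Donald Trump, and the Democratic candidate, Joe Biden, who contested the 2020 US Presidential election. Comparing the racial dichotomy of the United States, we analyze what led to the victory and defeat of the different candidates. We believe this work will inform election campaign strategy and provide a basis for communicating with diverse crowds.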


Robust Online and Distributed Mean Estimation Under Adversarial Data Corruption

Tong Yao, Shreyas Sundaram

Category: Machine Learning

2022-09-17

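We study robust mean estimation in online and distributed settings in the presence of adversarial data attacks. At each time step, each agent in a network receives a potentially corrupted data point, where the data points were originally independent and identically distributed samples of a random variable. We propose online and distributed algorithms for all agents to asymptotically estimate the mean. We provide error bounds and convergence properties of the estimates to the true mean under our algorithms. Based on the network topology, we further evaluate each agent's trade-off in convergence rate between incorporating data from neighbors and learning from local observations only.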
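As a point of reference for what "robust" means here, the sketch below shows a standard trimmed mean, which discards the most extreme samples before averaging. This is a textbook estimator swapped in for illustration only; it is not the online or distributed algorithm proposed in the paper, and the function name and `trim_fraction` parameter are assumptions.

```python
def trimmed_mean(samples, trim_fraction=0.1):
    """Mean of the samples after dropping the lowest and highest fractions.

    trim_fraction is the share removed from EACH end, so it must be
    below 0.5 for at least one sample to remain.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(samples)
    k = int(len(ordered) * trim_fraction)  # samples dropped per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Eight plausible readings plus two hypothetical corrupted values.
    data = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.0, 1000.0, 500.0]
    print("plain mean:  ", sum(data) / len(data))
    print("trimmed mean:", trimmed_mean(data, trim_fraction=0.2))
```

The plain mean is dominated by the two corrupted values, while the trimmed mean stays close to the uncorrupted readings as long as the corrupted share does not exceed the trimmed share.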
