We introduce an emerging AI-based approach and prototype system for assisting team formation when researchers respond to calls for proposals from funding agencies. This is an instance of the general problem of building teams when demand opportunities arrive periodically and the pool of potential members varies over time. The novelties of our approach are that we: (a) extract the technical skills of researchers and the skills required by calls from multiple data sources and normalize them using Natural Language Processing (NLP) techniques, (b) build a prototype solution that matches researchers to calls and forms teams subject to constraints, (c) report initial feedback about the system from researchers at a university where it is to be deployed, and (d) create and publish a dataset that others can use.
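As a rough illustration of the constraint-based matching step, the sketch below scores candidate teams by how well their combined, already-normalized skills cover a call's requirements under a team-size constraint. The data, names, and coverage score are toy assumptions, not the prototype's actual matcher.

```python
# Illustrative team formation: pick the team (size <= max_size) whose
# combined skills best cover a call's requirements. Data and scoring are
# toy assumptions, not the prototype's actual logic.
from itertools import combinations

researchers = {
    "alice": {"nlp", "optimization"},
    "bob":   {"computer vision", "optimization"},
    "carol": {"nlp", "hci"},
}
call_skills = {"nlp", "optimization", "hci"}  # skills extracted from a call

def team_score(team):
    """Fraction of the call's required skills covered by the team."""
    covered = set().union(*(researchers[r] for r in team))
    return len(covered & call_skills) / len(call_skills)

def best_team(max_size=2):
    """Exhaustive search over teams subject to a size constraint."""
    candidates = (t for k in range(1, max_size + 1)
                  for t in combinations(researchers, k))
    return max(candidates, key=team_score)

print(best_team())   # ('alice', 'carol'): together they cover all 3 skills
```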
We propose a novel deep learning model named ACLNet for segmenting clouds in ground-based images. ACLNet uses both deep neural networks and machine learning (ML) algorithms to extract complementary features. Specifically, it uses EfficientNet-B0 as the backbone and "atrous spatial pyramid pooling" (ASPP) to learn at multiple receptive fields and extract fine details from the image. ACLNet also uses k-means clustering to extract cloud boundaries more precisely. ACLNet is effective for both daytime and nighttime images. It provides a lower error rate, higher recall, and higher F1-score than state-of-the-art cloud segmentation models. The source code of ACLNet is available here: https://github.com/ckmvigil/aclnet.
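The k-means step can be illustrated as follows: cluster pixel intensities into two groups and treat the brighter cluster as cloud, yielding a sharper boundary mask. This is our reading of the abstract, not the released code (see https://github.com/ckmvigil/aclnet for the actual implementation).

```python
# Our reading of the k-means refinement: cluster pixel intensities into two
# groups and take the brighter cluster as the cloud mask. Not the released
# ACLNet code; see https://github.com/ckmvigil/aclnet for the real model.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
image = rng.random((64, 64))      # stand-in for a grayscale sky image

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(image.reshape(-1, 1))

cloud_label = int(np.argmax(km.cluster_centers_.ravel()))  # brighter cluster
cloud_mask = (km.labels_ == cloud_label).reshape(image.shape)
print(cloud_mask.mean())          # fraction of pixels labeled as cloud
```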
As the integration density and design complexity of semiconductor wafers increase, the magnitude and complexity of the defects in them are also rising. Since manual inspection of wafer defects is costly, automated artificial intelligence (AI) based computer-vision approaches are in high demand. Previous works on defect analysis have several limitations, such as low accuracy and the need for separate models for classification and segmentation. To analyze mixed-type defects, some previous works require training one model per defect type, which does not scale. In this paper, we present WaferSegClassNet (WSCN), a novel network based on an encoder-decoder architecture. WSCN performs simultaneous classification and segmentation of both single and mixed-type wafer defects. WSCN uses a "shared encoder" for classification and segmentation, which allows training WSCN end-to-end. We first pretrain the encoder with an N-pair contrastive loss, and then use a BCE-Dice loss for segmentation and a categorical cross-entropy loss for classification. Using the N-pair contrastive loss helps in better embedding of wafer maps in the latent space. WSCN has a model size of only 0.51 MB and performs only 0.2M FLOPS. Thus, it is much lighter than other state-of-the-art models. It also requires only 150 epochs to converge, compared to the 4,000 epochs needed by prior work. We evaluated our model on the MixedWM38 dataset, which has 38,015 images. WSCN achieves an average classification accuracy of 98.2% and a dice coefficient of 0.9999. We are the first to show segmentation results on the MixedWM38 dataset. The source code is available at https://github.com/ckmvigil/wafersegclassnet.
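A minimal sketch of the multi-task objective described above, combining a BCE-Dice segmentation term with a categorical cross-entropy classification term; tensor shapes and equal weighting are assumptions, not the released WSCN code.

```python
# Sketch of the joint objective: BCE-Dice for segmentation plus categorical
# cross-entropy for classification. Shapes and equal weighting are assumed;
# see https://github.com/ckmvigil/wafersegclassnet for the actual code.
import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss on sigmoid probabilities."""
    p = torch.sigmoid(logits).flatten(1)
    t = target.flatten(1)
    inter = (p * t).sum(1)
    return 1 - ((2 * inter + eps) / (p.sum(1) + t.sum(1) + eps)).mean()

def wscn_style_loss(seg_logits, seg_target, cls_logits, cls_target):
    seg = F.binary_cross_entropy_with_logits(seg_logits, seg_target) \
          + dice_loss(seg_logits, seg_target)          # BCE-Dice term
    cls = F.cross_entropy(cls_logits, cls_target)      # classification term
    return seg + cls

seg_logits = torch.randn(4, 1, 32, 32)
seg_target = torch.randint(0, 2, (4, 1, 32, 32)).float()
cls_logits = torch.randn(4, 38)                        # 38 defect classes
cls_target = torch.randint(0, 38, (4,))
print(wscn_style_loss(seg_logits, seg_target, cls_logits, cls_target))
```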
NLP applications for code-mixed (CM), or mixed-language, text have gained major momentum recently, the main reason being the prevalence of language mixing in social media communication in multilingual societies such as India, Mexico, Europe, and parts of the USA. Word embeddings are the basic building blocks of any NLP system today, yet word embeddings for CM languages remain an unexplored territory. The major bottleneck for CM word embeddings is the switching points, where the language switches. These locations lack context, and statistical systems fail to model this phenomenon due to the high variance in observed examples. In this paper we present our initial observations on applying a switching-point based positional encoding technique to CM languages, specifically Hinglish (Hindi-English). Results are only marginally better than SOTA, but it is evident that positional encoding could be an effective way to train position-sensitive language models for CM text.
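One plausible reading of switching-point based positional encoding is sketched below: restart the position counter at every language switch so that positions become switch-aware before the usual sinusoidal encoding is applied. The helper names and this specific scheme are our assumptions, not the paper's implementation.

```python
# One plausible switching-point aware scheme (our assumption): restart the
# position counter at every language switch, then apply standard sinusoidal
# encoding to the switch-aware positions.
import numpy as np

def switch_aware_positions(lang_tags):
    """Position of each token within its current monolingual span."""
    pos, prev, counter = [], None, 0
    for tag in lang_tags:
        counter = 0 if tag != prev else counter + 1
        pos.append(counter)
        prev = tag
    return pos

def sinusoidal(positions, d_model=8):
    pos = np.asarray(positions)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

tags = ["hi", "hi", "en", "en", "en", "hi"]  # token-level tags (illustrative)
print(switch_aware_positions(tags))          # [0, 1, 0, 1, 2, 0]
print(sinusoidal(switch_aware_positions(tags)).shape)   # (6, 8)
```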
Recent advances in neural radiance fields have enabled the high-fidelity 3D reconstruction of complex scenes for novel view synthesis. However, it remains underexplored how the appearance of such representations can be efficiently edited while maintaining photorealism. In this work, we present PaletteNeRF, a novel method for photorealistic appearance editing of neural radiance fields (NeRF) based on 3D color decomposition. Our method decomposes the appearance of each 3D point into a linear combination of palette-based bases (i.e., 3D segmentations defined by a group of NeRF-type functions) that are shared across the scene. While our palette-based bases are view-independent, we also predict a view-dependent function to capture the color residual (e.g., specular shading). During training, we jointly optimize the basis functions and the color palettes, and we also introduce novel regularizers to encourage the spatial coherence of the decomposition. Our method allows users to efficiently edit the appearance of the 3D scene by modifying the color palettes. We also extend our framework with compressed semantic features for semantic-aware appearance editing. We demonstrate that our technique is superior to baseline methods both quantitatively and qualitatively for appearance editing of complex real-world scenes.
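The core decomposition can be sketched as follows: each point's RGB color is a weighted combination of shared palette colors plus a view-dependent residual. The random weights and residual below stand in for the NeRF-type basis functions; this is a schematic, not the authors' implementation.

```python
# Schematic of the palette decomposition: per-point color = weights over
# shared palette colors (view-independent) + a view-dependent residual.
import torch

n_points, n_palette = 5, 4
palette = torch.rand(n_palette, 3, requires_grad=True)  # shared, optimized

# Stand-ins for the NeRF-type basis functions w_i(x) and the residual term.
weights = torch.softmax(torch.randn(n_points, n_palette), dim=-1)
residual = 0.05 * torch.randn(n_points, 3)

color = weights @ palette + residual   # per-point RGB, editable via palette
print(color.shape)                     # torch.Size([5, 3])
```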
The rapid growth of machine translation (MT) systems has necessitated comprehensive studies to meta-evaluate evaluation metrics being used, which enables a better selection of metrics that best reflect MT quality. Unfortunately, most of the research focuses on high-resource languages, mainly English, the observations for which may not always apply to other languages. Indian languages, having over a billion speakers, are linguistically different from English, and to date, there has not been a systematic study of evaluating MT systems from English into Indian languages. In this paper, we fill this gap by creating an MQM dataset consisting of 7000 fine-grained annotations, spanning 5 Indian languages and 7 MT systems, and use it to establish correlations between annotator scores and scores obtained using existing automatic metrics. Our results show that pre-trained metrics, such as COMET, have the highest correlations with annotator scores. Additionally, we find that the metrics do not adequately capture fluency-based errors in Indian languages, and there is a need to develop metrics focused on Indian languages. We hope that our dataset and analysis will help promote further research in this area.
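The meta-evaluation step amounts to correlating automatic metric scores with human scores, for example (the numbers below are synthetic placeholders, not the dataset):

```python
# Correlate human scores with an automatic metric's scores; synthetic numbers.
from scipy.stats import kendalltau, pearsonr

human  = [0.9, 0.4, 0.7, 0.2, 0.8]    # annotator (MQM-derived) scores
metric = [0.85, 0.5, 0.65, 0.3, 0.9]  # e.g., COMET scores for the same outputs

print("Pearson r:   ", pearsonr(human, metric)[0])
print("Kendall tau: ", kendalltau(human, metric)[0])
```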
Flooding is one of the most disastrous natural hazards, responsible for substantial economic losses. A predictive model for flood-induced financial damages is useful for many applications such as climate change adaptation planning and insurance underwriting. This research assesses the predictive capability of regressors constructed on the National Flood Insurance Program (NFIP) dataset using neural networks (Conditional Generative Adversarial Networks), decision trees (Extreme Gradient Boosting), and kernel-based regressors (Gaussian Process). The assessment highlights the most informative predictors for regression. The distribution for claim-amount inference is modeled with a Burr distribution, permitting the introduction of a bias-correction scheme and increasing the regressor's predictive capability. Aiming to study the interaction with physical variables, we incorporate Daymet rainfall estimates into the NFIP dataset as an additional predictor. A study on the coastal counties in the eight US South-West states resulted in an $R^2=0.807$. Further analysis of 11 counties with a significant number of claims in the NFIP dataset reveals that Extreme Gradient Boosting provides the best results, that bias correction significantly improves the similarity with the reference distribution, and that the rainfall predictor strengthens the regressor performance.
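A hedged sketch of the distribution-fitting step: fit a Burr (Type XII) distribution to claim amounts with scipy. The synthetic data stands in for NFIP claims, and the fitted distribution is what would drive a bias-correction scheme like the one described.

```python
# A hedged sketch: fit a Burr (Type XII) distribution to claim amounts with
# scipy; the synthetic claims below stand in for NFIP data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
claims = stats.burr12.rvs(c=2.0, d=3.0, scale=1e4, size=2000, random_state=rng)

# Recover the Burr parameters from the observed claim amounts (loc fixed at 0).
c, d, loc, scale = stats.burr12.fit(claims, floc=0)
print(f"fitted c={c:.2f}, d={d:.2f}, scale={scale:.0f}")
```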
Recent advancements in sensing and communication facilitate obtaining high-frequency real-time data from various physical systems like power networks, climate systems, biological networks, etc. However, since the data are recorded by physical sensors, it is natural that the obtained data is corrupted by measurement noise. In this paper, we present a novel algorithm for online real-time learning of dynamical systems from noisy time-series data, which employs the Robust Koopman operator framework to mitigate the effect of measurement noise. The proposed algorithm has three main advantages: a) it allows for online real-time monitoring of a dynamical system; b) it obtains a linear representation of the underlying dynamical system, thus enabling the user to use linear systems theory for analysis and control of the system; c) it is computationally fast and less intensive than the popular Extended Dynamic Mode Decomposition (EDMD) algorithm. We illustrate the efficiency of the proposed algorithm by applying it to identify the Van der Pol oscillator, the IEEE 68 bus system, and a ring network of Van der Pol oscillators.
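For context, a minimal sketch of the (non-robust) EDMD baseline the paper improves on: lift states through a dictionary of observables and solve a least-squares problem for a finite-dimensional Koopman approximation. The dictionary choice and Euler-simulated data are our assumptions.

```python
# Minimal EDMD sketch: monomial dictionary + least squares for a
# finite-dimensional Koopman approximation on Van der Pol data.
import numpy as np

def dictionary(x):
    """Observable dictionary: [x1, x2, x1^2, x1*x2, x2^2] (our choice)."""
    x1, x2 = x
    return np.array([x1, x2, x1**2, x1 * x2, x2**2])

# Forward-Euler simulation of a Van der Pol oscillator (illustrative data).
dt, mu, x = 0.01, 1.0, np.array([1.0, 0.0])
traj = [x]
for _ in range(500):
    x1, x2 = x
    x = x + dt * np.array([x2, mu * (1 - x1**2) * x2 - x1])
    traj.append(x)

# Lift the trajectory and solve Psi_next ~= Psi @ K^T in least squares.
Psi = np.array([dictionary(s) for s in traj])
K = np.linalg.lstsq(Psi[:-1], Psi[1:], rcond=None)[0].T
print(K.shape)   # (5, 5) Koopman matrix approximation
```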
Knowledge about outcomes is critical for complex event understanding but is hard to acquire. We show that by pre-identifying a participant in a complex event, crowd workers are able to (1) infer the collective impact of salient events that make up the situation, (2) annotate the volitional engagement of participants in causing the situation, and (3) ground the outcome of the situation in state changes of the participants. By creating a multi-step interface and a careful quality control strategy, we collect a high quality annotated dataset of 8K short newswire narratives and ROCStories with high inter-annotator agreement (0.74-0.96 weighted Fleiss Kappa). Our dataset, POQue (Participant Outcome Questions), enables the exploration and development of models that address multiple aspects of semantic understanding. Experimentally, we show that current language models lag behind human performance in subtle ways through our task formulations that target abstract and specific comprehension of a complex event, its outcome, and a participant's influence over the event culmination.
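For reference, the agreement statistic can be computed as below with statsmodels; note this sketch uses plain (unweighted) Fleiss' kappa on a synthetic item-by-category count table, whereas the paper reports a weighted variant.

```python
# Synthetic item-by-category count table: 4 items, 3 annotators each,
# 3 answer categories. Plain Fleiss' kappa (the paper uses a weighted variant).
import numpy as np
from statsmodels.stats.inter_rater import fleiss_kappa

table = np.array([
    [3, 0, 0],   # all three annotators chose category 0
    [0, 2, 1],
    [1, 2, 0],
    [0, 0, 3],
])
print(fleiss_kappa(table))   # 0.5 here; 1.0 would be perfect agreement
```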
Modeling the risk of extreme weather events in a changing climate is essential for developing effective adaptation and mitigation strategies. Although the available low-resolution climate models capture different scenarios, accurate risk assessment for mitigation and adaptation often demands detail that they typically cannot resolve. Here, we develop a dynamic data-driven downscaling (super-resolution) method that incorporates physics and statistics in a generative framework to learn the fine-scale spatial details of rainfall. Our method transforms coarse-resolution ($0.25^{\circ} \times 0.25^{\circ}$) climate model outputs into high-resolution ($0.01^{\circ} \times 0.01^{\circ}$) rainfall fields while efficaciously quantifying uncertainty. Results indicate that the downscaled rainfall fields closely match observed spatial fields and their risk distributions.
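The resolution jump alone ($0.25^{\circ}$ to $0.01^{\circ}$, a factor of 25 per axis) can be sketched with a naive interpolation baseline; the paper's physics-informed generative model replaces this trivial upsampling.

```python
# Naive 25x bilinear upsampling from a 0.25-degree tile to a 0.01-degree
# grid; a placeholder for the paper's physics-informed generative downscaler.
import torch
import torch.nn.functional as F

coarse = torch.rand(1, 1, 8, 8)   # one coarse 8x8 rainfall tile (N, C, H, W)
fine = F.interpolate(coarse, scale_factor=25, mode="bilinear",
                     align_corners=False)
print(fine.shape)                 # torch.Size([1, 1, 200, 200])
```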