Smart Paper Notes

Towards Automating Retinoscopy for Refractive Error Diagnosis

Aditya Aggarwal, Siddhartha Gairola, Uddeshya Upadhyay, Akshay P Vasishta, Diwakar Rao, Aditya Goyal, Kaushik Murali, Nipun Kwatra, Mohit Jain

Category: Computer Vision

2022-08-10

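Refractive error is the most common eye disorder and the key cause of correctable visual impairment, responsible for nearly 80% of visual impairment in the United States. Refractive error can be diagnosed using several methods, including subjective refraction, retinoscopy, and autorefractors. Although subjective refraction is the gold standard, it requires cooperation from the patient and is therefore unsuitable for infants, young children, and developmentally delayed adults. Retinoscopy is an objective refraction method that does not require any input from the patient. However, retinoscopy requires a lens kit and a trained examiner, which limits its use for mass screening. In this work, we automate retinoscopy by attaching a smartphone to a retinoscope and recording retinoscopic videos while the patient wears a custom pair of paper frames. We develop a video processing pipeline that takes retinoscopic videos as input and estimates the net refractive error based on our proposed extension of the mathematical model of retinoscopy. Our system removes the need for a lens kit and can be operated by an untrained examiner. In a clinical trial with 185 eyes, we achieved a sensitivity of 91.0% and a specificity of 74.0%. Moreover, the mean absolute error of our approach was $0.75 \pm 0.67$ D compared with subjective refraction measurements. Our results indicate that our approach has the potential to be used as a retinoscopy-based refractive error screening tool in real-world medical settings.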
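
The screening figures quoted above (sensitivity, specificity, and mean absolute error against subjective refraction) follow standard definitions. The block below is a minimal sketch of those definitions only, not the authors' evaluation code; the function names and the toy labels and dioptre values are hypothetical.

```python
from statistics import mean


def screening_metrics(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumption: 1 means "refractive error present", 0 means "absent".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of true cases that were flagged
    specificity = tn / (tn + fp)  # share of healthy eyes that were cleared
    return sensitivity, specificity


def mean_absolute_error(reference, estimate):
    """Mean absolute difference, e.g. in dioptres, between two measurements."""
    return mean(abs(r - e) for r, e in zip(reference, estimate))


# Toy data, purely illustrative (not from the clinical trial).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 1]
print(screening_metrics(labels, predictions))
print(mean_absolute_error([-1.0, -2.5, 0.5, 1.0], [-1.5, -2.0, 0.5, 2.0]))
```

For a screening tool, sensitivity is usually the figure to watch, since a missed case costs more than an unnecessary referral.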

translated by Google Translate

Overinterpretation reveals image classification model pathologies

Brandon Carter, Siddhartha Jain, Jonas Mueller, David Gifford

Categories: Machine Learning | Computer Vision | Machine Learning (Statistics)

2020-03-19

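Image classifiers are typically scored on their test-set accuracy, but high accuracy can mask a subtle type of model failure. We find that high-scoring convolutional neural networks (CNNs) on popular benchmarks exhibit troubling pathologies that allow them to display high accuracy even in the absence of semantically salient features. When a model provides a high-confidence decision without salient supporting input features, we say the classifier has overinterpreted its input, finding too much class evidence in patterns that appear nonsensical to humans. Here, we demonstrate that neural networks trained on CIFAR-10 and ImageNet suffer from overinterpretation, and we find that models on CIFAR-10 make confident predictions even when 95% of the input image is masked and humans cannot discern salient features in the remaining pixel subsets. We introduce Batched Gradient SIS, a new method for discovering sufficient input subsets for complex datasets, and use this method to show the sufficiency of border pixels in ImageNet for training and testing. Although these patterns portend potential model fragility in real-world deployment, they are in fact valid statistical patterns of the benchmark that alone suffice to attain high test accuracy. Unlike adversarial examples, overinterpretation relies on unmodified image pixels. We find that ensembling and input dropout can each help mitigate overinterpretation.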
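
The masking experiment described above keeps a small subset of pixels and hides the rest. The block below is a minimal sketch of that masking step alone, assuming a flattened image and a per-pixel importance score supplied by some other procedure. It is not Batched Gradient SIS, and the function name, the `fill` value, and the toy scores are hypothetical.

```python
def mask_all_but_top_fraction(pixels, scores, keep_fraction=0.05, fill=0.0):
    """Keep the highest-scoring fraction of pixels and replace the rest with `fill`.

    `pixels` and `scores` are flat lists of equal length. How the scores are
    obtained is out of scope here (assumption: some attribution method).
    """
    if len(pixels) != len(scores):
        raise ValueError("pixels and scores must have the same length")
    n_keep = max(1, round(len(pixels) * keep_fraction))
    ranked = sorted(range(len(pixels)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [p if i in keep else fill for i, p in enumerate(pixels)]


# Toy 20-pixel "image": with keep_fraction=0.05 only one pixel survives.
image = [1.0] * 20
importance = list(range(20))
masked = mask_all_but_top_fraction(image, importance)
print(sum(1 for p in masked if p != 0.0), "pixel(s) kept")
```

Feeding such masked inputs to a trained classifier and recording its confidence is one way to probe for overinterpretation: a confident prediction on an input that looks meaningless to a human is the symptom the abstract describes.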


Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents

Sayar Ghosh Roy, Anshul Padhi, Risubh Jain, Manish Gupta, Vasudeva Varma

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-31

Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: https://github.com/sayarghoshroy/InfoPopularity
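
The nDCG figure quoted above measures how well a predicted ranking of sentences matches the ranking by true popularity. The block below is a minimal sketch of one common nDCG variant (linear gain, log2 discount). The abstract does not state which variant or cutoff the authors use, so that choice, the function names, and the toy scores are assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 position discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(true_scores, predicted_scores):
    """nDCG of the ranking induced by `predicted_scores` against `true_scores`."""
    order = sorted(range(len(true_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_scores[i] for i in order]
    ideal_dcg = dcg(sorted(true_scores, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy document with three sentences (illustrative popularity labels).
true_popularity = [3.0, 2.0, 1.0]
print(ndcg(true_popularity, [0.9, 0.5, 0.1]))  # ranking matches the labels
print(ndcg(true_popularity, [0.1, 0.5, 0.9]))  # ranking is reversed
```

Because nDCG depends only on the induced order, a regression model can score well even when its predicted values are poorly calibrated.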


An Investigation of Indian Native Language Phonemic Influences on L2 English Pronunciations

Shelly Jain, Priyanshi Pal, Anil Vuppala, Prasanta Ghosh, Chiranjeevi Yarra

Category: Natural Language Processing

2022-12-19

Speech systems are sensitive to accent variations. This is especially challenging in the Indian context, with an abundance of languages but a dearth of linguistic studies characterising pronunciation variations. The growing number of L2 English speakers in India reinforces the need to study accents and L1-L2 interactions. We investigate the accents of Indian English (IE) speakers and report in detail our observations, both specific and common to all regions. In particular, we observe the phonemic variations and phonotactics occurring in the speakers' native languages and apply this to their English pronunciations. We demonstrate the influence of 18 Indian languages on IE by comparing the native language pronunciations with IE pronunciations obtained jointly from existing literature studies and phonetically annotated speech of 80 speakers. Consequently, we are able to validate the intuitions of Indian language influences on IE pronunciations by justifying pronunciation rules from the perspective of Indian language phonology. We obtain a comprehensive description in terms of universal and region-specific characteristics of IE, which facilitates accent conversion and adaptation of existing ASR and TTS systems to different Indian accents.


Hybrid Quantum Generative Adversarial Networks for Molecular Simulation and Drug Discovery

Prateek Jain, Srinjoy Ganguly

Category: Machine Learning

2022-12-15

In molecular research, the simulation and design of molecules are key areas with significant implications for drug development, materials science, and other fields. Current classical computational power is inadequate for simulating anything beyond small molecules, let alone protein chains of hundreds of peptides. These experiments are therefore performed physically in the wet lab, but this takes a lot of time, and it is not possible to examine every molecule because of the size of the search space; tens of billions of dollars are spent every year on these experiments. Molecular simulation and design have lately advanced significantly through machine learning models. Deep generative models for graph-structured data provide a fresh perspective on the problem of chemical synthesis. By optimising differentiable models that produce molecular graphs directly, it is feasible to avoid costly search techniques in the discrete and vast space of chemical structures. However, these models also suffer from computational limitations when the dimensionality becomes large, and they consume huge amounts of resources. In recent years, quantum generative machine learning has shown empirical results that promise significant advantages over classical counterparts.


Child PalmID: Contactless Palmprint Recognition

Anil K. Jain, Akash Godbole, Anjoo Bhatnagar, Prem Sewak Sudhish

Category: Computer Vision

2022-12-14

Developing and least developed countries face the dire challenge of ensuring that each child in their country receives the required doses of vaccination, adequate nutrition, and proper medication. International agencies such as UNICEF, WHO, and WFP, among other organizations, strive to find innovative solutions for determining which children have received these benefits and which have not. Biometric recognition systems have been sought out to help solve this problem. To that end, this report establishes a baseline accuracy of a commercial contactless palmprint recognition system that may be deployed for recognizing children aged one to five years. On a database of contactless palmprint images of one thousand unique palms from 500 children, we establish state-of-the-art authentication accuracy of 90.85% @ FAR of 0.01%, rank-1 identification accuracy of 99.0% (closed set), and FPIR=0.01 @ FNIR=0.3 for open-set identification using the PalmMobile SDK from Armatura.
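
An authentication accuracy reported "@ FAR" is the true accept rate at the decision threshold where the false accept rate meets a target. The block below is a minimal sketch of that computation on similarity scores, assuming higher scores mean a better match. It does not use the PalmMobile SDK, and the function name and toy scores are hypothetical.

```python
def true_accept_rate_at_far(genuine, impostor, target_far):
    """Return (threshold, TAR) at the lowest threshold whose FAR <= target_far.

    `genuine` holds scores for same-palm comparisons and `impostor` holds
    scores for different-palm comparisons. A comparison is accepted when its
    score is greater than or equal to the threshold.
    """
    for threshold in sorted(set(genuine) | set(impostor)):
        far = sum(s >= threshold for s in impostor) / len(impostor)
        if far <= target_far:
            tar = sum(s >= threshold for s in genuine) / len(genuine)
            return threshold, tar
    # No observed score is strict enough; reject everything.
    return float("inf"), 0.0


# Toy scores, purely illustrative.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
print(true_accept_rate_at_far(genuine_scores, impostor_scores, target_far=0.0))
print(true_accept_rate_at_far(genuine_scores, impostor_scores, target_far=0.25))
```

Estimating a FAR as low as 0.01% reliably requires many impostor comparisons, which is one reason large databases matter for this kind of benchmark.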


Selective classification using a robust meta-learning approach

Nishant Jain, Pradeep Shenoy

Category: Machine Learning

2022-12-12

Selective classification involves identifying the subset of test samples that a model can classify with high accuracy, and is important for applications such as automated medical diagnosis. We argue that this capability of identifying uncertain samples is valuable for training classifiers as well, with the aim of building more accurate classifiers. We unify these dual roles by training a single auxiliary meta-network to output an importance weight as a function of the instance. This measure is used at train time to reweight training data, and at test time to rank test instances for selective classification. A second, key component of our proposal is the meta-objective of minimizing dropout variance (the variance of classifier output when subjected to random weight dropout) for training the meta-network. We train the classifier together with its meta-network using a nested objective of minimizing classifier loss on training data and meta-loss on a separate meta-training dataset. We outperform the current state of the art on selective classification by substantial margins: for instance, by up to 1.9% AUC and 2% accuracy on a real-world diabetic retinopathy dataset. Finally, our meta-learning framework extends naturally to unsupervised domain adaptation, given our unsupervised variance minimization meta-objective. We show cumulative absolute gains of 3.4% and 3.3% in accuracy and AUC, respectively, over the other baselines in domain shift settings on the Retinopathy dataset using unsupervised domain adaptation.
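
Two quantities named above can be illustrated in a few lines: dropout variance (the variance of a model's output under random weight dropout) and accuracy at a given coverage when test instances are ranked by an uncertainty score. The block below is a minimal sketch on a toy linear model, assuming plain (unscaled) dropout. It does not implement the meta-network or the nested objective, and all names and values are hypothetical.

```python
import random
from statistics import pvariance


def dropout_variance(weights, x, drop_prob=0.5, n_passes=50, seed=0):
    """Variance of a linear model's scalar output under random weight dropout."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_passes):
        # Each weight is independently zeroed with probability drop_prob.
        kept = [w if rng.random() >= drop_prob else 0.0 for w in weights]
        outputs.append(sum(w * xi for w, xi in zip(kept, x)))
    return pvariance(outputs)


def accuracy_at_coverage(correct, uncertainty, coverage):
    """Accuracy on the `coverage` fraction of instances with lowest uncertainty."""
    n_keep = max(1, int(len(correct) * coverage))
    order = sorted(range(len(correct)), key=lambda i: uncertainty[i])
    return sum(correct[i] for i in order[:n_keep]) / n_keep


# Toy example: 1 = classified correctly, 0 = misclassified.
correct = [1, 1, 0, 0]
uncertainty = [0.1, 0.2, 0.8, 0.9]
print(accuracy_at_coverage(correct, uncertainty, coverage=0.5))
print(accuracy_at_coverage(correct, uncertainty, coverage=1.0))
print(dropout_variance([1.0, -2.0, 0.5], [1.0, 1.0, 1.0]))
```

A useful uncertainty score makes accuracy rise as coverage shrinks, which is the behaviour selective classification relies on.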


Learning on non-stationary data with re-weighting

Nishant Jain, Pradeep Shenoy

Category: Machine Learning

2022-12-12

Many real-world learning scenarios face the challenge of slow concept drift, where data distributions change gradually over time. In this setting, we pose the problem of learning temporally sensitive importance weights for training data, in order to optimize predictive accuracy. We propose a class of temporal reweighting functions that can capture multiple timescales of change in the data, as well as instance-specific characteristics. We formulate a bi-level optimization criterion, and an associated meta-learning algorithm, by which these weights can be learned. In particular, our formulation trains an auxiliary network to output weights as a function of training instances, thereby compactly representing the instance weights. We validate our temporal reweighting scheme on a large real-world dataset of 39M images spread over a 9-year period. Our extensive experiments demonstrate the necessity of instance-based temporal reweighting in the dataset and show significant improvements over classical batch-learning approaches. Further, our proposal easily generalizes to a streaming setting and shows significant gains compared to recent continual learning methods.
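
One simple way to picture a reweighting function with multiple timescales is a mixture of exponential decays over the age of each training example. The block below is a minimal sketch under that assumption. The abstract does not give the authors' functional form, and the learned, instance-specific part produced by the auxiliary network is omitted. The names `mixture` and `timescales` are hypothetical.

```python
import math


def temporal_weight(age, mixture, timescales):
    """Weight of an example of a given age: sum_k a_k * exp(-age / tau_k)."""
    return sum(a * math.exp(-age / tau) for a, tau in zip(mixture, timescales))


def weighted_loss(losses, ages, mixture, timescales):
    """Average of per-example losses, weighted by temporal importance."""
    weights = [temporal_weight(age, mixture, timescales) for age in ages]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Two timescales: a fast decay (tau = 1) and a slow decay (tau = 10), in years.
mixture = [0.5, 0.5]
timescales = [1.0, 10.0]
for age in (0.0, 1.0, 5.0):
    print(age, temporal_weight(age, mixture, timescales))
print(weighted_loss([0.2, 1.0], [0.0, 5.0], mixture, timescales))
```

In a bi-level setup such as the one described, the mixture coefficients and timescales would be tuned against held-out recent data rather than fixed by hand.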


Structured information extraction from complex scientific text with fine-tuned large language models

Alexander Dunn, John Dagdelen, Nicholas Walker, Sanghoon Lee, Andrew S. Rosen, Gerbrand Ceder, Kristin Persson, Anubhav Jain

Category: Natural Language Processing

2022-12-10

Intelligently extracting and linking complex scientific information from unstructured text is a challenging endeavor, particularly for those inexperienced with natural language processing. Here, we present a simple sequence-to-sequence approach to joint named entity recognition and relation extraction for complex hierarchical information in scientific text. The approach leverages a pre-trained large language model (LLM), GPT-3, that is fine-tuned on approximately 500 pairs of prompts (inputs) and completions (outputs). Information is extracted either from single sentences or across sentences in abstracts/passages, and the output can be returned as simple English sentences or in a more structured format, such as a list of JSON objects. We demonstrate that LLMs trained in this way are capable of accurately extracting useful records of complex scientific knowledge for three representative tasks in materials chemistry: linking dopants with their host materials, cataloging metal-organic frameworks, and general chemistry/phase/morphology/application information extraction. This approach represents a simple, accessible, and highly flexible route to obtaining large databases of structured knowledge extracted from unstructured text. An online demo is available at http://www.matscholar.com/info-extraction.
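
A prompt/completion pair whose completion is a list of JSON objects might look like the sketch below, together with a defensive parser for model output. The example sentence, the `host`/`dopants` schema, and the `###` and `END` markers are all hypothetical illustrations rather than the authors' actual format.

```python
import json

# One hypothetical fine-tuning example: free text in, structured records out.
training_pair = {
    "prompt": "ZnO films were doped with Al to improve conductivity.\n\n###\n\n",
    "completion": ' [{"host": "ZnO", "dopants": ["Al"]}] END',
}


def parse_completion(text, stop="END"):
    """Parse a completion into a list of records; return [] if it is not valid JSON."""
    body = text.split(stop)[0].strip()
    try:
        records = json.loads(body)
    except json.JSONDecodeError:
        return []
    return records if isinstance(records, list) else [records]


print(parse_completion(training_pair["completion"]))
print(parse_completion("The model produced prose instead of JSON."))
```

Returning an empty list on malformed output keeps a large extraction run going; in practice one would also log such failures for later inspection.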


Task-Directed Exploration in Continuous POMDPs for Robotic Manipulation of Articulated Objects

Aidan Curtis, Leslie Kaelbling, Siddarth Jain

Category: Robotics

2022-12-08

Representing and reasoning about uncertainty is crucial for autonomous agents acting in partially observable environments with noisy sensors. Partially observable Markov decision processes (POMDPs) serve as a general framework for representing problems in which uncertainty is an important factor. Online sample-based POMDP methods have emerged as efficient approaches to solving large POMDPs and have been shown to extend to continuous domains. However, these solutions struggle to find long-horizon plans in problems with significant uncertainty. Exploration heuristics can help guide planning, but many real-world settings contain significant task-irrelevant uncertainty that might distract from the task objective. In this paper, we propose STRUG, an online POMDP solver capable of handling domains that require long-horizon planning with significant task-relevant and task-irrelevant uncertainty. We demonstrate our solution on several temporally extended versions of toy POMDP problems as well as robotic manipulation of articulated objects using a neural perception frontend to construct a distribution of possible models. Our results show that STRUG outperforms the current sample-based online POMDP solvers on several tasks.
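
At the core of any POMDP method is the belief update, which revises a distribution over hidden states after an action and an observation. The block below is a minimal sketch of the textbook discrete Bayes update, assuming a small finite state space. It is not STRUG and not a sample-based continuous solver, and the drawer scenario and sensor probabilities are hypothetical.

```python
def belief_update(belief, action, observation, transition, observation_model):
    """Discrete Bayes filter step: predict with the action, correct with the observation.

    transition(s, a, s2) returns P(s2 | s, a).
    observation_model(s2, a, o) returns P(o | s2, a).
    """
    states = list(belief)
    predicted = {
        s2: sum(belief[s] * transition(s, action, s2) for s in states)
        for s2 in states
    }
    unnormalized = {
        s2: observation_model(s2, action, observation) * p
        for s2, p in predicted.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical example: is a drawer unlocked? Looking does not change the state.
def stay_put(s, a, s2):
    return 1.0 if s == s2 else 0.0


def noisy_sensor(s2, a, o):
    return 0.8 if o == s2 else 0.2  # the sensor is right 80% of the time


prior = {"unlocked": 0.5, "locked": 0.5}
print(belief_update(prior, "look", "unlocked", stay_put, noisy_sensor))
```

Sample-based online solvers approximate this same update with particles when the state space is continuous or too large to enumerate.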
