Smart Paper Notes

A Cross Validation framework for Signal Denoising with Applications to Trend Filtering, Dyadic CART and Beyond

Anamitra Chaudhuri , Sabyasachi Chatterjee

Categories: Machine Learning (Statistics)

2022-01-07

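This paper formulates a general cross-validation framework for signal denoising. The general framework is then applied to nonparametric regression methods such as Trend Filtering and Dyadic CART. The resulting cross-validated versions are shown to attain nearly the same rates of convergence as are known for their optimally tuned analogues. There was no previous theoretical analysis of cross-validated versions of Trend Filtering or Dyadic CART. To illustrate the generality of the framework, we also propose and study cross-validated versions of two fundamental estimators: the lasso for high-dimensional linear regression and singular value thresholding for matrix estimation. Our general framework is inspired by the ideas of Chatterjee and Jafarov (2015) and is potentially applicable to a wide range of estimation methods that use tuning parameters.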
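
To make the idea of choosing a tuning parameter by cross-validation concrete, here is a minimal sketch. It is not the paper's framework or any of its estimators: the local-mean smoother, the even/odd fold split, the candidate grid, and all function names are assumptions chosen only for illustration.

```python
import random


def local_mean(xs, ys, x0, h):
    """Average the ys whose positions xs lie within distance h of x0."""
    near = [y for x, y in zip(xs, ys) if abs(x - x0) <= h]
    # Fall back to the global mean if no training point is close enough.
    return sum(near) / len(near) if near else sum(ys) / len(ys)


def cv_error(y, h, n_folds=2):
    """Mean held-out squared error of the local-mean smoother with half-width h."""
    n = len(y)
    total = 0.0
    for fold in range(n_folds):
        # Interleaved folds: index i is held out when i % n_folds == fold.
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        train_y = [y[i] for i in train]
        for i in test:
            total += (y[i] - local_mean(train, train_y, i, h)) ** 2
    return total / n


def choose_bandwidth(y, candidates):
    """Pick the candidate half-width with the smallest cross-validated error."""
    return min(candidates, key=lambda h: cv_error(y, h))


# Hypothetical data: a piecewise-constant signal observed with Gaussian noise.
random.seed(0)
truth = [0.0] * 50 + [3.0] * 50
noisy = [t + random.gauss(0.0, 1.0) for t in truth]
candidates = [1, 2, 4, 8, 16, 32]
best_h = choose_bandwidth(noisy, candidates)
print("selected half-width:", best_h)
```

The selected half-width depends on the noise draw; the point is only that the tuning parameter is chosen from held-out prediction error rather than fixed in advance.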

Translated by Google Translate

End-to-end AI Framework for Hyperparameter Optimization, Model Training, and Interpretable Inference for Molecules and Crystals

Hyun Park , Ruijie Zhu , E. A. Huerta , Santanu Chaudhuri , Emad Tajkhorshid , Donny Cooper

Categories: Artificial Intelligence | Machine Learning

2022-12-21

We introduce an end-to-end computational framework that enables hyperparameter optimization with the DeepHyper library, accelerated training, and interpretable AI inference with a suite of state-of-the-art AI models, including CGCNN, PhysNet, SchNet, MPNN, MPNN-transformer, and TorchMD-Net. We use these AI models and the benchmark QM9, hMOF, and MD17 datasets to showcase the prediction of user-specified materials properties in modern computing environments, and to demonstrate translational applications for the modeling of small molecules, crystals and metal organic frameworks with a unified, stand-alone framework. We deployed and tested this framework in the ThetaGPU supercomputer at the Argonne Leadership Computing Facility, and the Delta supercomputer at the National Center for Supercomputing Applications to provide researchers with modern tools to conduct accelerated AI-driven discovery in leadership class computing environments.

Learning Inter-Annual Flood Loss Risk Models From Historical Flood Insurance Claims and Extreme Rainfall Data

Joaquin Salas , Anamitra Saha , Sai Ravela

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-15

Flooding is one of the most disastrous natural hazards, responsible for substantial economic losses. A predictive model for flood-induced financial damages is useful for many applications such as climate change adaptation planning and insurance underwriting. This research assesses the predictive capability of regressors constructed on the National Flood Insurance Program (NFIP) dataset using neural networks (Conditional Generative Adversarial Networks), decision trees (Extreme Gradient Boosting), and kernel-based regressors (Gaussian Process). The assessment highlights the most informative predictors for regression. The distribution for claims amount inference is modeled with a Burr distribution permitting the introduction of a bias correction scheme and increasing the regressor's predictive capability. Aiming to study the interaction with physical variables, we incorporate Daymet rainfall estimation to NFIP as an additional predictor. A study on the coastal counties in the eight US South-West states resulted in an $R^2=0.807$. Further analysis of 11 counties with a significant number of claims in the NFIP dataset reveals that Extreme Gradient Boosting provides the best results, that bias correction significantly improves the similarity with the reference distribution, and that the rainfall predictor strengthens the regressor performance.

Nostradamus: Weathering Worth

Alapan Chaudhuri , Zeeshan Ahmed , Ashwin Rao , Shivansh Subramanian , Shreyas Pradhan , Abhishek Mittal

Categories: Machine Learning

2022-12-08

Nostradamus, inspired by the French astrologer and reputed seer, is a detailed study exploring relations between environmental factors and changes in the stock market. In this paper, we analyze associative correlation and causation between environmental elements and stock prices based on the US financial market, global climate trends, and daily weather records to demonstrate significant relationships between climate and stock price fluctuation. Our analysis covers short and long-term rises and dips in company stock performances. Lastly, we take four natural disasters as a case study to observe their effect on the emotional state of people and their influence on the stock market.

High-Speed State Estimation in Power Systems with Extreme Unobservability Using Machine Learning

Antos Cheeramban Varghese , Hritik Shah , Behrouz Azimian , Anamitra Pal , Evangelos Farantatos , Mahendra Patel , Paul Myrda

Categories: Machine Learning

2022-12-04

Fast timescale state estimation for a large power system can be challenging if the sensors producing the measurements are few in number. This is particularly true for doing time-synchronized state estimation for a transmission system that has minimal phasor measurement unit (PMU) coverage. This paper proposes a Deep Neural network-based State Estimator (DeNSE) to overcome this extreme unobservability problem. For systems in which the existing PMU infrastructure is not able to bring the estimation errors within acceptable limits using the DeNSE, a data-driven incremental PMU placement methodology is also introduced. The practical utility of the proposed approach is demonstrated by considering topology changes, non-Gaussian measurement noise, bad data detection and correction, and large system application.

Downscaling Extreme Rainfall Using Physical-Statistical Generative Adversarial Learning

Anamitra Saha , Sai Ravela

Categories: Machine Learning

2022-12-02

Modeling the risk of extreme weather events in a changing climate is essential for developing effective adaptation and mitigation strategies. Although the available low-resolution climate models capture different scenarios, accurate risk assessment for mitigation and adaptation often demands detail that they typically cannot resolve. Here, we develop a dynamic data-driven downscaling (super-resolution) method that incorporates physics and statistics in a generative framework to learn the fine-scale spatial details of rainfall. Our method transforms coarse-resolution ($0.25^{\circ} \times 0.25^{\circ}$) climate model outputs into high-resolution ($0.01^{\circ} \times 0.01^{\circ}$) rainfall fields while efficaciously quantifying uncertainty. Results indicate that the downscaled rainfall fields closely match observed spatial fields and their risk distributions.

The Interpolated MVU Mechanism For Communication-efficient Private Federated Learning

Chuan Guo , Kamalika Chaudhuri , Pierre Stock , Mike Rabbat

Categories: Machine Learning

2022-11-08

We consider private federated learning (FL), where a server aggregates differentially private gradient updates from a large number of clients in order to train a machine learning model. The main challenge is balancing privacy with both the classification accuracy of the learned model and the amount of communication between the clients and the server. In this work, we build on a recently proposed method for communication-efficient private FL -- the MVU mechanism -- by introducing a new interpolation mechanism that can accommodate a more efficient privacy analysis. The result is the new Interpolated MVU mechanism that provides SOTA results on communication-efficient private FL on a variety of datasets.

Prototypical quadruplet for few-shot class incremental learning

Sanchar Palit , Biplab Banerjee , Subhasis Chaudhuri

Categories: Computer Vision | Machine Learning

2022-11-05

Many modern computer vision algorithms suffer from two major bottlenecks: scarcity of data and learning new tasks incrementally. While being trained on new batches of data, the model loses its ability to classify the previous data judiciously, which is termed catastrophic forgetting. Conventional methods have tried to mitigate catastrophic forgetting of the previously learned data, but at the cost of compromising training in the current session. The state-of-the-art generative replay based approaches use complicated structures such as a generative adversarial network (GAN) to deal with catastrophic forgetting. Additionally, training a GAN with few samples may lead to instability. In this work, we present a novel method to deal with these two major hurdles. Our method identifies a better embedding space with an improved contrasting loss to make classification more robust. Moreover, our approach is able to retain previously acquired knowledge in the embedding space even when trained with new classes. We update the class prototypes of previous sessions during training in such a way that they represent the true class means. This is of prime importance as our classification rule is based on the nearest class mean classification strategy. We demonstrate our results by showing that the embedding space remains intact after training the model with new classes. We show that our method performs better than the existing state-of-the-art algorithms in terms of accuracy across different sessions.
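
The nearest class mean rule mentioned above can be sketched as follows. This is only an illustration of the classification rule with hand-made 2-D embeddings; it does not implement the paper's quadruplet loss or its prototype-update scheme, and the function names and data are hypothetical.

```python
import math


def class_means(embeddings, labels):
    """Compute one prototype (mean embedding) per class label."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(prototypes, vec):
    """Assign vec to the class whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], vec))


# Base session: two classes with a few embedded samples each.
prototypes = class_means(
    [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0]],
    ["cat", "cat", "dog"],
)

# Incremental session: add a prototype for a new class from a single sample,
# leaving the earlier prototypes in place.
prototypes.update(class_means([[-4.0, -4.0]], ["bird"]))

print(predict(prototypes, [0.5, 0.5]))    # nearest prototype is "cat"
print(predict(prototypes, [-3.0, -3.0]))  # nearest prototype is "bird"
```

Because a prediction depends only on the stored prototypes, earlier classes stay classifiable as long as their prototypes remain accurate in the embedding space, which is why the abstract stresses keeping them close to the true class means.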

Neurosymbolic Programming for Science

Jennifer J. Sun , Megan Tjandrasuwita , Atharva Sehgal , Armando Solar-Lezama , Swarat Chaudhuri , Yisong Yue , Omar Costilla-Reyes

Categories: Artificial Intelligence

2022-10-10

Neurosymbolic Programming (NP) techniques have the potential to accelerate scientific discovery. These models combine neural and symbolic components to learn complex patterns and representations from data, using high-level concepts or known constraints. NP techniques can interface with symbolic domain knowledge from scientists, such as prior knowledge and experimental context, to produce interpretable outputs. We identify opportunities and challenges between current NP models and scientific workflows, with real-world examples from behavior analysis in science, with the goal of enabling the broad use of NP in workflows across the natural and social sciences.

Guiding Safe Exploration with Weakest Preconditions

Greg Anderson , Swarat Chaudhuri , Isil Dillig

Categories: Machine Learning

2022-09-28

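In reinforcement learning for safety-critical settings, it is often desirable for the agent to obey safety constraints at all points in time, including during training. We present a novel neurosymbolic approach called Spice to solve this safe exploration problem. Compared with existing tools, Spice uses an online shielding layer based on symbolic weakest preconditions to obtain a more precise safety analysis without unduly affecting the training process. We evaluate the approach on a suite of continuous control benchmarks and show that it can achieve performance comparable to existing safe learning techniques while incurring fewer safety violations. In addition, we present theoretical results showing that, under reasonable assumptions, Spice converges to an optimal safe policy.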
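
The general idea of an online shield can be sketched with a one-step lookahead check. This is not Spice and involves no symbolic weakest-precondition computation: the known 1-D dynamics, the safety bound, the backup action, and every name below are assumptions made purely for illustration.

```python
def shielded_action(state, proposed, backup, step, is_safe):
    """Return the proposed action if its predicted next state is safe, else the backup."""
    predicted = step(state, proposed)
    return proposed if is_safe(predicted) else backup


# Hypothetical 1-D environment: the state is a position, the action a displacement.
def step(position, velocity):
    return position + velocity


# Hypothetical safety constraint: stay inside the interval [-1, 1].
def is_safe(position):
    return abs(position) <= 1.0


position = 0.8
violations = 0
# Actions an exploring agent might propose; the backup 0.0 means "stay put",
# which is assumed to keep a safe state safe in this toy model.
for proposed in [0.5, -0.3, 0.9, 0.1]:
    action = shielded_action(position, proposed, 0.0, step, is_safe)
    position = step(position, action)
    violations += 0 if is_safe(position) else 1

print("final position:", position, "violations:", violations)
```

Here the check "is the next state safe under this action?" plays the role that a precondition on the current state plays in the abstract's description; in this toy the model is known exactly, so the check is a direct evaluation.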
