智能论文笔记

Robotics CTF (RCTF), a playground for robot hacking

Gorka Olalde Mendia , Lander Usategui San Juan , Xabier Perez Bascaran , Asier Bilbao Calvo , Alejandro Hernández Cordero , Irati Zamalloa Ugarte , Aday Muñiz Rosas , David Mayoral Vilches , Unai Ayucar Carbajo , Laura Alzola Kirschgens

分类：机器人

2018-10-01

机器人的不安全状态是在舞台上。有关于主要机器人脆弱性及其不利后果的新兴担忧。但是，机器人和网络安全域之间仍有相当大的差距。为了填补这种差距，目前的技术报告提供了机器人CTF（RCTF），一个在线游乐场，用于从任何浏览器中挑战机器人安全性。我们描述了RCTF的架构，并提供了9个方案，黑客可以挑战不同机器人设置的安全性。我们的工作使安全研究人员提供给a）本地复制虚拟机器人方案，b）将网络设置改为模拟真实机器人目标。我们倡导机器人中的黑客动力安全，并通过开放采购我们的场景贡献。

translated by 谷歌翻译

Multimodal Wildland Fire Smoke Detection

Siddhant Baldota , Shreyas Anantha Ramaprasad , Jaspreet Kaur Bhamra , Shane Luna , Ravi Ramachandra , Eugene Zen , Harrison Kim , Daniel Crawl , Ismael Perez , Ilkay Altintas

分类：计算机视觉

2022-12-29

Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the United States. These factors have in turn led to increases in the frequency, extent, and severity of wildfires in recent years. Given the danger posed by wildland fires to people, property, wildlife, and the environment, there is an urgency to provide tools for effective wildfire management. Early detection of wildfires is essential to minimizing potentially catastrophic destruction. In this paper, we present our work on integrating multiple data sources in SmokeyNet, a deep learning model using spatio-temporal information to detect smoke from wildland fires. Camera image data is integrated with weather sensor measurements and processed by SmokeyNet to create a multimodal wildland fire smoke detection system. We present our results comparing performance in terms of both accuracy and time-to-detection for multimodal data vs. a single data source. With a time-to-detection of only a few minutes, SmokeyNet can serve as an automated early notification system, providing a useful tool in the fight against destructive wildfires.

translated by 谷歌翻译

Discovering Language Model Behaviors with Model-Written Evaluations

Ethan Perez , Sam Ringer , Kamilė Lukošiūtė , Karina Nguyen , Edwin Chen , Scott Heiner , Craig Pettit , Catherine Olsson , Sandipan Kundu , Saurav Kadavath

分类：自然语言处理 | 人工智能 | 机器学习

2022-12-19

As language models (LMs) scale, they develop many novel behaviors, good and bad, exacerbating the need to evaluate how they behave. Prior work creates evaluations with crowdwork (which is time-consuming and expensive) or existing data sources (which are not always available). Here, we automatically generate evaluations with LMs. We explore approaches with varying amounts of human effort, from instructing LMs to write yes/no questions to making complex Winogender schemas with multiple stages of LM-based generation and filtering. Crowdworkers rate the examples as highly relevant and agree with 90-100% of labels, sometimes more so than corresponding human-written datasets. We generate 154 datasets and discover new cases of inverse scaling where LMs get worse with size. Larger LMs repeat back a dialog user's preferred answer ("sycophancy") and express greater desire to pursue concerning goals like resource acquisition and goal preservation. We also find some of the first examples of inverse scaling in RL from Human Feedback (RLHF), where more RLHF makes LMs worse. For example, RLHF makes LMs express stronger political views (on gun rights and immigration) and a greater desire to avoid shut down. Overall, LM-written evaluations are high-quality and let us quickly discover many novel LM behaviors.

translated by 谷歌翻译

Constitutional AI: Harmlessness from AI Feedback

Yuntao Bai , Saurav Kadavath , Sandipan Kundu , Amanda Askell , Jackson Kernion , Andy Jones , Anna Chen , Anna Goldie , Azalia Mirhoseini , Cameron McKinnon

分类：自然语言处理 | 人工智能

2022-12-15

As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs. The only human oversight is provided through a list of rules or principles, and so we refer to the method as 'Constitutional AI'. The process involves both a supervised learning and a reinforcement learning phase. In the supervised phase we sample from an initial model, then generate self-critiques and revisions, and then finetune the original model on revised responses. In the RL phase, we sample from the finetuned model, use a model to evaluate which of the two samples is better, and then train a preference model from this dataset of AI preferences. We then train with RL using the preference model as the reward signal, i.e. we use 'RL from AI Feedback' (RLAIF). As a result we are able to train a harmless but non-evasive AI assistant that engages with harmful queries by explaining its objections to them. Both the SL and RL methods can leverage chain-of-thought style reasoning to improve the human-judged performance and transparency of AI decision making. These methods make it possible to control AI behavior more precisely and with far fewer human labels.

translated by 谷歌翻译

Reinforcement Learning in System Identification

Jose Antonio Martin H. , Oscar Fernandez Vicente , Sergio Perez , Anas Belfadil , Cristina Ibanez-Llano , Freddy Jose Perozo Rondon , Jose Javier Valle , Javier Arechalde Pelaz

分类：机器学习 | 人工智能

2022-12-14

System identification, also known as learning forward models, transfer functions, system dynamics, etc., has a long tradition both in science and engineering in different fields. Particularly, it is a recurring theme in Reinforcement Learning research, where forward models approximate the state transition function of a Markov Decision Process by learning a mapping function from current state and action to the next state. This problem is commonly defined as a Supervised Learning problem in a direct way. This common approach faces several difficulties due to the inherent complexities of the dynamics to learn, for example, delayed effects, high non-linearity, non-stationarity, partial observability and, more important, error accumulation when using bootstrapped predictions (predictions based on past predictions), over large time horizons. Here we explore the use of Reinforcement Learning in this problem. We elaborate on why and how this problem fits naturally and sound as a Reinforcement Learning problem, and present some experimental results that demonstrate RL is a promising technique to solve these kind of problems.

translated by 谷歌翻译

RT-1: Robotics Transformer for Real-World Control at Scale

Anthony Brohan , Noah Brown , Justice Carbajal , Yevgen Chebotar , Joseph Dabis , Chelsea Finn , Keerthana Gopalakrishnan , Karol Hausman , Alex Herzog , Jasmine Hsu

分类：机器人 | 人工智能 | 自然语言处理 | 计算机视觉 | 机器学习

2022-12-13

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io

translated by 谷歌翻译

Targeted Adversarial Attacks on Deep Reinforcement Learning Policies via Model Checking

Dennis Gross , Thiago D. Simao , Nils Jansen , Guillermo A. Perez

分类：机器学习

2022-12-10

Deep Reinforcement Learning (RL) agents are susceptible to adversarial noise in their observations that can mislead their policies and decrease their performance. However, an adversary may be interested not only in decreasing the reward, but also in modifying specific temporal logic properties of the policy. This paper presents a metric that measures the exact impact of adversarial attacks against such properties. We use this metric to craft optimal adversarial attacks. Furthermore, we introduce a model checking method that allows us to verify the robustness of RL policies against adversarial attacks. Our empirical analysis confirms (1) the quality of our metric to craft adversarial attacks against temporal logic properties, and (2) that we are able to concisely assess a system's robustness against attacks.

translated by 谷歌翻译

Album cover art image generation with Generative Adversarial Networks

Felipe Perez Stoppa , Ester Vidaña-Vila , Joan Navarro

分类：计算机视觉 | 人工智能 | 机器学习

2022-12-09

Generative Adversarial Networks (GANs) were introduced by Goodfellow in 2014, and since then have become popular for constructing generative artificial intelligence models. However, the drawbacks of such networks are numerous, like their longer training times, their sensitivity to hyperparameter tuning, several types of loss and optimization functions and other difficulties like mode collapse. Current applications of GANs include generating photo-realistic human faces, animals and objects. However, I wanted to explore the artistic ability of GANs in more detail, by using existing models and learning from them. This dissertation covers the basics of neural networks and works its way up to the particular aspects of GANs, together with experimentation and modification of existing available models, from least complex to most. The intention is to see if state of the art GANs (specifically StyleGAN2) can generate album art covers and if it is possible to tailor them by genre. This was attempted by first familiarizing myself with 3 existing GANs architectures, including the state of the art StyleGAN2. The StyleGAN2 code was used to train a model with a dataset containing 80K album cover images, then used to style images by picking curated images and mixing their styles.

translated by 谷歌翻译

Machine-Learned Exclusion Limits without Binning

Ernesto Arganda , Andres D. Perez , Martin de los Rios , Rosa María Sandá Seoane

分类：机器学习 | (统计)机器学习

2022-11-09

Machine-Learned Likelihoods (MLL) is a method that, by combining modern machine-learning classification techniques with likelihood-based inference tests, allows to estimate the experimental sensitivity of high-dimensional data sets. We extend the MLL method by including the exclusion hypothesis tests and show that the addition of Kernel Density Estimators avoids the need to bin the classifier output in order to extract the resulting one-dimensional signal and background probability density functions. We first test our method on toy models generated with multivariate Gaussian distributions, where the true probability distribution functions are known. We then apply it to a case of interest in the search for new physics at the HL-LHC, in which a $Z^\prime$ boson decays into lepton pairs, comparing the performance of our method for estimating 95\% CL exclusion limits to the results obtained applying a binned likelihood to the machine-learning classifier output.

translated by 谷歌翻译

Measuring Progress on Scalable Oversight for Large Language Models

Samuel R. Bowman , Jeeyoon Hyun , Ethan Perez , Edwin Chen , Craig Pettit , Scott Heiner , Kamile Lukosuite , Amanda Askell , Andy Jones , Anna Chen

分类：人工智能 | 自然语言处理

2022-11-04

Developing safe and useful general-purpose AI systems will require us to make progress on scalable oversight: the problem of supervising systems that potentially outperform us on most skills relevant to the task at hand. Empirical work on this problem is not straightforward, since we do not yet have systems that broadly exceed our abilities. This paper discusses one of the major ways we think about this problem, with a focus on how to turn it into one that can be productively studied empirically. We first present an experimental design centered on choosing tasks for which human specialists succeed but unaided humans and current general AI systems fail. We then present a proof-of-concept experiment following meant to demonstrate a key feature of this experimental design and show its viability with two question-answering tasks: MMLU and time-limited QuALITY. On these tasks, we find that human participants who interact with an unreliable large-language-model dialog assistant through chat -- a trivial baseline strategy for scalable oversight -- substantially outperform both the model alone and their own unaided performance. These results are an encouraging sign that scalable oversight will be tractable to study with present models and bolster recent findings that large language models can productively assist humans with difficult tasks.

translated by 谷歌翻译