Smart Paper Notes

Keke AI Competition: Solving puzzle levels in a dynamically changing mechanic space

M Charity , Julian Togelius

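Categories: Artificial Intelligence

2022-09-11

The Keke AI Competition introduces an artificial agent competition for the game Baba is You, a Sokoban-like puzzle game in which players can create rules that influence the mechanics of the game. Altering a rule can cause temporary or permanent effects for the rest of the level, and those effects may be part of the solution space. The nature of these dynamic rules, together with the deterministic aspect of the game, creates a challenge for AI: adapting to a variety of mechanic combinations in order to solve a level. This paper describes the framework and evaluation metrics used to rank submitted agents, as well as baseline results from sample search agents.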

Translated by Google Translate

Mech-Elites: Illuminating the Mechanic Space of GVGAI

M Charity , Michael Cerny Green , Ahmed Khalifa , Julian Togelius

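Categories: Artificial Intelligence

2020-02-11

This paper introduces a fully automatic method of mechanic illumination for general video game level generation. Using the Constrained MAP-Elites algorithm and the GVG-AI framework, the system generates the simplest tile-based levels that contain specific sets of game mechanics and satisfy playability constraints. We apply this method to illuminate the mechanic space of 4 different games in GVG-AI: Zelda, Solarfox, Plants, and RealPortals.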
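
To make the illumination idea concrete, here is a minimal sketch of plain MAP-Elites on a toy problem. It is not the authors' system: the playability constraint of Constrained MAP-Elites is omitted, no GVG-AI call is involved, and the tile encoding and the names `map_elites`, `random_level`, `mutate_level`, `mechanics`, and `simplicity` are all hypothetical. Each archive cell corresponds to one set of mechanics, and within a cell the simplest level found so far is kept.

```python
import random


def map_elites(fitness, descriptor, random_solution, mutate,
               iterations=2000, seed=0):
    """Plain (unconstrained) MAP-Elites: keep the best solution per cell."""
    rng = random.Random(seed)
    archive = {}  # descriptor cell -> (score, solution)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Most of the time, mutate a randomly chosen elite.
            _, parent = rng.choice(list(archive.values()))
            candidate = mutate(parent, rng)
        else:
            # Otherwise sample a fresh random solution.
            candidate = random_solution(rng)
        cell, score = descriptor(candidate), fitness(candidate)
        # Replace the cell's elite only if the candidate scores higher.
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, candidate)
    return archive


# Toy stand-in for a level (assumption): 8 tiles, where 0 is empty and
# 1..3 is a tile that triggers mechanic 1..3.
TILES, LENGTH = (0, 1, 2, 3), 8


def random_level(rng):
    return tuple(rng.choice(TILES) for _ in range(LENGTH))


def mutate_level(level, rng):
    i = rng.randrange(LENGTH)
    return level[:i] + (rng.choice(TILES),) + level[i + 1:]


def mechanics(level):
    """Descriptor: the set of mechanics present in the level."""
    return frozenset(t for t in level if t)


def simplicity(level):
    """Fitness: fewer non-empty tiles is better."""
    return -sum(1 for t in level if t)


archive = map_elites(simplicity, mechanics, random_level, mutate_level)
for cell, (score, level) in sorted(archive.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(cell), score, level)
```

In the setting the abstract describes, the descriptor would presumably come from which mechanics fire while an agent plays the level, and infeasible (unplayable) levels would be handled separately; both are left out of this sketch.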

Monte Carlo Tree Search: A Review of Recent Modifications and Applications

Maciej Świechowski , Konrad Godlewski , Bartosz Sawicki , Jacek Mańdziuk

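Categories: Artificial Intelligence | Machine Learning

2021-03-08

Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems. The method relies on intelligent tree search that balances exploration and exploitation. MCTS performs random sampling in the form of simulations and stores statistics of actions to make more educated choices in each subsequent iteration. The method has become the state of the art for combinatorial games. However, in more complex games (e.g., those with a high branching factor or real-time ones), as well as in various practical domains (e.g., transportation, scheduling, or security), an efficient MCTS application often requires problem-dependent modification or integration with other techniques. Such domain-specific modifications and hybrid approaches are the main focus of this survey. The last major MCTS survey was published in 2012; contributions that have appeared since then are of particular interest.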
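
As an illustration of how stored action statistics can balance exploration and exploitation, the sketch below implements the UCB1 rule commonly used in the selection step of MCTS (the UCT variant). The abstract does not name a specific selection rule, so this choice is an assumption, and the function names `ucb1_score` and `select_child` and the statistics layout are hypothetical.

```python
import math


def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Mean reward (exploitation) plus a bonus for rarely tried actions."""
    if visits == 0:
        return math.inf  # untried actions are selected first
    mean = total_reward / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus


def select_child(stats, parent_visits, c=math.sqrt(2)):
    """stats maps each action to (total_reward, visits) from past simulations."""
    return max(
        stats,
        key=lambda a: ucb1_score(stats[a][0], stats[a][1], parent_visits, c),
    )


stats = {"a": (9.0, 10), "b": (1.0, 2)}
print(select_child(stats, parent_visits=12, c=0.0))  # pure exploitation
print(select_child(stats, parent_visits=12))         # with exploration bonus
```

With `c=0.0` the rule simply picks the action with the highest mean reward; with the default constant, the less-visited action receives a larger bonus and can be preferred despite a lower mean.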

Griddly: A platform for AI research in games

Chris Bamford , Shengyi Huang , Simon Lucas

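Categories: Artificial Intelligence

2020-11-12

In recent years, there have been immense breakthroughs in game AI research, particularly in reinforcement learning (RL). Despite these successes, the underlying games are usually implemented with their own preset environments and game mechanics, which makes it difficult for researchers to create different game environments. However, testing RL agents against a variety of game environments is critical to recent efforts to study generalization in RL and to avoid the overfitting that may otherwise occur. In this paper, we present Griddly as a new platform for game AI research that provides a unique combination of highly configurable games, different observer types, and an efficient C++ core engine. In addition, we present a series of baseline experiments that study the effect of different observation configurations and the generalization ability of RL agents.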

Teamwork under extreme uncertainty: AI for Pokemon ranks 33rd in the world

Nicholas R. Sarantinos

Categories: Artificial Intelligence

2022-12-27

The highest grossing media franchise of all time, with over $90 billion in total revenue, is Pokemon. The video games belong to the class of Japanese Role Playing Games (J-RPG). Developing a powerful AI agent for these games is very hard because they present big challenges to MinMax, Monte Carlo Tree Search and statistical Machine Learning, as they are vastly different from the games well explored in the AI literature. An AI agent for one of these games means significant progress in AI agents for the entire class. Further, the key principles of such work can hopefully inspire approaches to several domains that require excellent teamwork under conditions of extreme uncertainty, including managing a team of doctors, robots or employees in an ever-changing environment, like a pandemic-stricken region or a war zone. In this paper we first explain the mechanics of the game and perform a game analysis. We continue by proposing unique AI algorithms based on our understanding that the two biggest challenges in the game are keeping a balanced team and dealing with three sources of uncertainty. Later on, we describe why evaluating the performance of such agents is challenging and we present the results of our approach. Our AI agent performed significantly better than all previous attempts and peaked at 33rd place in the world, in one of the most popular battle formats, while running on only 4 single-socket servers.

A survey of monte carlo tree search methods

Categories:

Monte Carlo Tree Search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarise the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.

Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

Ian Gemp , Thomas Anthony , Yoram Bachrach , Avishkar Bhoopchand , Kalesha Bullard , Jerome Connor , Vibhavari Dasagi , Bart De Vylder , Edgar Duenez-Guzman , Romuald Elie

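Categories: Artificial Intelligence

2022-09-22

The Game Theory and Multi-Agent team at DeepMind studies several aspects of multi-agent learning, ranging from computing approximations to fundamental concepts in game theory, to simulating social dilemmas in rich spatial environments, to training 3-D humanoids in difficult team coordination tasks. A signature aim of our group is to use the resources and expertise in deep reinforcement learning made available to us at DeepMind to explore multi-agent systems in complex environments, and to use these benchmarks to advance our understanding. Here, we summarise the recent work of our team and present a taxonomy that we feel highlights many important open challenges in multi-agent research.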

Reinforcement Learning with Dual-Observation for General Video Game Playing

Chengpeng Hu , Ziqi Wang , Tianye Shu , Hao Tong , Julian Togelius , Xin Yao , Jialin Liu

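Categories: Artificial Intelligence

2020-11-11

Reinforcement learning algorithms have performed well in playing challenging board and video games. More and more studies focus on improving the generalisation ability of reinforcement learning algorithms. The General Video Game AI Learning Competition aims to develop agents capable of learning to play different game levels that were unseen during training. This paper summarises five years of the General Video Game AI Learning Competition. In each edition, three new games were designed. For each game, three test levels were generated by perturbing or combining two training levels. We then present a novel reinforcement learning framework with dual observation for general video game playing, under the assumption that similar local information is more likely to be observed across different levels than similar global information. Instead of directly taking a single raw pixel-based screenshot of the current game screen as input, the proposed framework takes the encoded, transformed global and local observations of the game screen as two simultaneous inputs, aiming to learn local information for playing new levels. The framework is implemented with three state-of-the-art reinforcement learning algorithms and tested on the game set of the 2020 General Video Game AI Learning Competition. Ablation studies show the outstanding performance of using encoded, transformed global and local observations as input. The overall best agent is further used as a baseline in the 2021 competition edition.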

Generating Real-Time Strategy Game Units Using Search-Based Procedural Content Generation and Monte Carlo Tree Search

Kynan Sorochan , Matthew Guzdial

Categories: Artificial Intelligence

2022-12-07

Real-Time Strategy (RTS) game unit generation is an unexplored area of Procedural Content Generation (PCG) research, which leaves the question of how to automatically generate interesting and balanced units unanswered. Creating unique and balanced units can be a difficult task when designing an RTS game, even for humans. Having an automated method of designing units could help developers speed up the creation process as well as find new ideas. In this work we propose a method of generating balanced and useful RTS units. We draw on Search-Based PCG and a fitness function based on Monte Carlo Tree Search (MCTS). We present ten units generated by our system designed to be used in the game microRTS, as well as results demonstrating that these units are unique, useful, and balanced.

The Arcade Learning Environment: An Evaluation Platform for General Agents

Marc G. Bellemare , Yavar Naddaf , Joel Veness , Michael Bowling

Categories:

2012-07-19

In this article we introduce the Arcade Learning Environment (ALE): both a challenge problem and a platform and methodology for evaluating the development of general, domain-independent AI technology. ALE provides an interface to hundreds of Atari 2600 game environments, each one different, interesting, and designed to be a challenge for human players. ALE presents significant research challenges for reinforcement learning, model learning, model-based planning, imitation learning, transfer learning, and intrinsic motivation. Most importantly, it provides a rigorous testbed for evaluating and comparing approaches to these problems. We illustrate the promise of ALE by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning. In doing so, we also propose an evaluation methodology made possible by ALE, reporting empirical results on over 55 different games. All of the software, including the benchmark agents, is publicly available.

AI in Games: Techniques, Challenges and Opportunities

Qiyue Yin , Jun Yang , Wancheng Ni , Bin Liang , Kaiqi Huang

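Categories: Artificial Intelligence

2021-11-15

With the breakthrough of AlphaGo, AI for human-computer games has become a very hot topic that attracts researchers all around the world, and it often serves as an effective standard for testing artificial intelligence. Various game AI systems (AIs) have been developed, such as Libratus, OpenAI Five, and AlphaStar, that beat professional human players. In this paper, we survey recent successful game AIs, covering board game AIs, card game AIs, first-person shooter game AIs, and real-time strategy game AIs. Through this survey, we 1) compare the main difficulties among different kinds of games for the field of intelligent decision making; 2) illustrate the mainstream frameworks and techniques for developing professional-level AIs; 3) raise the challenges or drawbacks of current AIs for intelligent decision making; and 4) try to propose future trends in games and intelligent decision-making techniques. Finally, we hope this brief review can provide an introduction for beginners and inspire insights for researchers in the field of AI in games.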

DareFightingICE Competition: A Fighting Game Sound Design and AI Competition

Ibrahim Khan , Thai Van Nguyen , Xincheng Dai , Ruck Thawonmas

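Categories: Artificial Intelligence

2022-03-03

This paper presents a new competition at the 2022 IEEE Conference on Games (CoG), called the DareFightingICE Competition. The competition has two tracks: a sound design track and an AI track. The game platform for this competition, a fighting game platform, is also called DareFightingICE. DareFightingICE is a sound-design-enhanced version of FightingICE, which was used in a competition at CoG until 2021 to promote artificial intelligence (AI) research in fighting games. In the sound design track, participants compete for the best sound design, given the default sound design of DareFightingICE as a sample; we define a sound design as a set of sound effects combined with the source code that implements their timing-control algorithm. Participants in the AI track are asked to develop an AI algorithm that controls a character given only sound as input (a blind AI) to fight against an opponent; we will provide a sample deep-learning blind AI. We also describe our means of maximizing the synergy between the two tracks. This competition serves to produce effective sound designs for visually impaired players, a group in the gaming community that has been mostly ignored. To the best of our knowledge, the DareFightingICE Competition is the first of its kind within and outside of CoG.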

Planning from video game descriptions

Ignacio Vellido , Carlos Núñez-Molina , Vladislav Nikolov , Juan Fdez-Olivares

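Categories: Artificial Intelligence

2021-09-01

This project proposes a methodology for automatically generating action models from descriptions of video game dynamics, together with its integration with a planning agent for executing and monitoring plans. Planners use these action models to obtain deliberative behaviour for an agent in many different video games and, combined with a reactive module, to solve deterministic and non-deterministic levels. Experimental results validate the methodology and show that the effort required of a knowledge engineer in defining such complex domains can be greatly reduced. In addition, benchmarks of the domains have been produced, which may be of interest to the international planning community for evaluating planners in international planning competitions.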

Beyond Value: CHECKLIST for Testing Inferences in Planning-Based RL

Kin-Ho Lam , Delyar Tabatabai , Jed Irvine , Donald Bertucci , Anita Ruangrotsakun , Minsuk Kahng , Alan Fern

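Categories: Artificial Intelligence | Machine Learning

2022-06-04

Reinforcement learning (RL) agents are commonly evaluated via their expected value over a distribution of test scenarios. Unfortunately, this evaluation approach provides limited evidence for post-deployment generalization beyond the test distribution. In this paper, we address this limitation by extending the recent CheckList testing methodology from natural language processing to planning-based RL. Specifically, we consider testing RL agents that make decisions via online tree search using a learned transition model and value function. The key idea is to improve the assessment of future performance via a CheckList approach for exploring and assessing the agent's inferences during tree search. The approach provides the user with an interface and a general query-rule mechanism for identifying potential inference flaws and validating expected inference invariances. We present a user study in which knowledgeable AI researchers use the approach to evaluate an agent trained to play a complex real-time strategy game. The results show that the approach is effective in allowing users to identify previously unknown flaws in the agent's reasoning. In addition, our analysis provides insight into how AI experts use this type of testing approach, which may help improve future instantiations.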

Adapting Procedural Content Generation to Player Personas Through Evolution

Pedro M. Fernandes , Jonathan Jørgensen , Niels N. T. G. Poldervaart

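Categories: Artificial Intelligence

2021-12-07

Automatically adapting game content to players opens new doors for game development. In this paper, we propose an architecture using persona agents and experience metrics that enables evolving procedurally generated levels tailored to particular player personas. Using our game "Grave Rave", we demonstrate that this approach successfully adapts to rule-based persona agents over three different experience metrics. Furthermore, the adaptation is shown to be specific in nature, meaning that the levels are persona-conscious and not simply general optimizations with regard to the selected metric.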

Hearts Gym: Learning Reinforcement Learning as a Team Event

Jan Ebert , Danimir T. Doncevic , Ramona Kloß , Stefan Kesselheim

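Categories: Artificial Intelligence | Machine Learning

2022-09-07

Amidst the COVID-19 pandemic, the authors of this paper organized a reinforcement learning (RL) course for a graduate school in the field of data science. We describe the strategy and materials for creating an exciting learning experience despite the ubiquitous Zoom fatigue, and we evaluate the course qualitatively. The key organizational features are a focus on a competitive hands-on setting in teams, supported by a minimum of lectures providing the essential background on RL. The practical part of the course revolved around Hearts Gym, an RL environment that we developed as an entry-level tutorial to RL. Participants were tasked with training agents to explore reward shaping and other RL hyperparameters. For the final evaluation, the participants' agents competed against each other.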

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Mikayel Samvelyan , Robert Kirk , Vitaly Kurin , Jack Parker-Holder , Minqi Jiang , Eric Hambro , Fabio Petroni , Heinrich Küttler , Edward Grefenstette , Tim Rocktäschel

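Categories: Machine Learning | Machine Learning (Statistics)

2021-09-27

Progress in deep reinforcement learning (RL) is driven by the availability of challenging benchmarks for training agents. However, the benchmarks widely adopted by the community are not explicitly designed for evaluating specific capabilities of RL methods. While there exist environments for assessing particular open problems in RL (such as exploration, transfer learning, unsupervised environment design, or even language-assisted RL), it is generally difficult to extend these to richer, more complex environments once research goes beyond proof-of-concept results. We present MiniHack, a powerful sandbox framework for easily designing novel RL environments. MiniHack is a one-stop shop for RL experiments, with environments ranging from small rooms to complex, procedurally generated worlds. By leveraging the full set of entities and environment dynamics from NetHack, one of the richest grid-based video games, MiniHack allows the design of custom RL testbeds that are fast and convenient to use. With this sandbox framework, novel environments can be designed easily, using either a human-readable description language or a simple Python interface. In addition to a variety of RL tasks and baselines, MiniHack can wrap existing RL benchmarks and provide ways to seamlessly add complexity.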

A Fast Evolutionary adaptation for MCTS in Pommerman

Harsh Panwar , Saswata Chatterjee , Wil Dube

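Categories: Artificial Intelligence

2021-11-26

Artificial intelligence, when combined with games, makes an ideal structure for research and for advancing the field. Multi-agent games have multiple controls for each agent, which generates huge amounts of data while increasing search complexity. Thus, we need advanced search methods to find solutions and create artificially intelligent agents. In this paper, we propose our novel Evolutionary Monte Carlo Tree Search (FEMCTS) agent, which borrows ideas from Evolutionary Algorithms (EA) and Monte Carlo Tree Search (MCTS) to play the game of Pommerman. It significantly outperforms the Rolling Horizon Evolutionary Algorithm (RHEA) in high-observability settings and performs almost as well as MCTS for most game seeds, outperforming it in some cases.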

Diversity-based Deep Reinforcement Learning Towards Multidimensional Difficulty for Fighting Game AI

Emily Halina , Matthew Guzdial

Categories: Machine Learning

2022-11-04

In fighting games, individual players of the same skill level often exhibit distinct strategies from one another through their gameplay. Despite this, the majority of AI agents for fighting games have only a single strategy for each "level" of difficulty. To make AI opponents more human-like, we'd ideally like to see multiple different strategies at each level of difficulty, a concept we refer to as "multidimensional" difficulty. In this paper, we introduce a diversity-based deep reinforcement learning approach for generating a set of agents of similar difficulty that utilize diverse strategies. We find this approach outperforms a baseline trained with specialized, human-authored reward functions in both diversity and performance.

Split Moves for Monte-Carlo Tree Search

Jakub Kowalski , Maksymilian Mika , Wojciech Pawlik , Jakub Sutowicz , Marek Szykuła , Mark H. M. Winands

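Categories: Artificial Intelligence

2021-12-14

In many games, a move consists of several decisions made by the player. These decisions can be viewed as separate moves, which is already common practice in multi-action games for efficiency reasons. Such a division of a player's move into a sequence of simpler, lower-level moves is called "splitting". So far, split moves have been applied only in the aforementioned straightforward cases, and there has been almost no study of their impact on agents' playing strength. Taking a knowledge-free perspective, we aim to answer how to use split moves effectively within Monte-Carlo Tree Search (MCTS) and what practical impact split design has on agents' strength. This paper proposes a generalization of MCTS that works with arbitrarily split moves. We design several variants of the algorithm and try to measure the impact of split moves separately on efficiency, on the quality of MCTS, on simulations, and on action-based heuristics. The tests are carried out on a set of board games using the Regular Boardgames General Game Playing formalism, in which split strategies of different granularity can be derived automatically from an abstract description of the game. The results give an overview of the behavior of agents that use split design in different ways. We conclude that split design can be greatly beneficial for single-action as well as multi-action games.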
