Smart Paper Notes

The Impact of Changes in Resolution on the Persistent Homology of Images

Teresa Heiss , Sarah Tymochko , Brittany Story , Adélie Garin , Hoa Bui , Bea Bleile , Vanessa Robins

Category: Computer Vision

2021-11-10

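Digital images enable quantitative analysis of material properties at micro and macro length scales, but choosing an appropriate resolution when acquiring an image is challenging. A high resolution means longer image acquisition and larger data requirements for a given sample, but if the resolution is too low, important information may be lost. This paper studies the impact of changes in resolution on persistent homology, a tool from topological data analysis that provides a signature of the structure in an image across all length scales. Given prior information about a function, the geometry of an object, or its density distribution at a given resolution, we provide methods for selecting the coarsest resolution that yields results within an acceptable tolerance. We present numerical case studies for an illustrative synthetic example and for samples from porous materials, where the theoretical bounds are unknown.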
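As a rough illustration of why resolution can matter for persistent homology, the sketch below computes 0-dimensional sublevel-set persistence for a 1D signal (a stand-in for one row of an image) and compares it with a naively subsampled copy. The function name `persistence_0d`, the sample values, and the every-other-sample coarsening are assumptions made for illustration; they are not taken from the paper.

```python
def persistence_0d(values):
    """Return sorted (birth, death) pairs of 0-dim sublevel-set persistence
    for a 1D signal, using union-find and the elder rule."""
    parent = {}   # union-find forest over sample indices already added
    birth = {}    # birth value of the component rooted at each index
    pairs = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Add samples from lowest to highest value.
    for i in sorted(range(len(values)), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if j not in parent:
                continue
            a, b = find(i), find(j)
            if a == b:
                continue
            # Elder rule: the component born later dies at this merge.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if birth[young] < values[i]:  # skip zero-persistence pairs
                pairs.append((birth[young], values[i]))
            parent[young] = old

    pairs.append((min(values), float("inf")))  # the component that never dies
    return sorted(pairs)


signal = [0, 3, 1, 4, 2]      # hypothetical intensity profile
coarse = signal[::2]          # naive subsampling to half the resolution

print(persistence_0d(signal))  # [(0, inf), (1, 3), (2, 4)]
print(persistence_0d(coarse))  # [(0, inf)]
```

In this toy case the two finite features of the full-resolution signal vanish after subsampling, which is the kind of information loss the abstract refers to.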

translated by Google Translate

Support vector machines and Radon's theorem

Henry Adams , Elin Farnell , Brittany Story

Category: Machine Learning

2020-11-01

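A support vector machine (SVM) is an algorithm that finds a hyperplane which optimally separates labeled data points in $\mathbb{R}^n$ into positive and negative classes. The data points on the margin of this separating hyperplane are called support vectors. We connect the possible configurations of support vectors to Radon's theorem, which provides guarantees for when a set of points can be divided into two classes (positive and negative) whose convex hulls intersect. If the convex hulls of the positive and negative support vectors are projected onto a separating hyperplane, then the projections intersect in at least one point if and only if the hyperplane is optimal. Further, with a particular type of general position, we show that (a) the projected convex hulls of the support vectors intersect in exactly one point, (b) the support vectors are stable under perturbation, (c) there are at most $n+1$ support vectors, and (d) every number of support vectors up to $n+1$ is possible. Finally, we perform computer simulations studying the expected number of support vectors, and their configurations, for randomly generated data. We observe that as the distance between classes of points increases for this type of randomly generated data, configurations with two support vectors become the most likely configurations.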
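Radon's theorem states that any $n+2$ points in $\mathbb{R}^n$ can be split into two sets whose convex hulls intersect. The sketch below finds such a split by solving $\sum_i a_i x_i = 0$, $\sum_i a_i = 0$ for a nonzero coefficient vector and grouping points by the sign of $a_i$. The helper names `null_vector` and `radon_partition` and the sample points are hypothetical; this illustrates the theorem only, not the paper's SVM analysis.

```python
from fractions import Fraction


def null_vector(rows):
    """Return a nonzero exact solution a of rows * a = 0.
    Assumes there are more columns than rows."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivots = []  # pivots[i] is the pivot column of reduced row i
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        p = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [v / m[r][c] for v in m[r]]
        for i in range(n_rows):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    # Set one free variable to 1 and back-substitute the pivot variables.
    free = next(c for c in range(n_cols) if c not in pivots)
    a = [Fraction(0)] * n_cols
    a[free] = Fraction(1)
    for i, c in enumerate(pivots):
        a[c] = -m[i][free]
    return a


def radon_partition(points):
    """Split n+2 points in R^n into two index sets whose convex hulls meet.
    Returns (positive indices, remaining indices, a common point)."""
    dim = len(points[0])
    if len(points) != dim + 2:
        raise ValueError("need exactly n + 2 points in R^n")
    rows = [[p[k] for p in points] for k in range(dim)]
    rows.append([1] * len(points))  # enforces sum(a) == 0
    a = null_vector(rows)
    pos = [i for i, v in enumerate(a) if v > 0]
    rest = [i for i, v in enumerate(a) if v <= 0]
    total = sum(a[i] for i in pos)
    point = tuple(sum(a[i] * points[i][k] for i in pos) / total
                  for k in range(dim))
    return pos, rest, point


# Corners of a square: the two diagonals cross at the centre.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(radon_partition(square))
# ([1, 3], [0, 2], (Fraction(1, 1), Fraction(1, 1)))
```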


A Comprehensive Gold Standard and Benchmark for Comics Text Detection and Recognition

Gürkan Soykan , Deniz Yuret , Tevfik Metin Sezgin

Category: Natural Language Processing | Artificial Intelligence

2022-12-27

This study focuses on improving the optical character recognition (OCR) data for panels in the COMICS dataset, the largest dataset containing text and images from comic books. To do this, we developed a pipeline for OCR processing and labeling of comic books and created the first text detection and recognition datasets for Western comics, called "COMICS Text+: Detection" and "COMICS Text+: Recognition". We evaluated the performance of state-of-the-art text detection and recognition models on these datasets and found significant improvement in word accuracy and normalized edit distance compared to the text in COMICS. We also created a new dataset called "COMICS Text+", which contains the extracted text from the textboxes in the COMICS dataset. Using the improved text data of COMICS Text+ in an existing comics processing model resulted in state-of-the-art performance on cloze-style tasks without changing the model architecture. The COMICS Text+ dataset can be a valuable resource for researchers working on tasks including text detection, recognition, and high-level processing of comics, such as narrative understanding, character relations, and story generation. All the data and inference instructions can be accessed at https://github.com/gsoykan/comics_text_plus.


Multi-Lingual DALL-E Storytime

Noga Mudrik , Adam S. Charles

Category: Natural Language Processing

2022-12-22

While recent advancements in artificial intelligence (AI) language models demonstrate cutting-edge performance when working with English texts, equivalent models do not exist in other languages or do not reach the same performance level. This undesired effect of AI advancements widens the gap in access to new technology between different populations across the world. This unsought bias mainly discriminates against individuals whose English skills are less developed, e.g., children who do not speak English. Following significant advancements in AI research in recent years, OpenAI has recently presented DALL-E: a powerful tool for creating images based on English text prompts. While DALL-E is a promising tool for many applications, its decreased performance when given input in a different language limits its audience and deepens the gap between populations. An additional limitation of the current DALL-E model is that it only allows for the creation of a few images in response to a given input prompt, rather than a series of consecutive coherent frames that tell a story or describe a process that changes over time. Here, we present an easy-to-use automatic DALL-E storytelling framework that leverages the existing DALL-E model to enable fast and coherent visualizations of non-English songs and stories, pushing the limit of the one-step-at-a-time option DALL-E currently offers. We show that our framework is able to effectively visualize stories from non-English texts and portray the changes in the plot over time. It is also able to create a narrative and maintain interpretable changes in the description across frames. Additionally, our framework offers users the ability to specify constraints on the story elements, such as a specific location or context, and to maintain a consistent style throughout the visualization.


Crowd Score: A Method for the Evaluation of Jokes using Large Language Model AI Voters as Judges

Fabricio Goes , Zisen Zhou , Piotr Sawicki , Marek Grzes , Daniel G. Brown

Category: Artificial Intelligence

2022-12-21

This paper presents the Crowd Score, a novel method to assess the funniness of jokes using large language models (LLMs) as AI judges. Our method relies on inducing different personalities into the LLM and aggregating the votes of the AI judges into a single score to rate jokes. We validate the votes using an auditing technique that uses the LLM to check whether the explanation for a particular vote is reasonable. We tested our methodology on 52 jokes in a crowd of four AI voters with different humour types: affiliative, self-enhancing, aggressive and self-defeating. Our results show that few-shot prompting leads to better results than zero-shot for the voting question. Personality induction showed that, on a set of aggressive/self-defeating jokes, aggressive and self-defeating voters are significantly more inclined to find jokes funny than affiliative and self-enhancing voters. The Crowd Score follows the same trend as human judges by assigning higher scores to jokes that are also considered funnier by human judges. We believe that our methodology could be applied to other creative domains such as stories, poetry, and slogans. It could help the adoption of a flexible and accurate standard approach to compare different work in the CC community under a common metric and, by minimizing human participation in assessing creative artefacts, could accelerate the prototyping of creative artefacts and reduce the cost of hiring human participants to rate them.
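The vote aggregation step can be pictured with a small sketch. The abstract does not give the aggregation formula, so the rule used here (the fraction of voters who found the joke funny) is an assumption, as are the function name `crowd_score` and the example votes; in the paper's setting each vote would come from an LLM prompted with one humour type.

```python
# The four humour types named in the abstract.
HUMOUR_TYPES = ("affiliative", "self-enhancing", "aggressive", "self-defeating")


def crowd_score(votes):
    """Aggregate one funny / not-funny vote per humour type into a score
    in [0, 1]. Assumed rule: the fraction of voters who voted funny."""
    missing = [t for t in HUMOUR_TYPES if t not in votes]
    if missing:
        raise ValueError(f"missing votes from: {missing}")
    return sum(bool(votes[t]) for t in HUMOUR_TYPES) / len(HUMOUR_TYPES)


# Hypothetical votes for a single joke.
votes = {
    "affiliative": True,
    "self-enhancing": True,
    "aggressive": False,
    "self-defeating": True,
}
print(crowd_score(votes))  # 0.75
```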


CoRRPUS: Detecting Story Inconsistencies via Codex-Bootstrapped Neurosymbolic Reasoning

Yijiang River Dong , Lara J. Martin , Chris Callison-Burch

Category: Natural Language Processing

2022-12-21

Story generation and understanding -- as with all NLG/NLU tasks -- has seen a surge in neurosymbolic work. Researchers have recognized that, while large language models (LLMs) have tremendous utility, they can be augmented with symbolic means to be even better and to make up for any flaws that the neural networks might have. However, symbolic methods are extremely costly in terms of the amount of time and expertise needed to create them. In this work, we capitalize on state-of-the-art Code-LLMs, such as Codex, to bootstrap the use of symbolic methods for tracking the state of stories and aiding in story understanding. We show that our CoRRPUS system and abstracted prompting procedures can beat current state-of-the-art structured LLM techniques on pre-existing story understanding tasks (bAbI task 2 and Re^3) with minimal hand engineering. We hope that this work can help highlight the importance of symbolic representations and specialized prompting for LLMs as these models require some guidance for performing reasoning tasks properly.


Little Red Riding Hood Goes Around the Globe: Crosslingual Story Planning and Generation with Large Language Models

Evgeniia Razumovskaia , Joshua Maynez , Annie Louis , Mirella Lapata , Shashi Narayan

Category: Natural Language Processing

2022-12-20

We consider the problem of automatically generating stories in multiple languages. Compared to prior work in monolingual story generation, crosslingual story generation allows for more universal research on story planning. We propose to use Prompting Large Language Models with Plans to study which plan is optimal for story generation. We consider 4 types of plans and systematically analyse how the outputs differ for different planning strategies. The study demonstrates that formulating the plans as question-answer pairs leads to more coherent generated stories while the plan gives more control to the story creators.


DOC: Improving Long Story Coherence With Detailed Outline Control

Kevin Yang , Dan Klein , Nanyun Peng , Yuandong Tian

Category: Natural Language Processing | Artificial Intelligence

2022-12-20

We propose the Detailed Outline Control (DOC) framework for improving long-range plot coherence when automatically generating several-thousand-word-long stories. DOC consists of two complementary components: a detailed outliner and a detailed controller. The detailed outliner creates a more detailed, hierarchically structured outline, shifting creative burden from the main drafting procedure to the planning stage. The detailed controller ensures the more detailed outline is still respected during generation by controlling story passages to align with outline details. In human evaluations of automatically generated stories, DOC substantially outperforms a strong Re3 baseline (Yang et al., 2022) on plot coherence (22.5% absolute gain), outline relevance (28.2%), and interestingness (20.7%). Humans also judged DOC to be much more controllable in an interactive generation setting.


Future Sight: Dynamic Story Generation with Large Pretrained Language Models

Brian D. Zimmerman , Gaurav Sahu , Olga Vechtomova

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-20

Recent advances in deep learning research, such as transformers, have bolstered the ability for automated agents to generate creative texts similar to those that a human would write. By default, transformer decoders can only generate new text with respect to previously generated text. The output distribution of candidate tokens at any position is conditioned on previously selected tokens using a self-attention mechanism to emulate the property of autoregression. This is inherently limiting for tasks such as controllable story generation where it may be necessary to condition on future plot events when writing a story. In this work, we propose Future Sight, a method for finetuning a pretrained generative transformer on the task of future conditioning. Transformer decoders are typically pretrained on the task of completing a context, one token at a time, by means of self-attention. Future Sight additionally enables a decoder to attend to an encoded future plot event. This motivates the decoder to expand on the context in a way that logically concludes with the provided future. During inference, the future plot event can be written by a human author to steer the narrative being generated in a certain direction. We evaluate the efficacy of our approach on a story generation task with human evaluators.
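The difference between ordinary autoregressive attention and attention that can also see a future event can be pictured with attention masks. The sketch below is one possible layout, assuming the encoded future event occupies extra key positions that every token may attend to. The abstract does not say how Future Sight wires this in, so the function name `attention_mask` and the slot layout are assumptions, not the authors' architecture.

```python
def attention_mask(num_tokens, num_future_slots=0):
    """Build a boolean mask: mask[i][j] is True when token i may attend
    to key column j. Columns are ordered [future slots..., tokens...]."""
    mask = []
    for i in range(num_tokens):
        future = [True] * num_future_slots             # visible to every token
        causal = [j <= i for j in range(num_tokens)]   # past and self only
        mask.append(future + causal)
    return mask


# Plain causal decoding: each token sees itself and earlier tokens.
print(attention_mask(3))
# [[True, False, False], [True, True, False], [True, True, True]]

# With one encoded future-event slot prepended as an extra key column.
print(attention_mask(2, num_future_slots=1))
# [[True, True, False], [True, True, True]]
```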


Neural Story Planning

Anbang Ye , Christopher Cui , Taiwei Shi , Mark O. Riedl

Category: Natural Language Processing | Artificial Intelligence

2022-12-16

Automated plot generation is the challenge of generating a sequence of events that will be perceived by readers as the plot of a coherent story. Traditional symbolic planners plan a story from a goal state and guarantee logical causal plot coherence but rely on a library of hand-crafted actions with their preconditions and effects. This closed-world setting limits the length and diversity of what symbolic planners can generate. On the other hand, pre-trained neural language models can generate stories with great diversity, but are generally incapable of ending a story in a specified manner and can have trouble maintaining coherence. In this paper, we present an approach to story plot generation that unifies causal planning with neural language models. We propose to use commonsense knowledge extracted from large language models to recursively expand a story plot in a backward chaining fashion. Specifically, our system infers the preconditions for events in the story and then events that will cause those conditions to become true. We performed automatic evaluation to measure narrative coherence as indicated by the ability to answer questions about whether different events in the story are causally related to other events. Results indicate that our proposed method produces more coherent plotlines than several strong baselines.
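The backward-chaining idea described above can be sketched in a few lines. In this toy version two hand-written lookup tables stand in for the language-model queries ("what must be true before this event?" and "what event would make this condition true?"). The tables, the story content, and the function name `plan_backward` are hypothetical and only illustrate the control flow, not the authors' system.

```python
from collections import deque

# Hypothetical stand-ins for language-model queries.
PRECONDITIONS = {
    "The knight slays the dragon": ["The knight has a sword",
                                    "The knight is at the cave"],
    "The knight buys a sword": ["The knight has gold"],
}
CAUSES = {
    "The knight has a sword": "The knight buys a sword",
    "The knight is at the cave": "The knight rides to the cave",
    "The knight has gold": "The knight wins a tournament",
}


def plan_backward(goal_event, max_depth=3):
    """Expand a plot backward from the goal event.
    Returns events in story order, each cause before the event it enables."""
    plot = [goal_event]
    frontier = deque([(goal_event, 0)])
    while frontier:
        event, depth = frontier.popleft()
        if depth >= max_depth:
            continue
        for condition in PRECONDITIONS.get(event, []):
            cause = CAUSES.get(condition)
            if cause is not None and cause not in plot:
                # Place the cause immediately before the event it enables.
                plot.insert(plot.index(event), cause)
                frontier.append((cause, depth + 1))
    return plot


for step in plan_backward("The knight slays the dragon"):
    print(step)
# The knight wins a tournament
# The knight buys a sword
# The knight rides to the cave
# The knight slays the dragon
```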
