Smart Paper Notes

Multi-Asset Closed-Loop Reservoir Management Using Deep Reinforcement Learning

Yusuf Nasir , Louis J. Durlofsky

Categories: Machine Learning | Artificial Intelligence

2022-07-21

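Closed-loop reservoir management (CLRM), in which history matching and production optimization are performed multiple times over the life of an asset, can provide significant improvement in the specified objective. These procedures are computationally expensive because data assimilation and optimization require a large number of flow simulations. Existing CLRM procedures are applied asset by asset, without using information that could be useful across a range of assets. Here, we develop a CLRM framework for multiple assets with varying numbers of wells. We use deep reinforcement learning to train a single global control policy that is applicable to all assets considered. The new framework is an extension of a recently introduced control policy methodology for individual assets. Embedding layers are incorporated into the representation to handle the different numbers of decision variables that arise for the different assets. Because the global control policy learns a unified representation of useful features from multiple assets, it is less expensive to construct than asset-by-asset training (we observe about a 3x speedup in our examples). The production optimization problem includes a relative-change constraint on the well settings, which makes the results suitable for practical use. We apply the multi-asset CLRM framework to 2D and 3D water-flooding examples. In both cases, four assets with different well counts, well configurations, and geostatistical descriptions are considered. Numerical experiments demonstrate that the global control policy provides objective function values, for both the 2D and 3D cases, that are nearly identical to those from control policies trained individually for each asset. This promising finding suggests that multi-asset CLRM may indeed represent a viable practical strategy.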
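
The relative-change constraint on well settings mentioned above can be illustrated with a small sketch. The abstract does not give the exact formulation, so the function name `apply_relative_change_limit`, the symmetric band, and the 10% default are assumptions made purely for illustration.

```python
def apply_relative_change_limit(previous, proposed, max_rel_change=0.1):
    """Clip each proposed well setting to a band around its previous value.

    Assumption: the limit is symmetric and proportional to the previous
    setting. The paper's actual constraint may be formulated differently.
    """
    limited = []
    for prev, new in zip(previous, proposed):
        band = abs(prev) * max_rel_change
        # Keep the new setting inside [prev - band, prev + band].
        limited.append(min(max(new, prev - band), prev + band))
    return limited
```

With a 25% limit, a proposed jump from 100 to 150 would be cut back to 125, while a move from 200 to 190 would pass through unchanged.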

Translated by Google Translate

Floods Relevancy and Identification of Location from Twitter Posts using NLP Techniques

Muhammad Suleman , Muhammad Asif , Tayyab Zamir , Ayaz Mehmood , Jebran Khan , Nasir Ahmad , Kashif Ahmad

Categories: Natural Language Processing

2023-01-01

This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related and non-relevant social posts, while LETT is a Named Entity Recognition (NER) task that aims at extracting location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, DistilBERT, and ALBERT, obtaining F1-scores of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models, namely BERT, RoBERTa, and DistilBERT, obtaining F1-scores of 0.6256, 0.6744, and 0.6723, respectively.
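
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1-score is computed from raw counts. The abstract does not say which averaging scheme (binary, micro, or macro) the task uses, so this shows only the basic single-class definition; the function name is illustrative.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall for a single class."""
    if true_positives == 0:
        # Precision and recall are both zero (or undefined); report 0.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)
```

For example, 8 true positives with 2 false positives and 2 false negatives give a precision and recall of 0.8 each, and therefore an F1-score of about 0.8.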

StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN

Hamza Pehlivan , Yusuf Dalva , Aysegul Dundar

Categories: Computer Vision

2022-12-29

We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality attribute editing. Inverting real images into StyleGAN's latent space is an extensively studied problem, yet the trade-off between the image reconstruction fidelity and image editing quality remains an open challenge. The low-rate latent spaces are limited in their expressive power for high-fidelity reconstruction. On the other hand, high-rate latent spaces result in degradation in editing quality. In this work, to achieve high-fidelity inversion, we learn residual features in higher latent codes that lower latent codes were not able to encode. This enables preserving image details in reconstruction. To achieve high-quality editing, we learn how to transform the residual features for adapting to manipulations in latent codes. We train the framework to extract residual features and transform them via a novel architecture pipeline and cycle consistency losses. We run extensive experiments and compare our method with state-of-the-art inversion methods. Qualitative metrics and visual comparisons show significant improvements. Code: https://github.com/hamzapehlivan/StyleRes

SynCLay: Interactive Synthesis of Histology Images from Bespoke Cellular Layouts

Srijay Deshpande , Muhammad Dawood , Fayyaz Minhas , Nasir Rajpoot

Categories: Computer Vision | Machine Learning

2022-12-28

Automated synthesis of histology images has several potential applications in computational pathology. However, no existing method can generate realistic tissue images with a bespoke cellular layout or user-defined histology parameters. In this work, we propose a novel framework called SynCLay (Synthesis from Cellular Layouts) that can construct realistic and high-quality histology images from user-defined cellular layouts along with annotated cellular boundaries. Tissue image generation based on bespoke cellular layouts through the proposed framework allows users to generate different histological patterns from arbitrary topological arrangements of different types of cells. SynCLay-generated synthetic images can be helpful in studying the role of different types of cells present in the tumor microenvironment. Additionally, they can assist in balancing the distribution of cellular counts in tissue images for designing accurate cellular composition predictors by minimizing the effects of data imbalance. We train SynCLay in an adversarial manner and integrate a nuclear segmentation and classification model in its training to refine nuclear structures and generate nuclear masks in conjunction with synthetic images. During inference, we combine the model with another parametric model for generating colon images and associated cellular counts as annotations given the grade of differentiation and cell densities of different cells. We assess the generated images quantitatively and report on feedback from trained pathologists who assigned realism scores to a set of images generated by the framework. The average realism score across all pathologists for synthetic images was as high as that for the real images. We also show that augmenting limited real data with the synthetic data generated by our framework can significantly boost prediction performance of the cellular composition prediction task.

VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges

Rufai Yusuf Zakari , Jim Wilson Owusu , Hailin Wang , Ke Qin , Zaharaddeen Karami Lawal , Yuezhou Dong

Categories: Computer Vision

2022-12-26

Artificial Intelligence (AI) and its applications have sparked extraordinary interest in recent years. This achievement can be ascribed in part to advances in AI subfields including Machine Learning (ML), Computer Vision (CV), and Natural Language Processing (NLP). Deep learning, a subfield of machine learning that employs artificial neural network concepts, has enabled the most rapid growth in these domains. As a result, the integration of vision and language has attracted a lot of attention. The tasks have been created in such a way that they properly exemplify the concepts of deep learning. In this review paper, we provide a thorough and extensive review of state-of-the-art approaches and key model design principles, and we discuss existing datasets, methods, problem formulations, and evaluation measures for VQA and visual reasoning tasks in order to understand vision and language representation learning. We also present some potential future directions in this field of research, in the hope that our study may generate new ideas and novel approaches to handle existing difficulties and develop new applications.

A Dataless FaceSwap Detection Approach Using Synthetic Images

Anubhav Jain , Nasir Memon , Julian Togelius

Categories: Computer Vision

2022-12-05

Face swapping technology used to create "Deepfakes" has advanced significantly over the past few years and now enables us to create realistic facial manipulations. Current deep learning algorithms to detect deepfakes have shown promising results; however, they require large amounts of training data, and, as we show, they are biased towards a particular ethnicity. We propose a deepfake detection methodology that eliminates the need for any real data by making use of synthetically generated data using StyleGAN3. This not only performs on par with the traditional training methodology of using real data, but also shows better generalization capabilities when fine-tuned with a small amount of real data. Furthermore, it reduces biases created by facial image datasets that might have sparse data from particular ethnicities.

Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization

Burcu Küçükoğlu , Walraaf Borkent , Bodo Rueckauer , Nasir Ahmad , Umut Güçlü , Marcel van Gerven

Categories: Machine Learning | Artificial Intelligence

2022-11-11

Advances in reinforcement learning (RL) often rely on massive compute resources and remain notoriously sample inefficient. In contrast, the human brain is able to efficiently learn effective control strategies using limited resources. This raises the question of whether insights from neuroscience can be used to improve current RL methods. Predictive processing is a popular theoretical framework which maintains that the human brain is actively seeking to minimize surprise. We show that recurrent neural networks which predict their own sensory states can be leveraged to minimize surprise, yielding substantial gains in cumulative reward. Specifically, we present the Predictive Processing Proximal Policy Optimization (P4O) agent, an actor-critic reinforcement learning agent that applies predictive processing to a recurrent variant of the PPO algorithm by integrating a world model in its hidden state. P4O significantly outperforms a baseline recurrent variant of the PPO algorithm on multiple Atari games using a single GPU. It also outperforms other state-of-the-art agents given the same wall-clock time and exceeds human gamer performance on multiple games including Seaquest, which is a particularly challenging environment in the Atari domain. Altogether, our work underscores how insights from the field of neuroscience may support the development of more capable and efficient artificial agents.
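
As background for the PPO algorithm that P4O builds on, here is a minimal sketch of the standard PPO clipped surrogate objective. It is not the P4O method: the predictive-processing component, the recurrent network, and the world model described above are not shown, and the function name and the default clipping value of 0.2 are illustrative assumptions.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    ratios:     new-policy / old-policy action probabilities per sample.
    advantages: advantage estimates per sample.
    """
    terms = []
    for ratio, advantage in zip(ratios, advantages):
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * advantage, clipped_ratio * advantage))
    return sum(terms) / len(terms)
```

The clipping removes the incentive to move the probability ratio far from 1 in a single update, which is what keeps each policy step conservative.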

TAP-Vid: A Benchmark for Tracking Any Point in a Video

Carl Doersch , Ankush Gupta , Larisa Markeeva , Adrià Recasens , Lucas Smaira , Yusuf Aytar , João Carreira , Andrew Zisserman , Yi Yang

Categories: Computer Vision | Machine Learning (Statistics)

2022-11-07

Generic motion understanding from video involves not only tracking objects, but also perceiving how their surfaces deform and move. This information is useful to make inferences about 3D shape, physical properties and object interactions. While the problem of tracking arbitrary physical points on surfaces over longer video clips has received some attention, no dataset or benchmark for evaluation existed, until now. In this paper, we first formalize the problem, naming it tracking any point (TAP). We introduce a companion benchmark, TAP-Vid, which is composed of both real-world videos with accurate human annotations of point tracks, and synthetic videos with perfect ground-truth point tracks. Central to the construction of our benchmark is a novel semi-automatic crowdsourced pipeline which uses optical flow estimates to compensate for easier, short-term motion like camera shake, allowing annotators to focus on harder sections of video. We validate our pipeline on synthetic data and propose a simple end-to-end point tracking model TAP-Net, showing that it outperforms all prior methods on our benchmark when trained on synthetic data.

Diversity and Novelty MasterPrints: Generating Multiple DeepMasterPrints for Increased User Coverage

M Charity , Nasir Memon , Zehua Jiang , Abhi Sen , Julian Togelius

Categories: Computer Vision

2022-09-11

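This work expands on previous advances in genetic fingerprint spoofing via DeepMasterPrints and introduces Diversity and Novelty MasterPrints. The system uses quality-diversity evolutionary algorithms to generate dictionaries of artificial prints, with a focus on increasing the coverage of users from the dataset. The Diversity MasterPrints focus on generating solution prints that match users not covered by previously found prints, while the Novelty MasterPrints explicitly search for prints that are farther away in user space than previous prints. Our multi-print search methodologies outperform the singular DeepMasterPrints in both coverage and generalization while maintaining the quality of the fingerprint image output.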
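
The notion of user coverage by a dictionary of prints can be sketched as follows. This is only an illustration of the bookkeeping, assuming each print comes with the set of user IDs it matches; the function names and data layout are hypothetical, and the evolutionary search itself is not shown.

```python
def user_coverage(matches_per_print, num_users):
    """Fraction of users matched by at least one print in the dictionary."""
    covered = set()
    for matched_users in matches_per_print:
        covered |= set(matched_users)
    return len(covered) / num_users


def newly_covered(candidate_matches, already_covered):
    """Users a candidate print would add beyond those already covered."""
    return set(candidate_matches) - set(already_covered)
```

Under this view, a diversity-oriented search would favor candidates for which `newly_covered` is large, since matching already covered users does not raise overall coverage.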

Multi-Scale Attention-based Multiple Instance Learning for Classification of Multi-Gigapixel Histology Images

Made Satria Wibawa , Kwok-Wai Lo , Lawrence Young , Nasir Rajpoot

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-09-07

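Histology images with multi-gigapixel resolution yield rich information for cancer diagnosis and prognosis. In most cases, only slide-level labels are available, because pixel-wise annotation is a labor-intensive task. In this paper, we propose a deep learning pipeline for classification in histology images. Using multiple instance learning, we attempt to predict the latent membrane protein 1 (LMP1) status of nasopharyngeal carcinoma (NPC) based on haematoxylin and eosin (H&E) stained histology images. We use an attention mechanism with residual connections for our aggregation layers. In our 3-fold cross-validation experiment, we achieved an average accuracy, AUC, and F1-score of 0.936, 0.995, and 0.862, respectively. This approach also allows us to examine the interpretability of the model by visualizing attention scores. To the best of our knowledge, this is the first attempt to predict LMP1 status in NPC using deep learning.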
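
To make the idea of attention-based aggregation in multiple instance learning concrete, here is a minimal sketch of attention pooling over a bag of patch features. It is a simplified, generic form and not the paper's architecture: the scoring function (a dot product with `tanh` of the features), the absence of the residual connection and of multi-scale inputs, and all names are assumptions for illustration.

```python
import math


def attention_pool(instances, scoring_weights):
    """Aggregate instance feature vectors into one bag vector.

    Each instance gets a scalar score, the scores are normalized with a
    softmax, and the bag vector is the attention-weighted sum of instances.
    Returns (bag_vector, attention_weights).
    """
    scores = [
        sum(w * math.tanh(x) for w, x in zip(scoring_weights, features))
        for features in instances
    ]
    # Numerically stable softmax over the instance scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    bag = [sum(a * f[d] for a, f in zip(weights, instances)) for d in range(dim)]
    return bag, weights
```

The returned attention weights are the quantities one would visualize per patch to inspect which regions of a slide drive the slide-level prediction.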
