Smart Paper Notes

Document Image Binarization in JPEG Compressed Domain using Dual Discriminator Generative Adversarial Networks

Bulla Rajesh , Manav Kamlesh Agrawal , Milan Bhuva , Kisalaya Kishore , Mohammed Javed

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-09-13

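Image binarization techniques are widely used to enhance noisy and/or degraded images for Document Image Analysis (DIA) applications such as word spotting, document retrieval, and OCR. Most existing techniques feed pixel images into convolutional neural networks to perform document binarization, which may not produce effective results when working with compressed images that need to be processed without full decompression. This paper therefore proposes document image binarization directly on JPEG compressed images, using Dual Discriminator Generative Adversarial Networks (DD-GANs). Two discriminator networks, one global and one local, work on different image ratios, and focal loss is used as the generator loss. The proposed model has been thoroughly tested on different versions of the DIBCO dataset, which presents challenges such as holes, erased or smudged ink, dust, and misplaced fibres. The model proved to be highly robust and efficient in terms of both time and space complexity, and it achieved state-of-the-art performance in the JPEG compressed domain.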
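The abstract names focal loss as the generator loss but gives no formula or hyperparameters. As a minimal sketch, the standard binary focal loss can be written per pixel as below. The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions taken from common focal-loss practice, not values reported for DD-GAN.

```python
import math

def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over paired predicted probabilities and 0/1 targets.

    gamma and alpha are illustrative defaults (assumed), not the paper's settings.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)             # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p              # probability given to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
        # (1 - p_t) ** gamma down-weights pixels that are already well classified
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)

# A confidently correct pixel contributes far less than a confidently wrong one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
```

With `gamma=0.0` and `alpha=0.5` the expression reduces to half the ordinary binary cross-entropy, which is a convenient sanity check.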

Translated by Google Translate

Beyond Information Exchange: An Approach to Deploy Network Properties for Information Diffusion

Soumita Das , Anupam Biswas , Ravi Kishore Devarapalli

Categories: Computer Vision

2022-12-21

Information diffusion in online social networks is a new and crucial problem in social network analysis and requires significant research attention. Efficient diffusion of information is of critical importance in diverse situations such as pandemic prevention, advertising, and marketing. Although several mathematical models have been developed to date, previous works lacked systematic analysis and exploration of the influence of neighborhood on information diffusion. In this paper, we propose the Common Neighborhood Strategy (CNS) algorithm for information diffusion, which demonstrates the role of common neighborhood in information propagation throughout the network. The performance of the CNS algorithm is evaluated on several real-world datasets in terms of diffusion speed and diffusion outspread and compared with several widely used information diffusion models. Empirical results show that the CNS algorithm enables better information diffusion in terms of both diffusion speed and diffusion outspread.
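The abstract does not describe the CNS procedure itself, so the following is only an illustration of the quantity it builds on: the common neighborhood of two nodes. The toy graph, the helper names `common_neighbors` and `rank_neighbors`, and the idea of ranking neighbors by overlap are all assumptions made for this sketch, not the authors' algorithm.

```python
def common_neighbors(adjacency, u, v):
    """Return the set of nodes adjacent to both u and v."""
    return adjacency[u] & adjacency[v]

def rank_neighbors(adjacency, source):
    """Rank the neighbors of `source` by how many neighbors they share with it.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    scored = [(n, len(common_neighbors(adjacency, source, n)))
              for n in adjacency[source]]
    return sorted(scored, key=lambda item: (-item[1], item[0]))

# Hypothetical undirected graph stored as node -> set of neighbors.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}

ranking = rank_neighbors(graph, "a")
```

Here node `c` shares two neighbors with `a` while `b` and `d` share one each, so `c` is ranked first.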


Latent Diffusion for Language Generation

Justin Lovelace , Varsha Kishore , Chao Wan , Eliot Shekhtman , Kilian Weinberger

Categories: Natural Language Processing | Machine Learning

2022-12-19

Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to autoregressive language generation. We instead view diffusion as a complementary method that can augment the generative capabilities of existing pre-trained language models. We demonstrate that continuous diffusion models can be learned in the latent space of a pre-trained encoder-decoder model, enabling us to sample continuous latent representations that can be decoded into natural language with the pre-trained decoder. We show that our latent diffusion models are more effective at sampling novel text from data distributions than a strong autoregressive baseline and also enable controllable generation.


Automating Vascular Shunt Insertion with the dVRK Surgical Robot

Karthik Dharmarajan , Will Panitch , Muyan Jiang , Kishore Srinivas , Baiyu Shi , Yahav Avigal , Huang Huang , Thomas Low , Danyal Fer , Ken Goldberg

Categories: Robotics

2022-11-04

Vascular shunt insertion is a fundamental surgical procedure used to temporarily restore blood flow to tissues. It is often performed in the field after major trauma. We formulate a problem of automated vascular shunt insertion and propose a pipeline to perform Automated Vascular Shunt Insertion (AVSI) using a da Vinci Research Kit. The pipeline uses a learned visual model to estimate the locus of the vessel rim, plans a grasp on the rim, and moves to grasp at that point. The first robot gripper then pulls the rim to stretch open the vessel with a dilation motion. The second robot gripper then proceeds to insert a shunt into the vessel phantom (a model of the blood vessel) with a chamfer tilt followed by a screw motion. Results suggest that AVSI achieves a high success rate even with tight tolerances and varying vessel orientations up to 30°. Supplementary material, dataset, videos, and visualizations can be found at https://sites.google.com/berkeley.edu/autolab-avsi.


Caching Contents with Varying Popularity using Restless Bandits

Pavamana K J , Chandramani Kishore Singh

Categories: Artificial Intelligence

2022-10-31

Mobile networks are experiencing a prodigious increase in data volume and user density, which places a great burden on mobile core networks and backhaul links. An efficient technique to lessen this problem is caching, i.e., bringing the data closer to the users by making use of the caches of edge network nodes, such as fixed or mobile access points and even user devices. The performance of a caching scheme depends on the contents that are cached. In this paper, we examine the problem of content caching at the wireless edge (i.e., base stations) to minimize the discounted cost incurred over an infinite horizon. We formulate this problem as a restless bandit problem, which is hard to solve. We begin by showing that an optimal policy is of threshold type. Using these structural results, we prove the indexability of the problem and use the Whittle index policy to minimize the discounted cost.


Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages

Thomas Mandl , Sandip Modha , Gautam Kishore Shahi , Hiren Madhu , Shrey Satapara , Prasenjit Majumder , Johannes Schaefer , Tharindu Ranasinghe , Marcos Zampieri , Durgesh Nandini

Categories: Natural Language Processing | Artificial Intelligence

2021-12-17

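The widespread presence of offensive content such as hate speech poses a growing societal problem. AI tools are necessary to support the moderation process on online platforms. Evaluating these identification tools requires continuous experimentation with datasets in different languages. The HASOC track (Hate Speech and Offensive Content Identification) is dedicated to developing benchmark data for this purpose. This paper presents the HASOC subtrack for English, Hindi, and Marathi. The dataset was assembled from Twitter. This subtrack has two subtasks. Task A is a binary classification problem (Hate and Not Offensive) offered for all three languages. Task B is a fine-grained classification problem with three classes, hate speech (HATE), offensive, and profane, offered for English and Hindi. Overall, participating teams submitted 652 runs. The best classification algorithms for Task A achieved scores of 0.91, 0.78, and 0.83 for Marathi, Hindi, and English, respectively. This overview presents the tasks and the data development as well as the detailed results. The systems submitted to the competition applied a variety of technologies. The best-performing algorithms were mainly variants of transformer architectures.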


Lizard: A Large-Scale Dataset for Colonic Nuclear Instance Segmentation and Classification

Simon Graham , Mostafa Jahanifar , Ayesha Azam , Mohammed Nimir , Yee-Wah Tsang , Katherine Dodd , Emily Hero , Harvir Sahota , Atisha Tank , Ksenija Benes

Categories: Computer Vision | Machine Learning

2021-08-25

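The development of deep segmentation models for computational pathology (CPath) can help foster the investigation of interpretable morphological biomarkers. However, there is a major bottleneck in the success of such approaches, because supervised deep learning models require an abundance of accurately labelled data. This issue is exacerbated in CPath because generating detailed annotations usually requires input from a pathologist who can distinguish between different tissue constructs and nuclei. Manually labelling nuclei may not be a feasible approach for collecting large-scale annotated datasets, especially when a single image region can contain thousands of different cells. However, relying solely on automatically generated annotations limits the accuracy and reliability of the ground truth. Therefore, to help overcome these challenges, we propose a multi-stage annotation pipeline that enables the collection of large-scale datasets for histology image analysis, with pathologist-in-the-loop refinement steps. Using this pipeline, we generate the largest known nuclear instance segmentation and classification dataset, containing nearly half a million labelled nuclei in H&E stained colon tissue. We have released the dataset and encourage the research community to use it to drive the development of downstream cell-based models in CPath.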


BERTScore: Evaluating Text Generation with BERT

Tianyi Zhang , Varsha Kishore , Felix Wu , Kilian Q. Weinberger , Yoav Artzi

Categories:

2019-04-21

We propose BERTSCORE, an automatic evaluation metric for text generation. Analogously to common metrics, BERTSCORE computes a similarity score for each token in the candidate sentence with each token in the reference sentence. However, instead of exact matches, we compute token similarity using contextual embeddings. We evaluate using the outputs of 363 machine translation and image captioning systems. BERTSCORE correlates better with human judgments and provides stronger model selection performance than existing metrics. Finally, we use an adversarial paraphrase detection task to show that BERTSCORE is more robust to challenging examples when compared to existing metrics.
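The token-level matching described above can be sketched as follows. This is a simplified illustration only: short hand-written vectors stand in for the contextual embeddings (an assumption, since running BERT is out of scope here), the function names are hypothetical, and refinements such as importance weighting are left out. Each token is matched to its most similar counterpart by cosine similarity, and the matches are averaged into precision, recall, and F1.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def greedy_match_scores(candidate, reference):
    """Precision, recall, and F1 from greedy token matching.

    `candidate` and `reference` are lists of token embedding vectors.
    """
    sim = [[cosine(c, r) for r in reference] for c in candidate]
    # Precision: each candidate token takes its best match in the reference.
    precision = sum(max(row) for row in sim) / len(candidate)
    # Recall: each reference token takes its best match in the candidate.
    recall = sum(
        max(sim[i][j] for i in range(len(candidate)))
        for j in range(len(reference))
    ) / len(reference)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy embeddings (assumed): the candidate covers only one of two reference tokens.
p, r, f = greedy_match_scores([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy case the single candidate token matches a reference token exactly, so precision is high, while the unmatched reference token pulls recall down.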
