Purpose: (1) To develop a deep learning algorithm to identify the major tissue structures of the optic nerve head (ONH) in 3D optical coherence tomography (OCT) scans; (2) to exploit this information to robustly discriminate among healthy, optic disc drusen (ODD), and papilledema ONHs. This was a cross-sectional comparative study including eyes with confirmed ODD, eyes with papilledema due to high intracranial pressure (51 eyes), and healthy controls (100 eyes). 3D scans of the ONH were acquired with OCT and then processed to improve deep-tissue visibility. First, a deep learning algorithm was developed using 984 B-scans (from 130 eyes) to identify the major neural/connective tissues and ODD regions. The performance of our algorithm was assessed using the Dice coefficient (DC). In a second step, a classification algorithm (random forest) was designed using 150 OCT volumes to perform a 3-class classification (1: ODD, 2: papilledema, 3: healthy) strictly from the drusen and prelaminar swelling scores derived from the segmentations. To assess performance, we report the area under the receiver operating characteristic curve (AUC) for each class. Our segmentation algorithm was able to isolate the neural and connective tissues, and ODD regions whenever present, with an average DC of 0.93 ± 0.03 on the test set, corresponding to good performance. Classification was achieved with high AUCs: 0.99 ± 0.01 for detecting ODD, 0.99 ± 0.01 for detecting papilledema, and 0.98 ± 0.02 for detecting healthy ONHs. Our AI approach can accurately discriminate ODD from papilledema using a single OCT scan. Our classification performance was excellent, although validation in a larger population is still needed. Our approach may have the potential to establish OCT as the mainstay of diagnostic imaging in neuro-ophthalmology.
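To make the second (classification) step concrete, here is a minimal sketch of a random forest that predicts ODD / papilledema / healthy from two per-eye scores and reports per-class AUCs, as the abstract describes. The feature names and the toy data are hypothetical stand-ins; the study's actual scores come from its 3D OCT segmentations.

```python
# Sketch of the 3-class random-forest step using per-eye segmentation-derived scores.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

# Each row: [drusen_score, prelaminar_swelling_score]; labels 0=ODD, 1=papilledema, 2=healthy.
rng = np.random.default_rng(0)
X = rng.random((300, 2))          # placeholder features, one row per OCT volume
y = rng.integers(0, 3, size=300)  # placeholder labels

clf = RandomForestClassifier(n_estimators=200, random_state=0)
proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")

# One-vs-rest AUC per class, mirroring the per-class AUCs reported in the abstract.
for cls, name in enumerate(["ODD", "papilledema", "healthy"]):
    print(name, roc_auc_score((y == cls).astype(int), proba[:, cls]))
```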
Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.
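As an illustration of a fine-tuned Transformer baseline in the spirit of the MAUD experiments, the sketch below frames a question-plus-clause pair as sequence classification over answer choices. The field names ("text", "label"), the model choice, and the number of labels are assumptions for illustration, not the paper's exact setup.

```python
# Hedged sketch: fine-tune a Transformer classifier on question + merger-agreement clause pairs.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)
from datasets import Dataset

train = Dataset.from_dict({
    "text": ["Question: ... [SEP] Clause: ..."],  # question concatenated with the clause text
    "label": [0],                                 # index of the annotated answer choice
})

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=5)

def encode(batch):
    return tok(batch["text"], truncation=True, max_length=512)

train = train.map(encode, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="maud-baseline",
                           per_device_train_batch_size=8,
                           num_train_epochs=3),
    train_dataset=train,
    tokenizer=tok,  # enables padding via the default data collator
)
trainer.train()
```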
We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models. When executing SparseGPT on the largest available open-source models, OPT-175B and BLOOM-176B, we can reach 60% sparsity with negligible increase in perplexity: remarkably, more than 100 billion weights from these models can be ignored at inference time. SparseGPT generalizes to semi-structured (2:4 and 4:8) patterns, and is compatible with weight quantization approaches.
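To illustrate the 2:4 semi-structured sparsity pattern the abstract mentions, the sketch below keeps the 2 largest-magnitude weights in every group of 4 and zeros the rest. This is plain magnitude pruning used only to show the pattern; it is not the Hessian-aware SparseGPT procedure itself.

```python
# Illustration of a 2:4 sparsity mask (not the SparseGPT algorithm).
import torch

def prune_2_of_4(weight: torch.Tensor) -> torch.Tensor:
    """Zero out 2 of every 4 consecutive weights along the input dimension."""
    out_features, in_features = weight.shape
    assert in_features % 4 == 0
    groups = weight.reshape(out_features, in_features // 4, 4)
    # Indices of the two smallest-magnitude weights in each group of 4.
    drop = groups.abs().topk(2, dim=-1, largest=False).indices
    mask = torch.ones_like(groups)
    mask.scatter_(-1, drop, 0.0)
    return (groups * mask).reshape(out_features, in_features)

w = torch.randn(8, 16)
w_sparse = prune_2_of_4(w)
print((w_sparse == 0).float().mean())  # ~0.5 sparsity
```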
Despite the success of large language models (LLMs) in various natural language processing (NLP) tasks, the stored knowledge in these models may inevitably be incomplete, out-of-date, or incorrect. This motivates the need to utilize external knowledge to assist LLMs. Unfortunately, current methods for incorporating external knowledge often require additional training or fine-tuning, which can be costly and may not be feasible for LLMs. To address this issue, we propose a novel post-processing approach, rethinking with retrieval (RR), which retrieves relevant external knowledge based on the decomposed reasoning steps obtained from the chain-of-thought (CoT) prompting. This lightweight approach does not require additional training or fine-tuning and is not limited by the input length of LLMs. We evaluate the effectiveness of RR through extensive experiments with GPT-3 on three complex reasoning tasks: commonsense reasoning, temporal reasoning, and tabular reasoning. Our results show that RR can produce more faithful explanations and improve the performance of LLMs.
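The sketch below shows the shape of the rethinking-with-retrieval idea as described: split a chain-of-thought answer into reasoning steps, retrieve evidence for each step, and keep or score steps by how well the evidence supports them. The `retrieve` and `supports` functions are hypothetical placeholders, not the paper's code.

```python
# Schematic sketch of retrieval over decomposed chain-of-thought steps.
from typing import List

def retrieve(query: str, k: int = 3) -> List[str]:
    """Placeholder retriever (e.g., BM25 or dense retrieval over a corpus)."""
    return ["...retrieved passage..."] * k

def supports(passages: List[str], step: str) -> float:
    """Placeholder faithfulness score of a reasoning step given evidence."""
    return sum(step.lower() in p.lower() for p in passages) / len(passages)

def rethink_with_retrieval(cot_answer: str, threshold: float = 0.0) -> List[str]:
    steps = [s.strip() for s in cot_answer.split(".") if s.strip()]
    faithful_steps = []
    for step in steps:
        evidence = retrieve(step)
        if supports(evidence, step) >= threshold:
            faithful_steps.append(step)
    return faithful_steps

print(rethink_with_retrieval("The Eiffel Tower is in Paris. Paris is in France."))
```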
Model quantization enables the deployment of deep neural networks under resource-constrained devices. Vector quantization aims at reducing the model size by indexing model weights with full-precision embeddings, i.e., codewords, while the index needs to be restored to 32-bit during computation. Binary and other low-precision quantization methods can reduce the model size up to 32$\times$, however, at the cost of a considerable accuracy drop. In this paper, we propose an efficient framework for ternary quantization to produce smaller and more accurate compressed models. By integrating hyperspherical learning, pruning and reinitialization, our proposed Hyperspherical Quantization (HQ) method reduces the cosine distance between the full-precision and ternary weights, thus reducing the bias of the straight-through gradient estimator during ternary quantization. Compared with existing work at similar compression levels ($\sim$30$\times$, $\sim$40$\times$), our method significantly improves the test accuracy and reduces the model size.
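For context on the straight-through estimator (STE) mentioned in the abstract, here is a minimal PyTorch sketch of ternary quantization where the forward pass uses weights in {-s, 0, +s} and the backward pass routes gradients to the full-precision weights unchanged. The full HQ method additionally uses hyperspherical learning, pruning, and reinitialization, which are not shown; the threshold and scale rule here are illustrative assumptions.

```python
# Minimal ternary quantization with a straight-through estimator.
import torch

class TernarySTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, threshold):
        mask_pos = w > threshold
        mask_neg = w < -threshold
        kept = w.abs()[mask_pos | mask_neg]
        scale = kept.mean() if kept.numel() > 0 else w.abs().mean()
        ternary = torch.zeros_like(w)
        ternary[mask_pos] = scale
        ternary[mask_neg] = -scale
        return ternary

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: pass the gradient to the full-precision weights unchanged.
        return grad_output, None

w = torch.randn(64, 64, requires_grad=True)
w_ternary = TernarySTE.apply(w, 0.05)
loss = (w_ternary ** 2).sum()
loss.backward()
print(w.grad.shape)
```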
Most existing pruning works are resource-intensive, requiring retraining or fine-tuning of the pruned models to recover accuracy. We propose a retraining-free pruning method based on hyperspherical learning and loss penalty terms. The proposed loss penalty term pushes some of the model weights far from zero, while the remaining weights are pushed toward zero and can be safely pruned with no retraining and a negligible accuracy drop. In addition, our proposed method can instantly recover the accuracy of a pruned model by replacing the pruned values with their mean value. Our method obtains state-of-the-art results in retraining-free pruning and is evaluated on ResNet-18/50 and MobileNetV2 on the ImageNet dataset. One can easily obtain a 50\% pruned ResNet-18 model with a 0.47\% accuracy drop. With fine-tuning, the experimental results show that our method can significantly boost the accuracy of the pruned models compared with existing works. For example, the accuracy of a 70\% pruned (except the first convolutional layer) MobileNetV2 model drops only 3.5\%, much less than the 7\% $\sim$ 10\% accuracy drop with conventional methods.
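The sketch below illustrates the "replace pruned values with their mean" recovery idea in isolation: weights below a magnitude threshold are either zeroed (standard pruning) or replaced by the mean of that pruned group. The threshold rule and layer selection are illustrative assumptions, not the paper's exact recipe, and the hyperspherical penalty term is not shown.

```python
# Hedged sketch of magnitude pruning with optional mean replacement of the pruned group.
import torch

def prune_or_mean_replace(w: torch.Tensor, sparsity: float = 0.5, mean_replace: bool = True):
    k = int(sparsity * w.numel())
    threshold = w.abs().flatten().kthvalue(k).values
    small = w.abs() <= threshold
    w_pruned = w.clone()
    if mean_replace:
        # Replace the near-zero group with its (signed) mean instead of a hard zero.
        w_pruned[small] = w[small].mean()
    else:
        w_pruned[small] = 0.0
    return w_pruned

w = torch.randn(256, 256)
print((prune_or_mean_replace(w, mean_replace=False) == 0).float().mean())  # ~0.5
```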
Most of the existing works use projection functions for ternary quantization in discrete space. Scaling factors and thresholds are used in some cases to improve the model accuracy. However, the gradients used for optimization are inaccurate and result in a notable accuracy gap between the full-precision and ternary models. To get more accurate gradients, some works gradually increase the discrete portion of the full-precision weights in the forward propagation pass, e.g., using a temperature-based Sigmoid function. Instead of directly performing ternary quantization in discrete space, we push full-precision weights close to ternary ones through a regularization term prior to ternary quantization. In addition, inspired by the temperature-based method, we introduce a re-scaling factor to obtain more accurate gradients by simulating the derivative of the Sigmoid function. The experimental results show that our method can significantly improve the accuracy of ternary quantization in both image classification and object detection tasks.
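As an illustration of "pushing full-precision weights close to ternary ones through a regularization term", the sketch below penalizes each weight's squared distance to the nearest of {-s, 0, +s}. This penalty form, the scale, and the penalty weight are illustrative assumptions; the paper's exact regularizer and re-scaling factor may differ.

```python
# Hedged sketch of a regularizer that pulls full-precision weights toward ternary levels.
import torch

def ternary_regularizer(w: torch.Tensor, scale: float) -> torch.Tensor:
    candidates = torch.tensor([-scale, 0.0, scale], device=w.device)
    # Squared distance from every weight to its nearest ternary level.
    dist = (w.unsqueeze(-1) - candidates) ** 2
    return dist.min(dim=-1).values.sum()

w = torch.randn(128, 128, requires_grad=True)
loss_task = (w ** 2).mean()                 # stand-in for the task loss
loss = loss_task + 1e-4 * ternary_regularizer(w, scale=0.7)
loss.backward()
```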
Question: Can an encoder-decoder architecture pretrained on a large dataset of longitudinal electronic health records improve patient outcome predictions? Findings: In this prognostic study of 6.8 million patients, our denoising sequence-to-sequence prediction model of multiple outcomes outperformed state-of-the-art models such as pretrained BERT on a broad range of patient outcomes, including intentional self-harm and pancreatic cancer. Meaning: Deep bidirectional and autoregressive representation improves patient outcome prediction.
As one of the most popular micro-mobility options, e-scooters are spreading in hundreds of big cities and college towns in the US and worldwide. In the meantime, e-scooters are also posing new challenges to traffic safety. In general, e-scooters are expected to be ridden in bike lanes/sidewalks or to share the road with cars at a maximum speed of about 15-20 mph, which is more flexible and much faster than pedestrians and bicyclists. These features make e-scooters challenging for human drivers, pedestrians, vehicle active safety modules, and self-driving modules to see and interact with. To study this new mobility option and address the safety concerns of e-scooter riders and other road users, this paper proposes a wearable data collection system for investigating micro-level e-scooter motion behavior in a naturalistic road environment. An e-scooter-based data acquisition system has been developed by integrating LiDAR, cameras, and GPS using the Robot Operating System (ROS). Software frameworks were developed to support hardware interfaces, sensor operation, sensor synchronization, and data saving. The integrated system can collect data continuously for hours, meeting all the requirements, including calibration accuracy and the capability to collect vehicle and e-scooter encounter data.
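For a concrete picture of the sensor-synchronization and data-saving side of such a system, here is a minimal ROS (rospy) sketch that subscribes to LiDAR, camera, and GPS topics, approximately time-aligns them, and writes the aligned messages to a bag file. The topic names, queue size, and slop value are assumptions, not the authors' actual configuration.

```python
# Hedged sketch of a ROS node that time-synchronizes LiDAR, camera, and GPS and logs to a bag.
import rospy
import rosbag
import message_filters
from sensor_msgs.msg import PointCloud2, Image, NavSatFix

bag = rosbag.Bag("escooter_run.bag", "w")

def synced_callback(cloud, image, fix):
    # All three messages arrived within the allowed time slop; log them together.
    stamp = cloud.header.stamp
    bag.write("/lidar/points", cloud, stamp)
    bag.write("/camera/image_raw", image, stamp)
    bag.write("/gps/fix", fix, stamp)

rospy.init_node("escooter_data_logger")
lidar_sub = message_filters.Subscriber("/lidar/points", PointCloud2)
cam_sub = message_filters.Subscriber("/camera/image_raw", Image)
gps_sub = message_filters.Subscriber("/gps/fix", NavSatFix)

sync = message_filters.ApproximateTimeSynchronizer(
    [lidar_sub, cam_sub, gps_sub], queue_size=30, slop=0.05)
sync.registerCallback(synced_callback)

try:
    rospy.spin()
finally:
    bag.close()
```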
We address the task of open-world class-agnostic object detection, i.e., detecting every object in an image by learning from a limited number of base object classes. State-of-the-art RGB-based models suffer from overfitting the training classes and often fail at detecting novel-looking objects. This is because RGB-based models primarily rely on appearance similarity to detect novel objects and are also prone to overfitting short-cut cues such as textures and discriminative parts. To address these shortcomings of RGB-based object detectors, we propose incorporating geometric cues such as depth and normals, predicted by general-purpose monocular estimators. Specifically, we use the geometric cues to train an object proposal network for pseudo-labeling unannotated novel objects in the training set. Our resulting Geometry-guided Open-world Object Detector (GOOD) significantly improves detection recall for novel object categories and already performs well with only a few training classes. Using a single "person" class for training on the COCO dataset, GOOD surpasses SOTA methods by 5.0% AR@100, a relative improvement of 24%.