Since the inception of deep networks, the computational resources required to train models have kept increasing. Training neural networks on large-scale datasets has become a challenging and time-consuming task, so there is a need to reduce dataset size without compromising accuracy. In this paper, we present novel variations of an earlier approach that reduces dataset size through homogeneous clustering. The proposed method is based on partitioning the dataset into homogeneous clusters and selecting the images that contribute significantly to accuracy. We propose two variants, Geometrical Homogeneous Clustering for Image Data Reduction (GHCIDR) and Merged-GHCIDR, built on the baseline algorithm Reduction through Homogeneous Clustering (RHC), to achieve better accuracy and training time. The intuition behind GHCIDR is to select data points using cluster weights and the geometrical distribution of the training set. Merged-GHCIDR merges clusters with the same label using complete-linkage clustering. We used three deep learning models: a Fully Connected Network (FCN), VGG1, and VGG16. We evaluated the two variants on four datasets: MNIST, CIFAR10, Fashion-MNIST, and Tiny-ImageNet. At the same reduction percentage as RHC, Merged-GHCIDR increased accuracy by 2.8%, 8.9%, 7.6%, and 3.5% on MNIST, Fashion-MNIST, CIFAR10, and Tiny-ImageNet, respectively.
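A minimal sketch of the cluster-merging step described above, not the authors' code: clusters that share a label are merged via complete-linkage hierarchical clustering on their centroids. The function name, the distance threshold, and the toy data are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def merge_same_label_clusters(centroids, labels, distance_threshold=1.0):
    """Group homogeneous clusters of the same class using complete linkage."""
    merged = {}
    for label in np.unique(labels):
        idx = np.where(labels == label)[0]
        if len(idx) == 1:                      # nothing to merge for this class
            merged[label] = [idx]
            continue
        # Complete-linkage hierarchical clustering over this class's centroids.
        Z = linkage(centroids[idx], method="complete")
        groups = fcluster(Z, t=distance_threshold, criterion="distance")
        merged[label] = [idx[groups == g] for g in np.unique(groups)]
    return merged  # label -> list of arrays of original cluster indices

# Example: 5 cluster centroids in 2-D, belonging to two classes.
centroids = np.array([[0., 0.], [0.1, 0.], [5., 5.], [0., 5.], [0.2, 5.]])
labels = np.array([0, 0, 0, 1, 1])
print(merge_same_label_clusters(centroids, labels))
```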
In this paper, we present novel variations of an earlier approach, the homogeneous clustering algorithm, for reducing dataset size. The intuition behind the proposed method is to partition the dataset into homogeneous clusters and select the images that contribute significantly to accuracy. The selected images form a proper subset of the training data and are therefore human-readable. We propose four variants on the baseline algorithm RHC. The intuition behind the first method is that boundary points contribute meaningfully to a cluster's representation: it selects the k farthest points and the single nearest neighbor of the cluster centroid. In the next two methods (KONCW and CWKC), we introduce the concept of cluster weights, based on the fact that larger clusters contribute more than smaller ones. The final variation is GHCIDR, which selects points based on the geometrical aspects of the data distribution. We experimented with two deep learning models, a Fully Connected Network (FCN) and VGG1, and evaluated the four variants on three datasets: MNIST, CIFAR10, and Fashion-MNIST. We found that GHCIDR achieved the best accuracies of 99.35%, 81.10%, and 91.66%, with training data reductions of 87.27%, 32.34%, and 76.80% on MNIST, CIFAR10, and Fashion-MNIST, respectively.
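A minimal sketch of the first variant's selection rule as stated above (illustrative names, not the authors' implementation): from each homogeneous cluster, keep the k points farthest from the centroid plus the single point nearest to it.

```python
import numpy as np

def select_boundary_points(cluster_points, k=3):
    """Keep the k farthest points and the one nearest point w.r.t. the cluster centroid."""
    centroid = cluster_points.mean(axis=0)
    dists = np.linalg.norm(cluster_points - centroid, axis=1)
    farthest = np.argsort(dists)[-k:]          # k boundary points
    nearest = np.argsort(dists)[:1]            # one representative near the centroid
    keep = np.unique(np.concatenate([farthest, nearest]))
    return cluster_points[keep]

# Example: reduce a toy cluster of 100 points to k + 1 representatives.
rng = np.random.default_rng(0)
cluster = rng.normal(size=(100, 2))
print(select_boundary_points(cluster, k=3).shape)
```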
The rapid development of remote sensing technologies have gained significant attention due to their ability to accurately localize, classify, and segment objects from aerial images. These technologies are commonly used in unmanned aerial vehicles (UAVs) equipped with high-resolution cameras or sensors to capture data over large areas. This data is useful for various applications, such as monitoring and inspecting cities, towns, and terrains. In this paper, we presented a method for classifying and segmenting city road traffic dashed lines from aerial images using deep learning models such as U-Net and SegNet. The annotated data is used to train these models, which are then used to classify and segment the aerial image into two classes: dashed lines and non-dashed lines. However, the deep learning model may not be able to identify all dashed lines due to poor painting or occlusion by trees or shadows. To address this issue, we proposed a method to add missed lines to the segmentation output. We also extracted the x and y coordinates of each dashed line from the segmentation output, which can be used by city planners to construct a CAD file for digital visualization of the roads.
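A hedged sketch of the coordinate-extraction step mentioned above: given a binary mask produced by the segmentation model (1 = dashed-line pixel), each connected component is treated as one dash and its centroid is reported. Function and variable names are illustrative, not the paper's pipeline.

```python
import numpy as np
from scipy import ndimage

def dash_coordinates(mask):
    """Return (x, y) centroids of connected dashed-line segments in a binary mask."""
    labeled, num = ndimage.label(mask)
    centroids = ndimage.center_of_mass(mask, labeled, range(1, num + 1))
    # center_of_mass returns (row, col); convert to (x, y) pixel coordinates.
    return [(float(c), float(r)) for r, c in centroids]

# Example: a mask with two short dashes.
mask = np.zeros((20, 40), dtype=np.uint8)
mask[10, 5:10] = 1
mask[10, 20:25] = 1
print(dash_coordinates(mask))
```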
As information extraction (IE) systems have grown more capable at whole-document extraction, the classic task of \emph{template filling} has seen renewed interest as a benchmark for evaluating them. In this position paper, we call into question the suitability of template filling for this purpose. We argue that the task demands definitive answers to thorny questions of \emph{event individuation} -- the problem of distinguishing distinct events -- about which even human experts disagree. We show through annotation studies and error analysis that this raises concerns about the usefulness of template filling evaluation metrics, the quality of datasets for the task, and the ability of models to learn it. Finally, we consider possible solutions.
Bike sharing systems often suffer from poor capacity management as a result of variable demand. These bike sharing systems would benefit from models to predict demand in order to moderate the number of bikes stored at each station. In this paper, we attempt to apply a graph neural network model to predict bike demand in the New York City, Citi Bike dataset.
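The abstract does not specify the architecture; below is a minimal, assumption-laden sketch of a graph convolution over bike stations (adjacency built from, say, trip counts) that maps recent per-station demand features to a next-step demand prediction.

```python
import torch
import torch.nn as nn

class StationGCN(nn.Module):
    def __init__(self, in_features, hidden=32):
        super().__init__()
        self.lin1 = nn.Linear(in_features, hidden)
        self.lin2 = nn.Linear(hidden, 1)

    def forward(self, x, adj):
        # Symmetrically normalize the adjacency (with self-loops).
        a = adj + torch.eye(adj.size(0))
        d = a.sum(dim=1).pow(-0.5)
        a = d.unsqueeze(1) * a * d.unsqueeze(0)
        h = torch.relu(self.lin1(a @ x))       # one round of neighbor aggregation
        return self.lin2(a @ h).squeeze(-1)    # predicted demand per station

# Example: 50 stations, 12 past time steps of demand as node features.
x = torch.randn(50, 12)
adj = (torch.rand(50, 50) > 0.9).float()
adj = ((adj + adj.t()) > 0).float()           # make the station graph undirected
print(StationGCN(12)(x, adj).shape)           # torch.Size([50])
```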
This paper proposes an easy-to-compute upper bound for the overlap index between two probability distributions without requiring any knowledge of the distribution models. The computation of our bound is time-efficient and memory-efficient and only requires finite samples. The proposed bound shows its value in one-class classification and domain shift analysis. Specifically, in one-class classification, we build a novel one-class classifier by converting the bound into a confidence score function. Unlike most one-class classifiers, the training process is not needed for our classifier. Additionally, the experimental results show that our classifier \textcolor{\colorname}{can be accurate with} only a small number of in-class samples and outperforms many state-of-the-art methods on various datasets in different one-class classification scenarios. In domain shift analysis, we propose a theorem based on our bound. The theorem is useful in detecting the existence of domain shift and inferring data information. The detection and inference processes are both computation-efficient and memory-efficient. Our work shows significant promise toward broadening the applications of overlap-based metrics.
We propose a framework in which multiple entities collaborate to build a machine learning model while preserving privacy of their data. The approach utilizes feature embeddings from shared/per-entity feature extractors transforming data into a feature space for cooperation between entities. We propose two specific methods and compare them with a baseline method. In Shared Feature Extractor (SFE) Learning, the entities use a shared feature extractor to compute feature embeddings of samples. In Locally Trained Feature Extractor (LTFE) Learning, each entity uses a separate feature extractor and models are trained using concatenated features from all entities. As a baseline, in Cooperatively Trained Feature Extractor (CTFE) Learning, the entities train models by sharing raw data. Secure multi-party algorithms are utilized to train models without revealing data or features in plain text. We investigate the trade-offs among SFE, LTFE, and CTFE in regard to performance, privacy leakage (using an off-the-shelf membership inference attack), and computational cost. LTFE provides the most privacy, followed by SFE, and then CTFE. Computational cost is lowest for SFE and the relative speed of CTFE and LTFE depends on network architecture. CTFE and LTFE provide the best accuracy. We use MNIST, a synthetic dataset, and a credit card fraud detection dataset for evaluations.
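A simplified plaintext sketch of the LTFE arrangement (the actual protocol runs under secure multi-party computation, which is omitted here): each entity keeps its own locally trained feature extractor, and a joint model consumes the concatenation of all entities' feature embeddings. Class and parameter names are illustrative.

```python
import torch
import torch.nn as nn

class EntityExtractor(nn.Module):
    """One entity's private feature extractor over its own feature columns."""
    def __init__(self, in_dim, emb_dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, emb_dim))

    def forward(self, x):
        return self.net(x)

class LTFEClassifier(nn.Module):
    """Joint model trained on concatenated per-entity embeddings."""
    def __init__(self, extractors, num_classes):
        super().__init__()
        self.extractors = nn.ModuleList(extractors)
        emb_total = sum(e.net[-1].out_features for e in extractors)
        self.head = nn.Linear(emb_total, num_classes)

    def forward(self, per_entity_inputs):
        # Each entity embeds only its own features; embeddings are concatenated.
        embeddings = [e(x) for e, x in zip(self.extractors, per_entity_inputs)]
        return self.head(torch.cat(embeddings, dim=1))

# Example: two entities holding 10 and 20 features each for the same samples.
model = LTFEClassifier([EntityExtractor(10), EntityExtractor(20)], num_classes=2)
logits = model([torch.randn(8, 10), torch.randn(8, 20)])
print(logits.shape)  # torch.Size([8, 2])
```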
Masked Language Modeling (MLM) has proven to be an essential component of Vision-Language (VL) pretraining. To implement MLM, the researcher must make two design choices: the masking strategy, which determines which tokens to mask, and the masking rate, which determines how many tokens to mask. Previous work has focused primarily on the masking strategy while setting the masking rate at a default of 15\%. In this paper, we show that increasing this masking rate improves downstream performance while simultaneously reducing performance gap among different masking strategies, rendering the uniform masking strategy competitive to other more complex ones. Surprisingly, we also discover that increasing the masking rate leads to gains in Image-Text Matching (ITM) tasks, suggesting that the role of MLM goes beyond language modeling in VL pretraining.
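A small sketch of uniform masking with a configurable rate, the knob the paper varies; the token IDs and the `[MASK]` ID below are placeholders, not tied to any specific tokenizer or to the paper's codebase.

```python
import torch

def uniform_mask(token_ids, mask_rate=0.15, mask_token_id=103):
    """Mask a fraction of tokens uniformly at random; return masked inputs and MLM labels."""
    mask = torch.rand(token_ids.shape) < mask_rate
    labels = torch.where(mask, token_ids, torch.full_like(token_ids, -100))  # -100 = ignored in the loss
    masked_inputs = torch.where(mask, torch.full_like(token_ids, mask_token_id), token_ids)
    return masked_inputs, labels

tokens = torch.randint(1000, 30000, (2, 16))
low_inputs, _ = uniform_mask(tokens, mask_rate=0.15)   # the default rate used in prior work
high_inputs, _ = uniform_mask(tokens, mask_rate=0.45)  # a higher rate of the kind the paper explores
```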
Predicting the physical interaction of proteins is a cornerstone problem in computational biology. New classes of learning-based algorithms are actively being developed, and are typically trained end-to-end on protein complex structures extracted from the Protein Data Bank. These training datasets tend to be large and difficult to use for prototyping and, unlike image or natural language datasets, they are not easily interpretable by non-experts. We present Dock2D-IP and Dock2D-IF, two "toy" datasets that can be used to select algorithms predicting protein-protein interactions$\unicode{x2014}$or any other type of molecular interactions. Using two-dimensional shapes as input, each example from Dock2D-IP ("interaction pose") describes the interaction pose of two shapes known to interact and each example from Dock2D-IF ("interaction fact") describes whether two shapes form a stable complex or not. We propose a number of baseline solutions to the problem and show that the same underlying energy function can be learned either by solving the interaction pose task (formulated as an energy-minimization "docking" problem) or the fact-of-interaction task (formulated as a binding free energy estimation problem).
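The datasets pose docking as energy minimization; below is a hedged toy sketch (translation-only, no rotation) of searching for the pose of two binary 2-D shapes that minimizes a made-up energy rewarding surface contact and penalizing overlap. It illustrates the task formulation only and is not the paper's energy function or baselines.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def docking_energy(receptor, ligand, shift, overlap_penalty=10.0):
    """Toy energy: heavy penalty for interior overlap, small reward for contact."""
    moved = np.roll(ligand, shift, axis=(0, 1))
    clash = np.logical_and(receptor, moved).sum()
    contact = np.logical_and(binary_dilation(receptor, iterations=2), moved).sum()
    return overlap_penalty * clash - contact

def best_translation(receptor, ligand, max_shift=10):
    """Brute-force search over integer shifts for the lowest-energy pose."""
    shifts = [(dy, dx) for dy in range(-max_shift, max_shift + 1)
                       for dx in range(-max_shift, max_shift + 1)]
    return min(shifts, key=lambda s: docking_energy(receptor, ligand, s))

receptor = np.zeros((32, 32), dtype=bool); receptor[12:20, 12:20] = True
ligand = np.zeros((32, 32), dtype=bool); ligand[14:18, 14:18] = True
print(best_translation(receptor, ligand))
```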
This paper presents a new approach for analyzing and identifying potentially useful generalized plans. It presents a new conceptual framework along with an algorithmic process for assessing termination and reachability related properties of generalized plans. The presented framework builds upon classic results on the analysis of graphs to decompose generalized plans into smaller components in a novel algorithm for conducting a hierarchical analysis for termination of arbitrary generalized plans. Theoretical analysis of the new framework establishes soundness of the presented algorithms and shows how it goes beyond existing approaches; empirical analysis illustrates the scope of this approach. Our analysis shows that this new approach can effectively identify termination for a significantly larger class of generalized plans than was possible using existing methods.
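A hedged illustration of the graph-decomposition idea: the control structure of a generalized plan is viewed as a directed graph and broken into strongly connected components, so each potential loop can be examined for termination separately. This sketches only the decomposition step with a hypothetical plan graph, not the paper's full termination test.

```python
import networkx as nx

# Hypothetical generalized-plan graph: nodes are plan states, edges are actions.
plan = nx.DiGraph([
    ("start", "loop_a"), ("loop_a", "loop_b"), ("loop_b", "loop_a"),  # inner cycle
    ("loop_b", "finish"),
])

# Decompose into strongly connected components; components with more than one
# node (or a self-loop) are the loops whose progress must be analyzed further.
for component in nx.strongly_connected_components(plan):
    is_loop = len(component) > 1 or any(plan.has_edge(n, n) for n in component)
    print(sorted(component), "needs termination analysis" if is_loop else "trivially terminates")
```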