translated by 谷歌翻译
We consider the contextual bandit problem on general action and context spaces, where the learner's rewards depend on their selected actions and an observable context. This generalizes the standard multi-armed bandit to the case where side information is available, e.g., patients' records or customers' history, which allows for personalized treatment. We focus on consistency -- vanishing regret compared to the optimal policy -- and show that for large classes of non-i.i.d. contexts, consistency can be achieved regardless of the time-invariant reward mechanism, a property known as universal consistency. Precisely, we first give necessary and sufficient conditions on the context-generating process for universal consistency to be possible. Second, we show that there always exists an algorithm that guarantees universal consistency whenever this is achievable, called an optimistically universal learning rule. Interestingly, for finite action spaces, learnable processes for universal learning are exactly the same as in the full-feedback setting of supervised learning, previously studied in the literature. In other words, learning can be performed with partial feedback without any generalization cost. The algorithms balance a trade-off between generalization (similar to structural risk minimization) and personalization (tailoring actions to specific contexts). Lastly, we consider the case of added continuity assumptions on rewards and show that these lead to universal consistency for significantly larger classes of data-generating processes.
Athletes routinely undergo fitness evaluations to assess their training progress. Typically, these evaluations require a trained professional who uses specialized equipment such as force plates. For the assessment, athletes perform drop and squat jumps, and key variables are measured, e.g., velocity, flight time, and time to stabilization. However, amateur athletes may not have access to professionals or equipment that can provide these assessments. Here, we investigate the feasibility of estimating key variables using video recordings. We focus on jump velocity as a starting point because it is highly correlated with other key variables and is important for determining posture and lower-limb capacity. We find that velocity can be estimated with a high degree of precision across a range of athletes, with an average R-value of 0.71 (SD = 0.06).
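As a reading aid for the reported R-value, the sketch below computes a Pearson correlation coefficient between reference and estimated velocities. It assumes the abstract's "R-value" is a Pearson correlation, which the text does not state. The function name `pearson_r` and the sample numbers are hypothetical and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, all left unnormalised (the factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical jump velocities in m/s (illustrative only, not study data).
force_plate = [2.10, 2.35, 2.50, 2.80, 3.05]
video_estimate = [2.20, 2.30, 2.65, 2.70, 3.10]
r = pearson_r(force_plate, video_estimate)
```

A value of `r` near 1 would indicate that the video-based estimates track the reference measurements closely.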
Previous work has shown that a neural network with the rectified linear unit (ReLU) activation function leads to a convex polyhedral decomposition of the input space. These decompositions can be represented by a dual graph with vertices corresponding to polyhedra and edges corresponding to polyhedra sharing a facet, which is a subgraph of a Hamming graph. This paper illustrates how one can utilize the dual graph to detect and analyze adversarial attacks in the context of digital images. When an image passes through a network containing ReLU nodes, the firing or non-firing at a node can be encoded as a bit ($1$ for ReLU activation, $0$ for ReLU non-activation). The sequence of all bit activations identifies the image with a bit vector, which identifies it with a polyhedron in the decomposition and, in turn, identifies it with a vertex in the dual graph. We identify ReLU bits that are discriminators between non-adversarial and adversarial images and examine how well collections of these discriminators can ensemble vote to build an adversarial image detector. Specifically, we examine the similarities and differences of ReLU bit vectors for adversarial images, and their non-adversarial counterparts, using a pre-trained ResNet-50 architecture. While this paper focuses on adversarial digital images, ResNet-50 architecture, and the ReLU activation function, our methods extend to other network architectures, activation functions, and types of datasets.
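The bit-vector encoding described above can be illustrated with a minimal sketch on a toy fully connected network. This is not the paper's ResNet-50 pipeline. The weights, the inputs, and the helper names `relu_bits` and `hamming` are hypothetical, chosen only to show how an input maps to a ReLU activation pattern and how two patterns can be compared.

```python
def relu_bits(x, layers):
    """Return the ReLU activation pattern of input x as a tuple of 0/1 bits.

    `layers` is a list of (weights, biases) pairs; each row of `weights`
    holds the incoming weights of one ReLU node.
    """
    bits = []
    for weights, biases in layers:
        # Pre-activation of every node in this layer.
        pre = [sum(w * v for w, v in zip(row, x)) + b
               for row, b in zip(weights, biases)]
        # 1 if the node fires (positive pre-activation), 0 otherwise.
        bits.extend(1 if p > 0 else 0 for p in pre)
        # Apply ReLU and feed the result to the next layer.
        x = [max(p, 0.0) for p in pre]
    return tuple(bits)


def hamming(a, b):
    """Number of positions at which two bit vectors differ."""
    return sum(u != v for u, v in zip(a, b))


# Toy two-layer network with hypothetical weights (3 nodes, then 2 nodes).
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, -1.0, 0.5]),
    ([[1.0, 1.0, -1.0], [-1.0, 0.5, 1.0]], [-0.5, 0.0]),
]
clean = [1.0, 0.2]      # stand-in for a non-adversarial input
perturbed = [1.0, 0.8]  # stand-in for a perturbed input

clean_bits = relu_bits(clean, layers)
perturbed_bits = relu_bits(perturbed, layers)
distance = hamming(clean_bits, perturbed_bits)
```

Each distinct bit vector corresponds to one polyhedron of the decomposition, so in this toy setting `distance` counts how many ReLU nodes change state between the two inputs.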
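Graph convolutional neural networks (GCNNs) are a popular class of deep learning (DL) models in materials science for predicting material properties from graph representations of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to effectively reducing the computational cost of GCNN training. However, efficient use of high-performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training, which leverages distributed data parallelism in PyTorch. We use ADIOS, a high-performance data management framework, to efficiently store and read large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). We present our experimental results with HydraGNN, showing (i) a reduction in data loading time of up to 4.2 times compared with a conventional method and (ii) linear scaling performance for training on up to 1,024 GPUs on both Summit and Perlmutter.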
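Typically, deep neural network (DNN) pruning methods fall into two categories: 1) weight-based deterministic constraints and 2) probabilistic frameworks. While each approach has its merits and limitations, both are plagued by a set of common practical issues, such as the trial and error needed to analyze sensitivity and tune hyperparameters when pruning DNNs. In this work, we propose a new single-shot, fully automated pruning algorithm called Slimming Neural networks using Adaptive Connectivity Scores (SNACS). Our proposed approach combines a probabilistic pruning framework with constraints on the underlying weight matrices, via a novel connectivity measure, at multiple levels, to capitalize on the strengths of both approaches while addressing their deficiencies. In SNACS, we propose a fast hash-based estimator of Adaptive Conditional Mutual Information (ACMI) that uses a weight-based scaling criterion to evaluate the connectivity between filters and prune unimportant connections. To automatically determine the limit up to which a layer can be pruned, we propose a set of operating constraints that jointly define the upper pruning percentage limits across all layers in a deep network. Finally, we define a novel sensitivity criterion for filters that measures the strength of their contributions to the succeeding layer and highlights critical filters that need to be completely protected from pruning. Through experimental validation, we show that SNACS is over 17x faster than the nearest comparable method and is the state-of-the-art single-shot pruning method across three standard Dataset-DNN pruning benchmarks: CIFAR10-VGG16, CIFAR10-ResNet56, and ILSVRC2012-ResNet50.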