Smart Paper Notes

Convergence and Stability of the Stochastic Proximal Point Algorithm with Momentum

Junhyung Lyle Kim , Panos Toulis , Anastasios Kyrillidis

Category: Machine Learning

2021-11-11

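Stochastic gradient descent with momentum (SGDM) is the dominant algorithm in many optimization scenarios, including convex optimization instances and non-convex neural network training. Yet, in the stochastic setting, momentum interferes with gradient noise, often requiring specific step size and momentum choices just to guarantee convergence, let alone acceleration. Proximal point methods, on the other hand, have gained much attention due to their numerical stability and resilience to imperfect tuning. Their stochastic accelerated variants, though, have received limited attention: how momentum interacts with the stability of (stochastic) proximal point methods remains largely unstudied. To address this, we focus on the convergence and stability of the stochastic proximal point algorithm with momentum (SPPAM), and show that, under proper hyperparameter tuning, SPPAM attains a faster linear convergence rate than the stochastic proximal point algorithm (SPPA), with a better contraction factor. In terms of stability, we show that SPPAM depends on problem constants more favorably than SGDM, allowing a wider range of step sizes and momentum values that lead to convergence.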
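
The stability contrast described in this abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper's experiments: the objective is a family of 1-D quadratics f_i(x) = 0.5 * a_i * (x - X_STAR)^2 sharing one minimizer, the SPPAM update is written in the common implicit form x_new = x - eta * grad_f_i(x_new) + beta * (x - x_prev) and solved in closed form, and the names `sppam_step`, `sgdm_step`, `run`, and `X_STAR` are hypothetical.

```python
import random

X_STAR = 3.0  # shared minimizer of every toy quadratic (assumption)


def sppam_step(x, x_prev, a, eta, beta):
    """Implicit (proximal) step with momentum on f(x) = 0.5 * a * (x - X_STAR)**2.

    Solves x_new = x - eta * a * (x_new - X_STAR) + beta * (x - x_prev) for x_new.
    """
    return (x + beta * (x - x_prev) + eta * a * X_STAR) / (1.0 + eta * a)


def sgdm_step(x, x_prev, a, eta, beta):
    """Explicit gradient step with heavy-ball momentum on the same quadratic."""
    return x - eta * a * (x - X_STAR) + beta * (x - x_prev)


def run(step, eta, beta, n_steps=50, seed=0):
    """Run one method from x = 0 with a random curvature a_i drawn at each step."""
    rng = random.Random(seed)
    x_prev = x = 0.0
    for _ in range(n_steps):
        a = rng.uniform(0.5, 2.0)  # stochastic curvature of the sampled quadratic
        x, x_prev = step(x, x_prev, a, eta, beta), x
    return x


if __name__ == "__main__":
    # A deliberately large step size: the implicit update still contracts,
    # while the explicit update blows up on this toy problem.
    print("SPPAM final iterate:", run(sppam_step, eta=10.0, beta=0.5))
    print("SGDM final iterate: ", run(sgdm_step, eta=10.0, beta=0.5))
```

With eta = 10 and beta = 0.5, the implicit iterate settles at the minimizer while the explicit one diverges. This only shows the qualitative effect on one hand-picked problem and says nothing about the rates proved in the paper.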

Invariant Inference via Residual Randomization

Panos Toulis

Category: Machine Learning (Statistics)

2019-08-12

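The primary paradigm in statistical inference relies on the structure of i.i.d. data from a hypothetical infinite population. Despite its success, this framework remains inflexible under complex data structures, even in cases where it is clear what the infinite population represents. In this paper, we explore an alternative framework in which inference relies only on an invariance assumption about the model errors, such as exchangeability or sign symmetry. As a general method for this invariant inference problem, we propose a randomization-based procedure. We prove general conditions for the asymptotic validity of this procedure and illustrate it on many data structures, including clustered errors in one-way and two-way layouts. We find that invariant inference via residual randomization has three appealing properties: (1) It is valid under weak and interpretable conditions, which makes it possible to address problems with heavy-tailed data, finite clustering, and even some high-dimensional settings. (2) It is robust in finite samples because it does not rely on the regularity conditions required by classical asymptotics. (3) It addresses the inference problem in a unified way that adapts to the data structure. In contrast, classical procedures such as OLS or the bootstrap presuppose the i.i.d. structure and need to be modified whenever the actual problem structure differs. This mismatch in the classical framework has given rise to a multitude of robust error techniques and bootstrap variants, which frequently confound applied research. We confirm these findings through extensive empirical evaluations. Residual randomization performs favorably against many alternatives, including robust error methods, bootstrap variants, and hierarchical models.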
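
A minimal sketch of the idea, under stated assumptions, may help. It tests whether the slope is zero in a simple linear regression with an intercept. Residuals are computed under the null (intercept-only) model, then re-randomized either by permutation (exchangeability) or by random sign flips (sign symmetry), and the slope functional is recomputed on each draw. The function names, the choice of test statistic, and the default of 999 draws are illustrative choices, not details given in the abstract.

```python
import random


def slope_functional(x, u):
    """OLS slope obtained by regressing u on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * ui for xi, ui in zip(x, u)) / sxx


def residual_randomization_pvalue(x, y, invariance="exchangeable",
                                  n_draws=999, seed=0):
    """Randomization p-value for H0: slope = 0.

    invariance="exchangeable" permutes the restricted residuals;
    invariance="sign" flips their signs independently.
    """
    rng = random.Random(seed)
    y_bar = sum(y) / len(y)
    resid = [yi - y_bar for yi in y]  # residuals of the null (intercept-only) model
    observed = abs(slope_functional(x, y))
    exceed = 0
    for _ in range(n_draws):
        if invariance == "exchangeable":
            draw = resid[:]
            rng.shuffle(draw)
        elif invariance == "sign":
            draw = [r if rng.random() < 0.5 else -r for r in resid]
        else:
            raise ValueError("invariance must be 'exchangeable' or 'sign'")
        if abs(slope_functional(x, draw)) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_draws)


if __name__ == "__main__":
    data_rng = random.Random(1)
    x = [data_rng.random() for _ in range(50)]
    y = [5.0 * xi + data_rng.gauss(0.0, 1.0) for xi in x]  # synthetic data
    print(residual_randomization_pvalue(x, y, "exchangeable"))
    print(residual_randomization_pvalue(x, y, "sign"))
```

Clustered variants would, presumably, apply the permutation or sign flip at the cluster level rather than per observation; that extension is not shown here.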

Spatially-resolved Thermometry from Line-of-Sight Emission Spectroscopy via Machine Learning

Ruiyuan Kang , Dimitrios C. Kyritsis , Panos Liatsis

Category: Machine Learning

2022-12-15

A methodology is proposed, which addresses the caveat that line-of-sight emission spectroscopy presents in that it cannot provide spatially resolved temperature measurements in nonhomogeneous temperature fields. The aim of this research is to explore the use of data-driven models in measuring temperature distributions in a spatially resolved manner using emission spectroscopy data. Two categories of data-driven methods are analyzed: (i) Feature engineering and classical machine learning algorithms, and (ii) end-to-end convolutional neural networks (CNN). In total, combinations of fifteen feature groups and fifteen classical machine learning models, and eleven CNN models are considered and their performances explored. The results indicate that the combination of feature engineering and machine learning provides better performance than the direct use of CNN. Notably, feature engineering which is comprised of physics-guided transformation, signal representation-based feature extraction and Principal Component Analysis is found to be the most effective. Moreover, it is shown that when using the extracted features, the ensemble-based, light blender learning model offers the best performance with RMSE, RE, RRMSE and R values of 64.3, 0.017, 0.025 and 0.994, respectively. The proposed method, based on feature engineering and the light blender model, is capable of measuring nonuniform temperature distributions from low-resolution spectra, even when the species concentration distribution in the gas mixtures is unknown.

ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes

Ahmed Abdelreheem , Kyle Olszewski , Hsin-Ying Lee , Peter Wonka , Panos Achlioptas

Category: Computer Vision

2022-12-12

The two popular datasets ScanRefer [16] and ReferIt3D [3] connect natural language to real-world 3D data. In this paper, we curate a large-scale and complementary dataset extending both the aforementioned ones by associating all objects mentioned in a referential sentence to their underlying instances inside a 3D scene. Specifically, our Scan Entities in 3D (ScanEnts3D) dataset provides explicit correspondences between 369k objects across 84k natural referential sentences, covering 705 real-world scenes. Crucially, we show that by incorporating intuitive losses that enable learning from this novel dataset, we can significantly improve the performance of several recently introduced neural listening architectures, including improving the SoTA in both the Nr3D and ScanRefer benchmarks by 4.3% and 5.0%, respectively. Moreover, we experiment with competitive baselines and recent methods for the task of language generation and show that, as with neural listeners, 3D neural speakers can also noticeably benefit by training with ScanEnts3D, including improving the SoTA by 13.2 CIDEr points on the Nr3D benchmark. Overall, our carefully conducted experimental studies strongly support the conclusion that, by learning on ScanEnts3D, commonly used visio-linguistic 3D architectures can become more efficient and interpretable in their generalization without needing to provide these newly collected annotations at test time. The project's webpage is https://scanents3d.github.io/ .

LADIS: Language Disentanglement for 3D Shape Editing

Ian Huang , Panos Achlioptas , Tianyi Zhang , Sergey Tulyakov , Minhyuk Sung , Leonidas Guibas

Category: Computer Vision | Natural Language Processing

2022-12-09

Natural language interaction is a promising direction for democratizing 3D shape design. However, existing methods for text-driven 3D shape editing face challenges in producing decoupled, local edits to 3D shapes. We address this problem by learning disentangled latent representations that ground language in 3D geometry. To this end, we propose a complementary tool set including a novel network architecture, a disentanglement loss, and a new editing procedure. Additionally, to measure edit locality, we define a new metric that we call part-wise edit precision. We show that our method outperforms existing SOTA methods by 20% in terms of edit locality, and up to 6.6% in terms of language reference resolution accuracy. Our work suggests that by solely disentangling language representations, downstream 3D shape editing can become more local to relevant parts, even if the model was never given explicit part-based supervision.

Flame-state monitoring based on very low number of visible or infrared images via few-shot learning

Ruiyuan Kang , Panos Liatsis , Dimitrios C. Kyritsis

Category: Computer Vision

2022-10-14

The current success of machine learning on image-based combustion monitoring is based on massive data, which is costly or even impossible to obtain for industrial applications. To address this conflict, we introduce few-shot learning in order to achieve combustion monitoring and classification for the first time. Two algorithms, Siamese Network coupled with k Nearest Neighbors (SN-kNN) and Prototypical Network (PN), were tested. Rather than utilizing solely visible images as discussed in previous studies, we also used Infrared (IR) images. We analyzed the training process, test performance and inference speed of the two algorithms on both image formats, and also used t-SNE to visualize learned features. The results demonstrated that both SN-kNN and PN were capable of distinguishing flame states after learning from merely 20 images per flame state. The worst performance, which was realized by PN on IR images, still achieved precision, accuracy, recall, and F1-score above 0.95. We showed that visible images demonstrated more substantial differences between classes and presented more consistent patterns within each class, which made the training speed and model performance better compared to IR images. In contrast, the relatively low quality of IR images made it difficult for PN to extract distinguishable prototypes, which caused relatively weak performance. With the entire training set supporting classification, SN-kNN performed well with IR images. On the other hand, benefitting from its architecture design, PN is much faster in training and inference than SN-kNN. The presented work analyzed the characteristics of both algorithms and image formats for the first time, thus providing guidance for their future utilization in combustion monitoring tasks.
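
The classification rule behind a Prototypical Network can be sketched in a few lines. This is only the nearest-prototype step: a real PN first maps images through a learned embedding network, which is replaced here by hand-written 2-D feature vectors. The flame-state labels and the names `build_prototypes` and `classify` are hypothetical.

```python
import math


def build_prototypes(support):
    """Average the support vectors of each class into one prototype per class.

    support: dict mapping a class label to a list of equal-length feature vectors.
    """
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[d] for v in vectors) / len(vectors)
                         for d in range(dim)]
    return protos


def classify(query, protos):
    """Return the label whose prototype is closest to the query (Euclidean)."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))


if __name__ == "__main__":
    # Stand-in features; in practice these would be embeddings of flame images.
    support = {
        "stable": [[0.0, 0.1], [0.2, 0.0]],
        "unstable": [[1.0, 0.9], [0.8, 1.1]],
    }
    protos = build_prototypes(support)
    print(classify([0.1, 0.2], protos))
```

Because each query is compared against one prototype per class rather than against every support sample, inference cost grows with the number of classes instead of the size of the support set, which is consistent with the speed advantage over SN-kNN reported above.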

A Novel Transformer Network with Shifted Window Cross-Attention for Spatiotemporal Weather Forecasting

Alabi Bojesomo , Hasan Al Marzouqi , Panos Liatsis

Category: Computer Vision

2022-08-02

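Earth Observatory is a growing research area that can capitalize on the power of AI for short-term forecasting, that is, nowcasting. In this work, we tackle the challenge of weather forecasting using a video transformer network. Vision transformer architectures have been explored in various applications, with the major constraints being the computational complexity of attention and data-hungry training. To address these issues, we propose the use of Video Swin-Transformer, coupled with a dedicated augmentation scheme. Moreover, we employ gradual spatial reduction on the encoder side and cross-attention on the decoder. The proposed approach is tested on the Weather4cast2021 weather forecasting challenge data, which requires predicting the next 8 hours (4 per hour) from an hourly weather product sequence. The dataset was normalized to 0-1 to facilitate using the evaluation metrics across different datasets. The model achieves an MSE score of 0.4750 when provided with training data, and 0.4420 during transfer learning without using training data.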

Robust Contact State Estimation in Humanoid Walking Gaits

Stylianos Piperakis , Michael Maravgakis , Dimitrios Kanoulas , Panos Trahanias

Category: Robotics | Artificial Intelligence | Machine Learning

2022-07-30

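In this article, we propose a deep learning framework that provides a unified approach to the problem of leg contact detection in humanoid robot walking gaits. Our formulation accurately and robustly estimates the contact state probability for each leg (i.e., stable or slip/no contact). The proposed framework employs solely proprioceptive sensing, and although it relies on simulated ground-truth contact data for the classification process, we demonstrate that it generalizes across varying friction surfaces and different legged robotic platforms and, at the same time, transfers readily from simulation to practice. The framework is quantitatively and qualitatively assessed in simulation using ground-truth contact data and is contrasted against state-of-the-art methods with ATLAS, NAO, and TALOS humanoid robots. Furthermore, its efficacy is demonstrated in base estimation with a real TALOS humanoid. To reinforce further research endeavors, our implementation is offered as an open-source ROS/Python package, coined Legged Contact Detection (LCD).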

Digital Twin-based Intrusion Detection for Industrial Control Systems

Seba Anna Varghese , Alireza Dehlaghi Ghadim , Ali Balador , Zahra Alimadadi , Panos Papadimitratos

Category: Machine Learning

2022-07-20

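Digital twins have recently gained significant interest for the simulation, optimization, and predictive maintenance of Industrial Control Systems (ICS). Recent studies discuss the possibility of using digital twins for intrusion detection in industrial systems. Accordingly, this study contributes to a digital twin-based security framework for industrial control systems, extending its capabilities for simulating attacks and defense mechanisms. Four types of process-aware attack scenarios are implemented on a standalone open-source digital twin: command injection, network Denial of Service (DoS), calculated measurement modification, and naive measurement modification. Based on an offline evaluation of eight supervised machine learning algorithms, a stacked ensemble classifier is proposed for real-time intrusion detection. By combining the predictions of various algorithms, the designed stacked model outperforms previous methods in terms of F1-score and accuracy, while detecting and classifying intrusions in near real time (0.1 seconds). This study also discusses the practicality and benefits of the proposed digital twin-based security framework.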

Data-driven initialization of deep learning solvers for Hamilton-Jacobi-Bellman PDEs

Anastasia Borovykh , Dante Kalise , Alexis Laignelet , Panos Parpas

Category: Machine Learning (Statistics)

2022-07-19

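A deep learning approach is presented for approximating the Hamilton-Jacobi-Bellman partial differential equation (HJB PDE) associated with the Nonlinear Quadratic Regulator (NLQR) problem. A state-dependent Riccati equation control law is first used to generate a gradient-augmented synthetic dataset for supervised learning. The resulting model then serves as a warm start for minimizing a loss function based on the residual of the HJB PDE. The combination of supervised learning and residual minimization avoids spurious solutions and mitigates the data inefficiency of a supervised-learning-only approach. Numerical tests validate the different advantages of the proposed methodology.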
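
As a rough illustration of this two-stage training, the two losses might be written as follows. The notation is assumed rather than taken from the abstract: control-affine dynamics $\dot{x} = f(x) + g(x)u$, running cost $\tfrac{1}{2}(x^\top Q x + u^\top R u)$, a network $V_\theta$ approximating the value function, $N$ synthetic samples $(x_i, V_i, \nabla V_i)$ generated with the state-dependent Riccati equation control law, $M$ collocation points $x_j$, and a gradient weight $\mu > 0$. The paper's exact formulation may differ.

```latex
\begin{align}
% HJB residual after substituting the minimizing control u^* = -R^{-1} g(x)^\top \nabla V(x)
\mathcal{R}[V](x) &= \nabla V(x)^\top f(x)
  - \tfrac{1}{2}\, \nabla V(x)^\top g(x) R^{-1} g(x)^\top \nabla V(x)
  + \tfrac{1}{2}\, x^\top Q x, \\
% Stage 1: gradient-augmented supervised loss on the synthetic dataset
\mathcal{L}_{\mathrm{sup}}(\theta) &= \frac{1}{N} \sum_{i=1}^{N}
  \Big( \big| V_\theta(x_i) - V_i \big|^2
  + \mu \, \big\| \nabla V_\theta(x_i) - \nabla V_i \big\|^2 \Big), \\
% Stage 2: residual loss, minimized starting from the stage-1 parameters
\mathcal{L}_{\mathrm{res}}(\theta) &= \frac{1}{M} \sum_{j=1}^{M}
  \mathcal{R}[V_\theta](x_j)^2 .
\end{align}
```

Under this reading, stage 1 places $V_\theta$ near the desired solution, so that the residual minimization in stage 2 is less likely to settle on a spurious solution of the PDE.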
