智能论文笔记

Pragmatic Implementation of Reinforcement Algorithms For Path Finding On Raspberry Pi

Serena Raju , Sherin Shibu , Riya Mol Raji , Joel Thomas

分类：机器人 | 人工智能

2021-12-07

在本文中，审计了利用了用于路径规划和碰撞避免的强化学习算法的室内自主输送系统的语用实现。所提出的系统是一种成本效益的方法，用于促进覆盆子PI控制的四轮驱动非正度机器人映射网格。该方法计算并导航从源键点到目的地键点的最短路径以执行所需的递送。Q学习和深度学习用于找到最佳路径，同时避免与静态障碍物碰撞。这项工作定义了一种在机器人上部署这两个算法的方法。还提出了一种新颖的算法，将方向阵列解码为特定动作空间中的准确运动。遵循将该系统分发的程序，描述了我的要求，呈现我们对室内自治送货车的概念证明。

translated by 谷歌翻译

NeMo: 3D Neural Motion Fields from Multiple Video Instances of the Same Action

Kuan-Chieh Wang , Zhenzhen Weng , Maria Xenochristou , Joao Pedro Araujo , Jeffrey Gu , C. Karen Liu , Serena Yeung

分类：计算机视觉

2022-12-28

The task of reconstructing 3D human motion has wideranging applications. The gold standard Motion capture (MoCap) systems are accurate but inaccessible to the general public due to their cost, hardware and space constraints. In contrast, monocular human mesh recovery (HMR) methods are much more accessible than MoCap as they take single-view videos as inputs. Replacing the multi-view Mo- Cap systems with a monocular HMR method would break the current barriers to collecting accurate 3D motion thus making exciting applications like motion analysis and motiondriven animation accessible to the general public. However, performance of existing HMR methods degrade when the video contains challenging and dynamic motion that is not in existing MoCap datasets used for training. This reduces its appeal as dynamic motion is frequently the target in 3D motion recovery in the aforementioned applications. Our study aims to bridge the gap between monocular HMR and multi-view MoCap systems by leveraging information shared across multiple video instances of the same action. We introduce the Neural Motion (NeMo) field. It is optimized to represent the underlying 3D motions across a set of videos of the same action. Empirically, we show that NeMo can recover 3D motion in sports using videos from the Penn Action dataset, where NeMo outperforms existing HMR methods in terms of 2D keypoint detection. To further validate NeMo using 3D metrics, we collected a small MoCap dataset mimicking actions in Penn Action,and show that NeMo achieves better 3D reconstruction compared to various baselines.

translated by 谷歌翻译

Autonomous Vehicle Navigation with LIDAR using Path Planning

Rahul M K , Sumukh B , Praveen L Uppunda , Vinayaka Raju , C Gururaj

分类：机器人

2022-12-14

In this paper, a complete framework for Autonomous Self Driving is implemented. LIDAR, Camera and IMU sensors are used together. The entire data communication is managed using Robot Operating System which provides a robust platform for implementation of Robotics Projects. Jetson Nano is used to provide powerful on-board processing capabilities. Sensor fusion is performed on the data received from the different sensors to improve the accuracy of the decision making and inferences that we derive from the data. This data is then used to create a localized map of the environment. In this step, the position of the vehicle is obtained with respect to the Mapping done using the sensor data.The different SLAM techniques used for this purpose are Hector Mapping and GMapping which are widely used mapping techniques in ROS. Apart from SLAM that primarily uses LIDAR data, Visual Odometry is implemented using a Monocular Camera. The sensor fused data is then used by Adaptive Monte Carlo Localization for car localization. Using the localized map developed, Path Planning techniques like "TEB planner" and "Dynamic Window Approach" are implemented for autonomous navigation of the vehicle. The last step in the Project is the implantation of Control which is the final decision making block in the pipeline that gives speed and steering data for the navigation that is compatible with Ackermann Kinematics. The implementation of such a control block under a ROS framework using the three sensors, viz, LIDAR, Camera and IMU is a novel approach that is undertaken in this project.

translated by 谷歌翻译

DIAMOND: Taming Sample and Communication Complexities in Decentralized Bilevel Optimization

Peiwen Qiu , Yining Li , Zhuqing Liu , Prashant Khanduri , Jia Liu , Ness B. Shroff , Elizabeth Serena Bentley , Kurt Turck

分类：机器学习

2022-12-05

Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e.g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks. However, to work with the limited computation and communication capabilities of edge networks, a major challenge in developing decentralized bilevel optimization techniques is to lower sample and communication complexities. This motivates us to develop a new decentralized bilevel optimization called DIAMOND (decentralized single-timescale stochastic approximation with momentum and gradient-tracking). The contributions of this paper are as follows: i) our DIAMOND algorithm adopts a single-loop structure rather than following the natural double-loop structure of bilevel optimization, which offers low computation and implementation complexity; ii) compared to existing approaches, the DIAMOND algorithm does not require any full gradient evaluations, which further reduces both sample and computational complexities; iii) through a careful integration of momentum information and gradient tracking techniques, we show that the DIAMOND algorithm enjoys $\mathcal{O}(\epsilon^{-3/2})$ in sample and communication complexities for achieving an $\epsilon$-stationary solution, both of which are independent of the dataset sizes and significantly outperform existing works. Extensive experiments also verify our theoretical findings.

translated by 谷歌翻译

Persona-Based Conversational AI: State of the Art and Challenges

Junfeng Liu , Christopher Symons , Ranga Raju Vatsavai

分类：自然语言处理 | 人工智能 | 机器学习

2022-12-04

Conversational AI has become an increasingly prominent and practical application of machine learning. However, existing conversational AI techniques still suffer from various limitations. One such limitation is a lack of well-developed methods for incorporating auxiliary information that could help a model understand conversational context better. In this paper, we explore how persona-based information could help improve the quality of response generation in conversations. First, we provide a literature review focusing on the current state-of-the-art methods that utilize persona information. We evaluate two strong baseline methods, the Ranking Profile Memory Network and the Poly-Encoder, on the NeurIPS ConvAI2 benchmark dataset. Our analysis elucidates the importance of incorporating persona information into conversational systems. Additionally, our study highlights several limitations with current state-of-the-art methods and outlines challenges and future research directions for advancing personalized conversational AI technology.

translated by 谷歌翻译

PROB: Probabilistic Objectness for Open World Object Detection

Orr Zohar , Kuan-Chieh Wang , Serena Yeung

分类：计算机视觉 | 人工智能 | 机器学习

2022-12-02

Open World Object Detection (OWOD) is a new and challenging computer vision task that bridges the gap between classic object detection (OD) benchmarks and object detection in the real world. In addition to detecting and classifying seen/labeled objects, OWOD algorithms are expected to detect novel/unknown objects - which can be classified and incrementally learned. In standard OD, object proposals not overlapping with a labeled object are automatically classified as background. Therefore, simply applying OD methods to OWOD fails as unknown objects would be predicted as background. The challenge of detecting unknown objects stems from the lack of supervision in distinguishing unknown objects and background object proposals. Previous OWOD methods have attempted to overcome this issue by generating supervision using pseudo-labeling - however, unknown object detection has remained low. Probabilistic/generative models may provide a solution for this challenge. Herein, we introduce a novel probabilistic framework for objectness estimation, where we alternate between probability distribution estimation and objectness likelihood maximization of known objects in the embedded feature space - ultimately allowing us to estimate the objectness probability of different proposals. The resulting Probabilistic Objectness transformer-based open-world detector, PROB, integrates our framework into traditional object detection models, adapting them for the open-world setting. Comprehensive experiments on OWOD benchmarks show that PROB outperforms all existing OWOD methods in both unknown object detection ($\sim 2\times$ unknown recall) and known object detection ($\sim 10\%$ mAP). Our code will be made available upon publication at https://github.com/orrzohar/PROB.

translated by 谷歌翻译

Quantum median filter for Total Variation image denoising

Simone De Santis , Damiana Lazzaro , Riccardo Mengoni , Serena Morigi

分类：计算机视觉

2022-12-02

In this new computing paradigm, named quantum computing, researchers from all over the world are taking their first steps in designing quantum circuits for image processing, through a difficult process of knowledge transfer. This effort is named Quantum Image Processing, an emerging research field pushed by powerful parallel computing capabilities of quantum computers. This work goes in this direction and proposes the challenging development of a powerful method of image denoising, such as the Total Variation (TV) model, in a quantum environment. The proposed Quantum TV is described and its sub-components are analysed. Despite the natural limitations of the current capabilities of quantum devices, the experimental results show a competitive denoising performance compared to the classical variational TV counterpart.

translated by 谷歌翻译

Cementron: Machine Learning the Constituent Phases in Cement Clinker from Optical Images

Mohd Zaki , Siddhant Sharma , Sunil Kumar Gurjar , Raju Goyal , Jayadeva , N. M. Anoop Krishnan

分类：计算机视觉

2022-11-06

Cement is the most used construction material. The performance of cement hydrate depends on the constituent phases, viz. alite, belite, aluminate, and ferrites present in the cement clinker, both qualitatively and quantitatively. Traditionally, clinker phases are analyzed from optical images relying on a domain expert and simple image processing techniques. However, the non-uniformity of the images, variations in the geometry and size of the phases, and variabilities in the experimental approaches and imaging methods make it challenging to obtain the phases. Here, we present a machine learning (ML) approach to detect clinker microstructure phases automatically. To this extent, we create the first annotated dataset of cement clinker by segmenting alite and belite particles. Further, we use supervised ML methods to train models for identifying alite and belite regions. Specifically, we finetune the image detection and segmentation model Detectron-2 on the cement microstructure to develop a model for detecting the cement phases, namely, Cementron. We demonstrate that Cementron, trained only on literature data, works remarkably well on new images obtained from our experiments, demonstrating its generalizability. We make Cementron available for public use.

translated by 谷歌翻译

Applying Association Rules Mining to Investigate Pedestrian Fatal and Injury Crash Patterns Under Different Lighting Conditions

Ahmed Hossain , Xiaoduan Sun , Raju Thapa , Julius Codjoe

分类： (统计)机器学习 | 机器学习

2022-11-06

The pattern of pedestrian crashes varies greatly depending on lighting circumstances, emphasizing the need of examining pedestrian crashes in various lighting conditions. Using Louisiana pedestrian fatal and injury crash data (2010-2019), this study applied Association Rules Mining (ARM) to identify the hidden pattern of crash risk factors according to three different lighting conditions (daylight, dark-with-streetlight, and dark-no-streetlight). Based on the generated rules, the results show that daylight pedestrian crashes are associated with children (less than 15 years), senior pedestrians (greater than 64 years), older drivers (>64 years), and other driving behaviors such as failure to yield, inattentive/distracted, illness/fatigue/asleep. Additionally, young drivers (15-24 years) are involved in severe pedestrian crashes in daylight conditions. This study also found pedestrian alcohol/drug involvement as the most frequent item in the dark-with-streetlight condition. This crash type is particularly associated with pedestrian action (crossing intersection/midblock), driver age (55-64 years), speed limit (30-35 mph), and specific area type (business with mixed residential area). Fatal pedestrian crashes are found to be associated with roadways with high-speed limits (>50 mph) during the dark without streetlight condition. Some other risk factors linked with high-speed limit related crashes are pedestrians walking with/against the traffic, presence of pedestrian dark clothing, pedestrian alcohol/drug involvement. The research findings are expected to provide an improved understanding of the underlying relationships between pedestrian crash risk factors and specific lighting conditions. Highway safety experts can utilize these findings to conduct a decision-making process for selecting effective countermeasures to reduce pedestrian crashes strategically.

translated by 谷歌翻译

Lost in Translation: Reimagining the Machine Learning Life Cycle in Education

Lydia T. Liu , Serena Wang , Tolani Britton , Rediet Abebe

分类：人工智能

2022-09-08

机器学习（ML）技术在教育方面越来越普遍，从预测学生辍学，到协助大学入学以及促进MOOC的兴起。考虑到这些新颖用途的快速增长，迫切需要调查ML技术如何支持长期以来的教育原则和目标。在这项工作中，我们阐明了这一复杂的景观绘制，以对教育专家的访谈进行定性见解。这些访谈包括对过去十年中著名应用ML会议上发表的ML教育（ML4ED）论文的深入评估。我们的中心研究目标是批判性地研究这些论文的陈述或暗示教育和社会目标如何与他们解决的ML问题保持一致。也就是说，技术问题的提出，目标，方法和解释结果与手头的教育问题保持一致。我们发现，在ML生命周期的两个部分中存在跨学科的差距，并且尤其突出：从教育目标和将预测转换为干预措施的ML问题的提出。我们使用这些见解来提出扩展的ML生命周期，这也可能适用于在其他领域中使用ML。我们的工作加入了越来越多的跨教育和ML研究的荟萃分析研究，以及对ML社会影响的批判性分析。具体而言，它填补了对机器学习的主要技术理解与与学生合作和政策合作的教育研究人员的观点之间的差距。

translated by 谷歌翻译