Smart Paper Notes

DeltaZ: An Accessible Compliant Delta Robot Manipulator for Research and Education

Sarvesh Patil , Samuel C. Alvares , Pragna Mannam , Oliver Kroemer , F. Zeynep Temel

Category: Robotics

2022-07-02

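This paper presents the DeltaZ robot, a centimeter-scale, low-cost, delta-style robot that offers a broad range of capabilities and robust functionality. Current technologies allow the delta to be 3D-printed from soft and rigid materials, which makes it easy to assemble and maintain and lowers the barrier to use. The robot's functionality stems from its three translational degrees of freedom and a closed-form kinematic solution, which make manipulation problems more intuitive than with other manipulators. Moreover, the robot's low cost presents an opportunity to democratize manipulators for research settings. We also describe how the robot can be used as a reinforcement learning benchmark. Open-source 3D-printable designs and code are available to the public.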

Translated by Google Translate

Linear Delta Arrays for Dexterous Distributed Manipulation

Sarvesh Patil , Tony Tao , Tess Hellebrekers , Oliver Kroemer , F. Zeynep Temel

Category: Robotics

2022-06-09

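This paper presents a new type of distributed dexterous manipulator: delta arrays. Each delta array consists of a grid of linearly actuated delta robots with compliant 3D-printed parallelogram links. These arrays can be used to perform planar transportation tasks, similar to smart conveyors. However, the deltas' additional degrees of freedom also afford a wide range of out-of-plane manipulations, as well as prehensile manipulations between sets of deltas. A delta array thus affords a wide range of distributed manipulation strategies. In this paper, we present the design of the delta arrays, including the individual deltas, a modular array structure, and distributed communication and control. We also construct and evaluate an 8x8 array of deltas using the proposed design. Our evaluations show that the resulting 192 DoF robot (64 deltas with three degrees of freedom each) can perform a variety of coordinated distributed manipulations on a variety of objects, including translation, alignment, and prehensile squeezing.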


Learning to Singulate Layers of Cloth using Tactile Feedback

Sashank Tirumala , Thomas Weng , Daniel Seita , Oliver Kroemer , Zeynep Temel , David Held

Category: Robotics

2022-07-22

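Robotic manipulation of cloth has applications ranging from fabric manufacturing to handling blankets and laundry. Cloth manipulation is challenging for robots, largely because of cloth's high degrees of freedom, complex dynamics, and severe self-occlusion in folded or crumpled configurations. Prior work on robotic cloth manipulation relies primarily on vision sensors, which can pose challenges for fine-grained manipulation tasks such as grasping a desired number of cloth layers from a stack of cloth. In this paper, we propose using tactile sensing for cloth manipulation; we attach a tactile sensor (ReSkin) to one of the two fingertips of a Franka robot and train a classifier to determine whether the robot is grasping a specific number of cloth layers. In test-time experiments, the robot uses this classifier as part of its policy to grasp one or two cloth layers, using tactile feedback to determine suitable grasping points. Experimental results over 180 physical trials suggest that the proposed method outperforms baselines that do not use tactile feedback and generalizes better to unseen cloth than methods that use image classifiers. Code, data, and videos are available at https://sites.google.com/view/reskin-cloth.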


Multidimensional Item Response Theory in the Style of Collaborative Filtering

Yoav Bergner , Peter F. Halpin , Jill-Jênn Vie

Category: Machine Learning (Statistics) | Machine Learning

2023-01-03

This paper presents a machine learning approach to multidimensional item response theory (MIRT), a class of latent factor models that can be used to model and predict student performance from observed assessment data. Inspired by collaborative filtering, we define a general class of models that includes many MIRT models. We discuss the use of penalized joint maximum likelihood (JML) to estimate individual models and cross-validation to select the best performing model. This model evaluation process can be optimized using batching techniques, such that even sparse large-scale data can be analyzed efficiently. We illustrate our approach with simulated and real data, including an example from a massive open online course (MOOC). The high-dimensional model fit to this large and sparse dataset does not lend itself well to traditional methods of factor interpretation. By analogy to recommender-system applications, we propose an alternative "validation" of the factor model, using auxiliary information about the popularity of items consulted during an open-book exam in the course.
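
The idea of fitting a multidimensional item response model "in the style of collaborative filtering" can be sketched as a logistic matrix factorization trained on the observed (student, item, response) triples only. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fit_mirt` and `predict`, the L2 penalty, the plain stochastic gradient updates, and all hyperparameter values are hypothetical choices made for this example.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fit_mirt(responses, n_students, n_items, n_dims=2,
             penalty=0.01, lr=0.05, epochs=200, seed=0):
    """Penalized joint maximum likelihood by stochastic gradient ascent.

    responses: iterable of (student, item, y) with y in {0, 1}. Only the
    observed triples are visited, so sparse data costs nothing extra.
    Assumed model: P(y = 1) = sigmoid(theta[student] . a[item] + b[item]).
    The L2 penalty is applied once per observation (an assumption).
    """
    rng = random.Random(seed)
    theta = [[rng.gauss(0, 0.1) for _ in range(n_dims)]
             for _ in range(n_students)]
    a = [[rng.gauss(0, 0.1) for _ in range(n_dims)] for _ in range(n_items)]
    b = [0.0] * n_items
    data = list(responses)
    for _ in range(epochs):
        rng.shuffle(data)
        for s, i, y in data:
            logit = sum(t * w for t, w in zip(theta[s], a[i])) + b[i]
            err = y - sigmoid(logit)  # gradient of the log-likelihood
            for k in range(n_dims):
                t_k, a_k = theta[s][k], a[i][k]
                theta[s][k] += lr * (err * a_k - penalty * t_k)
                a[i][k] += lr * (err * t_k - penalty * a_k)
            b[i] += lr * err
    return theta, a, b


def predict(theta, a, b, student, item):
    """Predicted probability of a correct response."""
    logit = sum(t * w for t, w in zip(theta[student], a[item])) + b[item]
    return sigmoid(logit)
```

In this setup, model selection by cross-validation would amount to holding out some triples, refitting for each candidate `n_dims` and `penalty`, and comparing held-out predictive accuracy.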


3DSGrasp: 3D Shape-Completion for Robotic Grasp

Seyed S. Mohammadi , Nuno F. Duarte , Dimitris Dimou , Yiming Wang , Matteo Taiana , Pietro Morerio , Atabak Dehban , Plinio Moreno , Alexandre Bernardino , Alessio Del Bue

Category: Robotics | Artificial Intelligence

2023-01-02

Real-world robotic grasping can be done robustly if complete 3D Point Cloud Data (PCD) of an object is available. In practice, however, PCDs are often incomplete because objects are viewed from only a few sparse viewpoints before the grasping action, which leads to wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. The network is inherently invariant to object pose and point permutation, and it generates PCDs that are geometrically consistent and properly completed. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.


Spectral Bandwidth Recovery of Optical Coherence Tomography Images using Deep Learning

Timothy T. Yu , Da Ma , Jayden Cole , Myeong Jin Ju , Mirza F. Beg , Marinko V. Sarunic

Category: Artificial Intelligence | Computer Vision

2023-01-02

Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments that increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been used to reconstruct subsampled OCT data, and more recently deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques to explore methods that better aid clinicians in their decision-making and improve patient outcomes, reconstructing lost features with a pixel-to-pixel approach and an altered super-resolution generative adversarial network (SRGAN) architecture.
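
The link between spectral bandwidth and axial resolution can be illustrated with a toy simulation. The sketch below is a minimal illustration under stated assumptions and not the study's pipeline: it assumes a single ideal reflector, a synthetic cosine fringe sampled uniformly in wavenumber, and hypothetical window widths (`sigma=40.0` and `sigma=8.0` samples). Multiplying the fringe by a narrower Gaussian window before the Fourier transform broadens the reconstructed axial peak.

```python
import cmath
import math


def gaussian_window(n, sigma):
    """Gaussian window of n samples centred on the spectrum."""
    center = (n - 1) / 2.0
    return [math.exp(-0.5 * ((k - center) / sigma) ** 2) for k in range(n)]


def a_scan(spectrum):
    """Magnitude of the discrete Fourier transform, positive depths only."""
    n = len(spectrum)
    profile = []
    for z in range(n // 2):
        acc = sum(spectrum[k] * cmath.exp(-2j * math.pi * z * k / n)
                  for k in range(n))
        profile.append(abs(acc))
    return profile


def peak_width(profile):
    """Number of depth samples at or above half of the peak value."""
    half = max(profile) / 2.0
    return sum(1 for v in profile if v >= half)


n = 128      # spectral samples (hypothetical)
depth = 20   # reflector depth in samples (hypothetical)
fringe = [math.cos(2 * math.pi * depth * k / n) for k in range(n)]

# A wide window keeps most of the bandwidth; a narrow one discards it.
full = a_scan([f * w for f, w in zip(fringe, gaussian_window(n, 40.0))])
narrow = a_scan([f * w for f, w in zip(fringe, gaussian_window(n, 8.0))])

print("half-maximum width, wide window:  ", peak_width(full))
print("half-maximum width, narrow window:", peak_width(narrow))
```

The narrow-window profile plays the role of the degraded input that a learning-based method would be asked to restore.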


Structural State Translation: Condition Transfer between Civil Structures Using Domain-Generalization for Structural Health Monitoring

Furkan Luleci , F. Necati Catbas

Category: Machine Learning | Artificial Intelligence

2022-12-28

Using Structural Health Monitoring (SHM) systems with extensive sensing arrangements on every civil structure can be costly and impractical. Various concepts have been introduced to alleviate such difficulties, such as Population-based SHM (PBSHM). Nevertheless, the studies presented in the literature do not adequately address the challenge of accessing the information on different structural states (conditions) of dissimilar civil structures. The study herein introduces a novel framework named Structural State Translation (SST), which aims to estimate the response data of different civil structures based on the information obtained from a dissimilar structure. SST can be defined as translating a state of one civil structure to another state after discovering and learning the domain-invariant representation in the source domains of a dissimilar civil structure. SST employs a Domain-Generalized Cycle-Generative (DGCG) model to learn the domain-invariant representation in the acceleration datasets obtained from a numeric bridge structure that is in two different structural conditions. In other words, the model is tested on three dissimilar numeric bridge models to translate their structural conditions. The evaluation results of SST via Mean Magnitude-Squared Coherence (MMSC) and modal identifiers showed that the translated bridge states (synthetic states) are significantly similar to the real ones. As such, the minimum and maximum average MMSC values of real and translated bridge states are 91.2% and 97.1%, the minimum and the maximum difference in natural frequencies are 5.71% and 0%, and the minimum and maximum Modal Assurance Criterion (MAC) values are 0.998 and 0.870. This study is critical for data scarcity and PBSHM, as it demonstrates that it is possible to obtain data from structures while the structure is actually in a different condition or state.
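
For readers unfamiliar with the Modal Assurance Criterion reported above, it compares two mode shapes and is insensitive to their scaling and sign. The sketch below uses the standard definition for real-valued mode shapes, MAC = (a . b)^2 / ((a . a)(b . b)); the function name `mac` and the example vectors are hypothetical and do not come from the study.

```python
def mac(phi_a, phi_b):
    """Modal Assurance Criterion between two real-valued mode shapes.

    Returns a value in [0, 1]: 1 means the shapes are identical up to a
    scale factor, 0 means they are orthogonal.
    """
    if len(phi_a) != len(phi_b):
        raise ValueError("mode shapes must have the same length")
    dot_ab = sum(x * y for x, y in zip(phi_a, phi_b))
    dot_aa = sum(x * x for x in phi_a)
    dot_bb = sum(y * y for y in phi_b)
    return dot_ab ** 2 / (dot_aa * dot_bb)


# Hypothetical mode shapes sampled at three sensor locations.
real_shape = [1.0, 2.0, 3.0]
translated_shape = [-2.0, -4.0, -6.0]  # same shape, different scale and sign
print(mac(real_shape, translated_shape))
```

A MAC close to 1 between a real and a translated state therefore indicates that the two mode shapes agree, regardless of amplitude.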


Word Embedding Neural Networks to Advance Knee Osteoarthritis Research

Soheyla Amirian , Husam Ghazaleh , Mehdi Assefi , Hilal Maradit Kremers , Hamid R. Arabnia , Johannes F. Plate , Ahmad P. Tafti

Category: Artificial Intelligence | Machine Learning

2022-12-22

Osteoarthritis (OA) is the most prevalent chronic joint disease worldwide, and the knee accounts for more than 80% of commonly affected joints. Knee OA is not yet curable, and it affects large numbers of patients, making it costly to patients and healthcare systems. The etiology, diagnosis, and treatment of knee OA remain open to debate because of variability in its clinical and physical manifestations. Although knee OA has a well-known terminology that aims to standardize the nomenclature of the diagnosis, prognosis, treatment, and clinical outcomes of this chronic joint disease, in practice a wide range of terminology is associated with knee OA across different data sources, including but not limited to biomedical literature, clinical notes, healthcare literacy, and health-related social media. Among these data sources, the scientific articles published in the biomedical literature usually provide a principled pipeline for studying the disease. Rapid yet accurate text mining of large-scale scientific literature may uncover novel knowledge and terminology to better understand knee OA and to improve the quality of knee OA diagnosis, prevention, and treatment. The present work aims to use artificial neural network strategies to automatically extract vocabularies associated with knee OA. Our findings indicate the feasibility of developing word embedding neural networks for autonomous keyword extraction and abstraction of knee OA.


Semi-supervised GAN for Bladder Tissue Classification in Multi-Domain Endoscopic Images

Jorge F. Lazo , Benoit Rosa , Michele Catellani , Matteo Fontana , Francesco A. Mistretta , Gennaro Musi , Ottavio de Cobelli , Michel de Mathelin , Elena De Momi

Category: Computer Vision | Machine Learning

2022-12-21

Objective: Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides diverse visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available in only one domain, in our case WLI, and the endoscopic images form an unpaired dataset, i.e. there is no exact equivalent for every image in both the NBI and WLI domains. Method: We propose a semi-supervised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data, a cycle-consistency GAN that performs unpaired image-to-image translation, and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN, we perform a detailed quantitative and qualitative analysis with the help of specialists. Conclusion: The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89 respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94 respectively. The quality of the generated images is reliable enough to deceive specialists. Significance: This study shows the potential of using semi-supervised GAN-based classification to improve bladder tissue classification when annotations are limited in multi-domain data.
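
The three reported metrics measure different things, which is how high accuracy and recall can coexist with much lower precision. The sketch below is a minimal binary illustration; the study's figures are averages whose exact computation is not given here, and the confusion counts used are hypothetical numbers chosen for this example, not data from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from binary confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are right
        "precision": tp / (tp + fp),     # share of positive calls that are right
        "recall": tp / (tp + fn),        # share of true positives that are found
    }


# Hypothetical, imbalanced counts (NOT from the paper): few true positives,
# many true negatives, and a moderate number of false alarms.
metrics = classification_metrics(tp=16, fp=9, fn=1, tn=99)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts, false alarms lower precision while the large number of true negatives keeps accuracy high, which is one way such a combination can arise.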


End-to-End Automatic Speech Recognition model for the Sudanese Dialect

Ayman Mansour , Wafaa F. Mukhtar

Category: Natural Language Processing

2022-12-21

Designing a natural voice interface relies mostly on speech recognition for interaction between humans and their modern digital equipment. In addition, speech recognition narrows the gap between monolingual individuals, helping them communicate. However, the field lacks wide support for several universal languages and their dialects, even though most daily conversations are carried out in them. This paper examines the viability of designing an Automatic Speech Recognition model for the Sudanese dialect, one of the dialects of the Arabic language, whose complexity is a product of historical and social conditions unique to its speakers. These conditions are reflected in both the form and content of the dialect, so this paper gives an overview of the Sudanese dialect and of the collection of representative resources and the pre-processing performed to construct a modest dataset that addresses the lack of annotated data. The paper also proposes an end-to-end speech recognition model, designed using Convolutional Neural Networks. The Sudanese dialect dataset is intended as a stepping stone for future Natural Language Processing research targeting the dialect. The designed model provided some insights into the current recognition task and reached an average Label Error Rate of 73.67%.
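
The Label Error Rate quoted above is commonly computed as the edit distance between the predicted and reference label sequences, divided by the reference length. The sketch below assumes that common definition; the paper's exact label set and averaging are not stated here, and the function names and example strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def label_error_rate(ref, hyp):
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical character-level labels.
print(label_error_rate(list("kitten"), list("sitting")))
```

Because insertions count as errors, this rate can exceed 100% when the prediction is much longer than the reference.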
