Smart Paper Notes

FEL: High Capacity Learning for Recommendation and Ranking via Federated Ensemble Learning

Meisam Hejazinia , Dzmitry Huba , Ilias Leontiadis , Kiwan Maeng , Mani Malek , Luca Melis , Ilya Mironov , Milad Nasr , Kaikai Wang , Carole-Jean Wu

Category: Machine Learning

2022-06-07

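Federated learning (FL) has emerged as an effective approach to addressing consumer privacy needs. FL has been applied successfully to certain machine learning tasks, such as training smart keyboard models and keyword spotting. Despite this initial success, many important deep learning (DL) use cases, such as ranking and recommendation tasks, have remained out of reach for on-device learning. One of the main challenges to the practical adoption of DL-based ranking and recommendation is its prohibitive resource requirements, which modern mobile systems cannot satisfy. We propose Federated Ensemble Learning (FEL) as a solution to the large memory requirements of deep learning ranking and recommendation tasks. FEL enables large-scale ranking and recommendation model training by simultaneously training multiple model versions on disjoint clusters of client devices. FEL integrates the trained sub-models, via an over-arch layer, into an ensemble model hosted on the server. Our experiments show that FEL yields a 0.43-2.31% improvement in model quality over traditional on-device federated learning - a significant improvement for ranking and recommendation system use cases.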
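The two steps named in the abstract (training on disjoint client clusters, then combining sub-model outputs through an over-arch layer) can be illustrated with a minimal sketch. The abstract does not specify the over-arch architecture, so the single logistic unit below, the round-robin cluster assignment, and the names `assign_clusters` and `over_arch` are all assumptions made for illustration.

```python
import math


def assign_clusters(client_ids, num_clusters):
    """Split clients into disjoint clusters (round-robin; an assumed policy)."""
    clusters = [[] for _ in range(num_clusters)]
    for index, client in enumerate(sorted(client_ids)):
        clusters[index % num_clusters].append(client)
    return clusters


def over_arch(sub_model_scores, weights, bias=0.0):
    """Combine per-sub-model scores into one prediction.

    Assumed form: a single logistic unit over the sub-model scores.
    The real over-arch layer may be deeper or structured differently.
    """
    z = bias + sum(w * s for w, s in zip(weights, sub_model_scores))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    clusters = assign_clusters(range(10), num_clusters=3)
    print(clusters)  # each cluster would train its own sub-model via FL
    print(over_arch([1.2, -0.3, 0.8], weights=[0.5, 0.2, 0.3]))
```

In this sketch only the small combining layer sits on the server, so no single device ever has to hold more than one sub-model.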

Translated by Google Translate

DynO: Dynamic Onloading of Deep Neural Networks from Cloud to Device

Mario Almeida , Stefanos Laskaridis , Stylianos I. Venieris , Ilias Leontiadis , Nicholas D. Lane

Category: Computer Vision | Machine Learning

2021-04-20

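Recently, there has been explosive growth in mobile and embedded applications that use convolutional neural networks (CNNs). To alleviate their excessive computational demands, developers have traditionally resorted to cloud offloading, which incurs high infrastructure costs and a strong dependence on network conditions. On the other hand, the emergence of powerful SoCs is gradually enabling on-device execution. Nonetheless, low- and mid-tier platforms still struggle to run state-of-the-art CNNs adequately. In this paper, we present DynO, a distributed inference framework that combines the best of both worlds to address several challenges, such as device heterogeneity, varying bandwidth and multi-objective requirements. This is enabled by its novel CNN-specific data packing method, which exploits the variability in precision needs across different parts of the CNN when onloading computation, and by its novel scheduler, which jointly tunes the partition point and the precision of the transferred data at run time to adapt inference to its execution environment. Quantitative evaluation shows that DynO outperforms the current state of the art, including competing CNN offloading systems, improving throughput by more than an order of magnitude and transferring up to 60x less data.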
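The idea of jointly tuning the partition point and the transfer precision can be sketched as a small exhaustive search. This is an illustrative cost model, not DynO's scheduler: the `Layer` profile fields, the per-bit-width accuracy table, and the assumption that accuracy loss depends only on the bit width are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    device_ms: float    # time to run this layer on the device (hypothetical profile)
    server_ms: float    # time to run this layer on the server (hypothetical profile)
    output_values: int  # number of activations this layer produces


def choose_plan(layers, input_values, bandwidth_bits_per_ms, accuracy_drop, max_drop):
    """Return (latency_ms, split, bits) minimising estimated latency.

    `split` is the number of leading layers run on the device; the rest run
    on the server. `bits` is the precision used for the transferred tensor,
    or None when everything runs on the device and nothing is sent.
    """
    best = None
    for split in range(len(layers) + 1):
        device = sum(layer.device_ms for layer in layers[:split])
        server = sum(layer.server_ms for layer in layers[split:])
        if split == len(layers):
            options = [(None, 0.0)]  # fully on-device: no transfer
        else:
            values = input_values if split == 0 else layers[split - 1].output_values
            options = [
                (bits, values * bits / bandwidth_bits_per_ms)
                for bits, drop in accuracy_drop.items()
                if drop <= max_drop  # respect the accuracy budget
            ]
        for bits, transfer in options:
            latency = device + transfer + server
            if best is None or latency < best[0]:
                best = (latency, split, bits)
    return best


if __name__ == "__main__":
    profile = [Layer(10, 1, 100), Layer(50, 2, 10)]
    drops = {32: 0.0, 8: 0.5, 4: 3.0}  # accuracy loss in points, made-up values
    print(choose_plan(profile, 1000, 100, drops, max_drop=1.0))
```

Re-running the search whenever the measured bandwidth changes gives a simple form of run-time adaptation: with a slow link, the search falls back to on-device execution.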


FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout

Samuel Horvath , Stefanos Laskaridis , Mario Almeida , Ilias Leontiadis , Stylianos I. Venieris , Nicholas D. Lane

Category: Machine Learning

2021-02-26

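Federated Learning (FL) has been gaining significant traction across different ML tasks, ranging from vision to keyboard prediction. In large-scale deployments, client heterogeneity is a fact of life and constitutes a primary problem for fairness, training performance and accuracy. Although significant effort has gone into tackling statistical data heterogeneity, the diversity in clients' processing capabilities and network bandwidth, termed system heterogeneity, remains largely unexplored. Current solutions either disregard a large portion of the available devices or set a uniform limit on the model's capacity, restricted by the least capable participants. In this work, we introduce Ordered Dropout, a mechanism that achieves an ordered, nested representation of knowledge in deep neural networks (DNNs) and enables the extraction of lower-footprint submodels without retraining. We further show that, for linear maps, our Ordered Dropout is equivalent to SVD. We employ this technique, along with a self-distillation methodology, in a framework called FjORD. FjORD alleviates the problem of client system heterogeneity by tailoring the model width to the client's capabilities. Extensive evaluation on both CNNs and RNNs across diverse modalities shows that FjORD consistently yields significant performance gains over state-of-the-art baselines while maintaining its nested structure.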
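The nested-submodel idea behind Ordered Dropout can be shown on a single dense layer: a relative width p keeps only the leading fraction of units, so every narrower submodel is a prefix of every wider one. The sketch below is illustrative only; the helper names, the uniform sampling of widths, and the choice to slice inputs and outputs by the same p are assumptions, not details given in the abstract.

```python
import math
import random


def kept_units(width, p):
    """Number of leading units retained at relative width p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return max(1, math.ceil(p * width))


def extract_submodel(weights, p):
    """Slice a dense layer (rows = output units, columns = input units)
    down to its leading block. No retraining is involved: the submodel
    simply reuses the first rows and columns of the full layer."""
    out_keep = kept_units(len(weights), p)
    in_keep = kept_units(len(weights[0]), p)
    return [row[:in_keep] for row in weights[:out_keep]]


def sample_width(candidate_widths, rng):
    """Draw a relative width for one training step (uniform; an assumption)."""
    return rng.choice(candidate_widths)


if __name__ == "__main__":
    full = [[r * 10 + c for c in range(4)] for r in range(4)]
    rng = random.Random(0)
    p = sample_width([0.25, 0.5, 0.75, 1.0], rng)
    print(p, extract_submodel(full, p))
```

A client with limited memory would train and serve only the slice matching its capacity, while the server keeps the full-width layer.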


A Nearly Tight Bound for Fitting an Ellipsoid to Gaussian Random Points

Daniel M. Kane , Ilias Diakonikolas

Category: Machine Learning | Machine Learning (Statistics)

2022-12-21

We prove that, for $c>0$ a sufficiently small universal constant, a random set of $c d^2/\log^4(d)$ independent Gaussian random points in $\mathbb{R}^d$ lies on a common ellipsoid with high probability. This nearly establishes a conjecture of \cite{SaundersonCPW12}, within logarithmic factors. This conjecture has attracted significant attention over the past decade, due to its connections to machine learning and sum-of-squares lower bounds for certain statistical problems.
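For orientation, the fitting condition can be written as a feasibility problem. The origin-centered form below is the usual way this problem is formalized and is an assumption here, since the abstract only says "a common ellipsoid".

```latex
% Assumed formalization: x_1, ..., x_n are independent N(0, I_d) points, and
% we ask for a symmetric matrix \Lambda defining an origin-centered ellipsoid.
\[
  x_i^{\top} \Lambda \, x_i = 1 \quad \text{for all } i = 1, \dots, n,
  \qquad \Lambda \succeq 0 .
\]
```

The equality constraints are linear in the $d(d+1)/2$ free entries of $\Lambda$, which gives some intuition for why the number of points that can be fitted scales like $d^2$.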


A Strongly Polynomial Algorithm for Approximate Forster Transforms and its Application to Halfspace Learning

Ilias Diakonikolas , Christos Tzamos , Daniel M. Kane

Category: Machine Learning | Machine Learning (Statistics)

2022-12-06

The Forster transform is a method of regularizing a dataset by placing it in radial isotropic position while maintaining some of its essential properties. Forster transforms have played a key role in a diverse range of settings spanning computer science and functional analysis. Prior work had given weakly polynomial time algorithms for computing Forster transforms, when they exist. Our main result is the first strongly polynomial time algorithm to compute an approximate Forster transform of a given dataset or certify that no such transformation exists. By leveraging our strongly polynomial Forster algorithm, we obtain the first strongly polynomial time algorithm for distribution-free PAC learning of halfspaces. This learning result is surprising because proper PAC learning of halfspaces is equivalent to linear programming. Our learning approach extends to give a strongly polynomial halfspace learner in the presence of random classification noise and, more generally, Massart noise.
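For readers unfamiliar with the term, radial isotropic position has a compact standard definition, given below. The notation ($A$, $x_i$, $n$, $d$) is introduced here for illustration, and the precise notion of "approximate" used in the paper is not stated in the abstract.

```latex
% Standard definition: an invertible matrix A places nonzero points
% x_1, ..., x_n in R^d in radial isotropic position when the normalized,
% transformed points have an isotropic second-moment matrix.
\[
  \frac{1}{n} \sum_{i=1}^{n}
  \frac{(A x_i)(A x_i)^{\top}}{\lVert A x_i \rVert_2^{2}}
  \;=\; \frac{1}{d}\, I_d .
\]
```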


Evaluating Digital Agriculture Recommendations with Causal Inference

Ilias Tsoumas , Georgios Giannarakis , Vasileios Sitokonstantinou , Alkiviadis Koukos , Dimitra Loka , Nikolaos Bartsotas , Charalampos Kontoes , Ioannis Athanasiadis

Category: Machine Learning | Artificial Intelligence

2022-11-30

In contrast to the rapid digitalization of several industries, agriculture suffers from low adoption of smart farming tools. While AI-driven digital agriculture tools can offer high-performing predictive functionalities, they lack tangible quantitative evidence on their benefits to the farmers. Field experiments can derive such evidence, but are often costly, time consuming and hence limited in scope and scale of application. To this end, we propose an observational causal inference framework for the empirical evaluation of the impact of digital tools on target farm performance indicators (e.g., yield in this case). This way, we can increase farmers' trust via enhancing the transparency of the digital agriculture market and accelerate the adoption of technologies that aim to secure farmer income resilience and global agricultural sustainability. As a case study, we designed and implemented a recommendation system for the optimal sowing time of cotton based on numerical weather predictions, which was used by a farmers' cooperative during the growing season of 2021. We then leverage agricultural knowledge, collected yield data, and environmental information to develop a causal graph of the farm system. Using the back-door criterion, we identify the impact of sowing recommendations on the yield and subsequently estimate it using linear regression, matching, inverse propensity score weighting and meta-learners. The results reveal that a field sown according to our recommendations exhibited a statistically significant yield increase that ranged from 12% to 17%, depending on the method. The effect estimates were robust, as indicated by the agreement among the estimation methods and four successful refutation tests. We argue that this approach can be implemented for decision support systems of other fields, extending their evaluation beyond a performance assessment of internal functionalities.
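Of the estimators listed in the abstract, inverse propensity score weighting is the easiest to show in a few lines. The sketch below uses the normalized (Hajek) form and takes the propensity scores as given; in practice they would be estimated from the covariates selected via the back-door criterion. The function name and all data values are illustrative and are not taken from the study.

```python
def ipw_ate(treated, outcomes, propensities):
    """Normalized inverse-propensity-weighted estimate of the average
    treatment effect.

    treated      -- 1 if the unit received the treatment (e.g. the field was
                    sown as recommended), else 0
    outcomes     -- observed outcome per unit (e.g. yield)
    propensities -- estimated probability of treatment given covariates
    """
    if any(not 0.0 < e < 1.0 for e in propensities):
        raise ValueError("propensities must lie strictly between 0 and 1")
    w_treated = [t / e for t, e in zip(treated, propensities)]
    w_control = [(1 - t) / (1 - e) for t, e in zip(treated, propensities)]
    mean_treated = sum(w * y for w, y in zip(w_treated, outcomes)) / sum(w_treated)
    mean_control = sum(w * y for w, y in zip(w_control, outcomes)) / sum(w_control)
    return mean_treated - mean_control


if __name__ == "__main__":
    # Made-up data: with a constant propensity the estimate reduces to a
    # plain difference in group means.
    print(ipw_ate([1, 1, 0, 0], [10.0, 12.0, 8.0, 6.0], [0.5] * 4))
```

Units that were unlikely to receive the treatment they actually got are weighted up, which corrects for confounding as long as the propensity model includes the right covariates.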


Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

Ilias Diakonikolas , Daniel M. Kane , Jasper C. H. Lee , Ankit Pensia

Category: Machine Learning | Machine Learning (Statistics)

2022-11-29

We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $\mu$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $\mu$ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability $\tau$, having an additive $\log(1/\tau)$ dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.


Fuzzy clustering for the within-season estimation of cotton phenology

Vasileios Sitokonstantinou , Alkiviadis Koukos , Ilias Tsoumas , Nikolaos S. Bartsotas , Charalampos Kontoes , Vassilia Karathanassi

Category: Machine Learning

2022-11-25

Crop phenology is crucial information for crop yield estimation and agricultural management. Traditionally, phenology has been observed from the ground; however, Earth observation, weather and soil data have also been used to capture the physiological growth of crops. In this work, we propose a new approach for the within-season phenology estimation for cotton at the field level. For this, we exploit a variety of Earth observation vegetation indices (derived from Sentinel-2) and numerical simulations of atmospheric and soil parameters. Our method is unsupervised to address the ever-present problem of sparse and scarce ground truth data that makes most supervised alternatives impractical in real-world scenarios. We applied fuzzy c-means clustering to identify the principal phenological stages of cotton and then used the cluster membership weights to further predict the transitional phases between adjacent stages. In order to evaluate our models, we collected 1,285 crop growth ground observations in Orchomenos, Greece. We introduced a new collection protocol, assigning up to two phenology labels that represent the primary and secondary growth stage in the field and thus indicate when stages are transitioning. Our model was tested against a baseline model, which allowed us to isolate random agreement and evaluate the model's true competence. The results showed that our model considerably outperforms the baseline one, which is promising considering the unsupervised nature of the approach. The limitations and the relevant future work are thoroughly discussed. The ground observations are formatted in a ready-to-use dataset and will be available at https://github.com/Agri-Hub/cotton-phenology-dataset upon publication.
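The fuzzy c-means step can be sketched on one-dimensional data to show how soft memberships arise and why they are useful for flagging transitions between stages. This is a generic textbook formulation with fuzzifier m; the feature values, initial centers, and function names below are made up and stand in for the vegetation indices and simulated parameters used in the paper.

```python
def memberships(x, centers, m=2.0):
    """Fuzzy membership of point x in each cluster (values sum to 1)."""
    dists = [abs(x - c) for c in centers]
    if 0.0 in dists:  # x coincides with one or more centers
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = [memberships(x, centers, m) for x in points]
        for j in range(len(centers)):
            weights = [row[j] ** m for row in u]
            centers[j] = sum(w * x for w, x in zip(weights, points)) / sum(weights)
    return centers, [memberships(x, centers, m) for x in points]


if __name__ == "__main__":
    index_values = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]  # made-up 1-D feature
    centers, u = fuzzy_c_means(index_values, initial_centers=[0.0, 1.0])
    print(centers)
    # A point midway between two stage centers has split membership,
    # which is the kind of signal that can mark a transitional phase.
    print(memberships(0.5, [0.0, 1.0]))
```

In a transition-aware setup, the highest membership would give the primary stage and the second-highest the secondary stage, mirroring the two-label ground observation protocol described above.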


A Multimodal Approach for Dementia Detection from Spontaneous Speech with Tensor Fusion Layer

Loukas Ilias , Dimitris Askounis , John Psarras

Category: Natural Language Processing | Computer Vision

2022-11-08

Alzheimer's disease (AD) is a progressive neurological disorder, meaning that the symptoms develop gradually throughout the years. It is also the main cause of dementia, which affects memory, thinking skills, and mental abilities. Nowadays, researchers have moved their interest towards AD detection from spontaneous speech, since it constitutes a time-effective procedure. However, existing state-of-the-art works proposing multimodal approaches do not take into consideration the inter- and intra-modal interactions and instead rely on early and late fusion approaches. To tackle these limitations, we propose deep neural networks that can be trained end-to-end and capture the inter- and intra-modal interactions. Firstly, each audio file is converted to an image consisting of three channels, i.e., log-Mel spectrogram, delta, and delta-delta. Next, each transcript is passed through a BERT model followed by a gated self-attention layer. Similarly, each image is passed through a Swin Transformer followed by an independent gated self-attention layer. Acoustic features are also extracted from each audio file. Finally, the representation vectors from the different modalities are fed to a tensor fusion layer for capturing the inter-modal interactions. Extensive experiments conducted on the ADReSS Challenge dataset indicate that our introduced approaches obtain valuable advantages over existing research initiatives, reaching Accuracy and F1-score of up to 86.25% and 85.48%, respectively.
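A tensor fusion layer is commonly formulated as an outer product of the modality vectors, each extended with a constant 1 so that unimodal and pairwise terms survive alongside the full cross-modal products. The sketch below follows that common formulation; whether the paper uses exactly this variant is an assumption, and the vector values are made up.

```python
from itertools import product
from math import prod


def tensor_fusion(*modalities):
    """Flattened outer product of the modality vectors, each extended with 1.

    For vectors of sizes d1, d2, ..., the result has (d1+1)*(d2+1)*... entries.
    Entries that multiply by the appended 1s reproduce the original unimodal
    features; the rest are cross-modal interaction terms.
    """
    extended = [list(vector) + [1.0] for vector in modalities]
    return [prod(combo) for combo in product(*extended)]


if __name__ == "__main__":
    text_vec = [0.5, -1.0]      # stand-in for a transcript representation
    image_vec = [2.0]           # stand-in for a spectrogram-image representation
    acoustic_vec = [0.1, 0.3]   # stand-in for acoustic features
    fused = tensor_fusion(text_vec, image_vec, acoustic_vec)
    print(len(fused))  # (2+1) * (1+1) * (2+1) entries
```

The fused vector grows multiplicatively with the number and size of modalities, so in practice the per-modality representations are usually projected to small dimensions before fusion.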


Evaluating Digital Tools for Sustainable Agriculture using Causal Inference

Ilias Tsoumas , Georgios Giannarakis , Vasileios Sitokonstantinou , Alkiviadis Koukos , Dimitra Loka , Nikolaos Bartsotas , Charalampos Kontoes , Ioannis Athanasiadis

Category: Machine Learning | Artificial Intelligence

2022-11-06

In contrast to the rapid digitalization of several industries, agriculture suffers from low adoption of climate-smart farming tools. Even though AI-driven digital agriculture can offer high-performing predictive functionalities, it lacks tangible quantitative evidence on its benefits to the farmers. Field experiments can derive such evidence, but are often costly and time-consuming. To this end, we propose an observational causal inference framework for the empirical evaluation of the impact of digital tools on target farm performance indicators. This way, we can increase farmers' trust by enhancing the transparency of the digital agriculture market, and in turn accelerate the adoption of technologies that aim to increase productivity and secure a sustainable and resilient agriculture against a changing climate. As a case study, we perform an empirical evaluation of a recommendation system for optimal cotton sowing, which was used by a farmers' cooperative during the growing season of 2021. We leverage agricultural knowledge to develop a causal graph of the farm system, use the back-door criterion to identify the impact of recommendations on the yield, and subsequently estimate it using several methods on observational data. The results show that a field sown according to our recommendations enjoyed a significant increase in yield (12% to 17%).
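The identification step rests on the standard back-door adjustment formula, reproduced here for reference. The symbols are introduced for illustration: $T$ is the treatment (following the sowing recommendation), $Y$ the outcome (yield), and $Z$ a set of covariates satisfying the back-door criterion in the causal graph; which variables make up $Z$ in the study is not stated in the abstract.

```latex
% Back-door adjustment: the interventional distribution is recovered from
% observational quantities by averaging over the adjustment set Z.
\[
  P\bigl(Y = y \mid \mathrm{do}(T = t)\bigr)
  = \sum_{z} P\bigl(Y = y \mid T = t,\, Z = z\bigr)\, P(Z = z),
\]
\[
  \mathrm{ATE}
  = \mathbb{E}\bigl[Y \mid \mathrm{do}(T = 1)\bigr]
  - \mathbb{E}\bigl[Y \mid \mathrm{do}(T = 0)\bigr].
\]
```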
