Providing accurate estimated time of package delivery on users' purchasing pages for e-commerce platforms is of great importance to their purchasing decisions and post-purchase experiences. Although this problem shares some common issues with the conventional estimated time of arrival (ETA), it is more challenging with the following aspects: 1) Inductive inference. Models are required to predict ETA for orders with unseen retailers and addresses; 2) High-order interaction of order semantic information. Apart from the spatio-temporal features, the estimated time also varies greatly with other factors, such as the packaging efficiency of retailers, as well as the high-order interaction of these factors. In this paper, we propose an inductive graph transformer (IGT) that leverages raw feature information and structural graph data to estimate package delivery time. Different from previous graph transformer architectures, IGT adopts a decoupled pipeline and trains transformer as a regression function that can capture the multiplex information from both raw feature and dense embeddings encoded by a graph neural network (GNN). In addition, we further simplify the GNN structure by removing its non-linear activation and the learnable linear transformation matrix. The reduced parameter search space and linear information propagation in the simplified GNN enable the IGT to be applied in large-scale industrial scenarios. Experiments on real-world logistics datasets show that our proposed model can significantly outperform the state-of-the-art methods on estimation of delivery time. The source code is available at: https://github.com/enoche/IGT-WSDM23.
translated by 谷歌翻译
联邦学习(FL)的最新进展为大规模的分布式客户带来了大规模的机器学习机会,具有绩效和数据隐私保障。然而,大多数当前的工作只关注FL中央控制器的兴趣,忽略了客户的利益。这可能导致不公平,阻碍客户积极参与学习过程并损害整个流动系统的可持续性。因此,在佛罗里达州确保公平的主题吸引了大量的研究兴趣。近年来,已经提出了各种公平知识的FL(FAFL)方法,以努力实现不同观点的流体公平。但是,没有全面的调查,帮助读者能够深入了解这种跨学科领域。本文旨在提供这样的调查。通过审查本领域现有文献所采用的基本和简化的假设,提出了涵盖FL的主要步骤的FAFL方法的分类,包括客户选择,优化,贡献评估和激励分配。此外,我们讨论了实验评估FAFL方法表现的主要指标,并建议了一些未来的未来研究方向。
translated by 谷歌翻译
Existing training criteria in automatic speech recognition(ASR) permit the model to freely explore more than one time alignments between the feature and label sequences. In this paper, we use entropy to measure a model's uncertainty, i.e. how it chooses to distribute the probability mass over the set of allowed alignments. Furthermore, we evaluate the effect of entropy regularization in encouraging the model to distribute the probability mass only on a smaller subset of allowed alignments. Experiments show that entropy regularization enables a much simpler decoding method without sacrificing word error rate, and provides better time alignment quality.
translated by 谷歌翻译
Autonomous vehicles must often contend with conflicting planning requirements, e.g., safety and comfort could be at odds with each other if avoiding a collision calls for slamming the brakes. To resolve such conflicts, assigning importance ranking to rules (i.e., imposing a rule hierarchy) has been proposed, which, in turn, induces rankings on trajectories based on the importance of the rules they satisfy. On one hand, imposing rule hierarchies can enhance interpretability, but introduce combinatorial complexity to planning; while on the other hand, differentiable reward structures can be leveraged by modern gradient-based optimization tools, but are less interpretable and unintuitive to tune. In this paper, we present an approach to equivalently express rule hierarchies as differentiable reward structures amenable to modern gradient-based optimizers, thereby, achieving the best of both worlds. We achieve this by formulating rank-preserving reward functions that are monotonic in the rank of the trajectories induced by the rule hierarchy; i.e., higher ranked trajectories receive higher reward. Equipped with a rule hierarchy and its corresponding rank-preserving reward function, we develop a two-stage planner that can efficiently resolve conflicting planning requirements. We demonstrate that our approach can generate motion plans in ~7-10 Hz for various challenging road navigation and intersection negotiation scenarios.
translated by 谷歌翻译
Eliminating ghosting artifacts due to moving objects is a challenging problem in high dynamic range (HDR) imaging. In this letter, we present a hybrid model consisting of a convolutional encoder and a Transformer decoder to generate ghost-free HDR images. In the encoder, a context aggregation network and non-local attention block are adopted to optimize multi-scale features and capture both global and local dependencies of multiple low dynamic range (LDR) images. The decoder based on Swin Transformer is utilized to improve the reconstruction capability of the proposed model. Motivated by the phenomenal difference between the presence and absence of artifacts under the field of structure tensor (ST), we integrate the ST information of LDR images as auxiliary inputs of the network and use ST loss to further constrain artifacts. Different from previous approaches, our network is capable of processing an arbitrary number of input LDR images. Qualitative and quantitative experiments demonstrate the effectiveness of the proposed method by comparing it with existing state-of-the-art HDR deghosting models. Codes are available at https://github.com/pandayuanyu/HSTHdr.
translated by 谷歌翻译
With the drive to create a decentralized digital economy, Web 3.0 has become a cornerstone of digital transformation, developed on the basis of computing-force networking, distributed data storage, and blockchain. With the rapid realization of quantum devices, Web 3.0 is being developed in parallel with the deployment of quantum cloud computing and quantum Internet. In this regard, quantum computing first disrupts the original cryptographic systems that protect data security while reshaping modern cryptography with the advantages of quantum computing and communication. Therefore, in this paper, we introduce a quantum blockchain-driven Web 3.0 framework that provides information-theoretic security for decentralized data transferring and payment transactions. First, we present the framework of quantum blockchain-driven Web 3.0 with future-proof security during the transmission of data and transaction information. Next, we discuss the potential applications and challenges of implementing quantum blockchain in Web 3.0. Finally, we describe a use case for quantum non-fungible tokens (NFTs) and propose a quantum deep learning-based optimal auction for NFT trading to maximize the achievable revenue for sufficient liquidity in Web 3.0. In this way, the proposed framework can achieve proven security and sustainability for the next-generation decentralized digital society.
translated by 谷歌翻译
We explore unifying a neural segmenter with two-pass cascaded encoder ASR into a single model. A key challenge is allowing the segmenter (which runs in real-time, synchronously with the decoder) to finalize the 2nd pass (which runs 900 ms behind real-time) without introducing user-perceived latency or deletion errors during inference. We propose a design where the neural segmenter is integrated with the causal 1st pass decoder to emit a end-of-segment (EOS) signal in real-time. The EOS signal is then used to finalize the non-causal 2nd pass. We experiment with different ways to finalize the 2nd pass, and find that a novel dummy frame injection strategy allows for simultaneous high quality 2nd pass results and low finalization latency. On a real-world long-form captioning task (YouTube), we achieve 2.4% relative WER and 140 ms EOS latency gains over a baseline VAD-based segmenter with the same cascaded encoder.
translated by 谷歌翻译
Data depth, introduced by Tukey (1975), is an important tool in data science, robust statistics, and computational geometry. One chief barrier to its broader practical utility is that many common measures of depth are computationally intensive, requiring on the order of $n^d$ operations to exactly compute the depth of a single point within a data set of $n$ points in $d$-dimensional space. Often however, we are not directly interested in the absolute depths of the points, but rather in their \textit{relative ordering}. For example, we may want to find the most central point in a data set (a generalized median), or to identify and remove all outliers (points on the fringe of the data set with low depth). With this observation, we develop a novel and instance-adaptive algorithm for adaptive data depth computation by reducing the problem of exactly computing $n$ depths to an $n$-armed stochastic multi-armed bandit problem which we can efficiently solve. We focus our exposition on simplicial depth, developed by \citet{liu1990notion}, which has emerged as a promising notion of depth due to its interpretability and asymptotic properties. We provide general instance-dependent theoretical guarantees for our proposed algorithms, which readily extend to many other common measures of data depth including majority depth, Oja depth, and likelihood depth. When specialized to the case where the gaps in the data follow a power law distribution with parameter $\alpha<2$, we show that we can reduce the complexity of identifying the deepest point in the data set (the simplicial median) from $O(n^d)$ to $\tilde{O}(n^{d-(d-1)\alpha/2})$, where $\tilde{O}$ suppresses logarithmic factors. We corroborate our theoretical results with numerical experiments on synthetic data, showing the practical utility of our proposed methods.
translated by 谷歌翻译
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose the participants to design an efficient quantized image super-resolution solution that can demonstrate a real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do a high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 60 FPS rate when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
translated by 谷歌翻译
本文解决了利益区域(ROI)计算机断层扫描(CT)的图像重建问题。尽管基于模型的迭代方法可用于此问题,但由于乏味的参数化和缓慢的收敛性,它们的实用性通常受到限制。另外,当保留的先验不完全适合溶液空间时,可以获得不足的溶液。深度学习方法提供了一种快速的替代方法,从大型数据集中利用信息,因此可以达到高重建质量。但是,这些方法通常依赖于不考虑成像系统物理学的黑匣子,而且它们缺乏可解释性通常会感到沮丧。在两种方法的十字路口,最近都提出了展开的深度学习技术。它们将模型的物理和迭代优化算法纳入神经网络设计中,从而在各种应用中均具有出色的性能。本文介绍了一种新颖的,展开的深度学习方法,称为U-RDBFB,为ROI CT重建而设计为有限的数据。由于强大的非凸数据保真功能与稀疏性诱导正则化功能相结合,因此有效地处理了很少的截断数据。然后,嵌入在迭代重新加权方案中的块双重前向(DBFB)算法的迭代将在神经网络体系结构上展开,从而以监督的方式学习各种参数。我们的实验显示了对各种最新方法的改进,包括基于模型的迭代方案,深度学习体系结构和深度展开的方法。
translated by 谷歌翻译