In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline that takes the models from Python libraries through FPGA synthesis and implementation. Then, we review the main alternatives for the hardware implementation of nonlinear activation functions. The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the hardware complexity. The Q-factor performance is presented for a bidirectional long short-term memory combined with convolutional NN (biLSTM + CNN) equalizer, a CNN equalizer, and standard 1-StpS digital back-propagation (DBP), for both simulated and experimental propagation of a single-channel dual-polarization (SC-DP) 16QAM signal at 34 GBd over 17x70 km of LEAF. The biLSTM+CNN equalizer performs on par with DBP and yields a 1.7 dB Q-factor gain over the chromatic dispersion compensation baseline on the experimental dataset. We then assess the Q-factor and the hardware-utilization impact of approximating the NN activation functions with Taylor series, piecewise-linear, and look-up-table (LUT) approximations. We also show how to mitigate the approximation errors with extra training and provide some insights into possible gradient problems in the LUT approximation. Finally, to evaluate the complexity of a hardware implementation achieving 400G throughput, fixed-point NN-based equalizers with approximated activation functions are developed and implemented in an FPGA.
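As an illustration of the approximation strategies discussed above, the following is a minimal sketch, not the authors' implementation, of piecewise-linear and LUT approximations of tanh, as one might prototype in floating point before fixed-point conversion; the breakpoint grid, table size, and input range are assumptions for the example.

```python
import numpy as np

def tanh_pwl(x, breakpoints=np.linspace(-3, 3, 9)):
    """Piecewise-linear tanh: interpolate between samples of tanh taken
    at a small set of breakpoints (saturates outside the covered range)."""
    return np.interp(x, breakpoints, np.tanh(breakpoints))

def tanh_lut(x, n_entries=256, x_max=4.0):
    """LUT tanh: quantize the input to one of n_entries bins and return
    the precomputed table value (nearest-neighbor lookup)."""
    table = np.tanh(np.linspace(-x_max, x_max, n_entries))
    idx = np.clip(np.round((x + x_max) / (2 * x_max) * (n_entries - 1)),
                  0, n_entries - 1).astype(int)
    return table[idx]

x = np.linspace(-5, 5, 1001)
print("PWL max error:", np.max(np.abs(tanh_pwl(x) - np.tanh(x))))
print("LUT max error:", np.max(np.abs(tanh_lut(x) - np.tanh(x))))
```

On hardware, the LUT variant costs one memory read per activation, while the piecewise-linear variant trades memory for one multiply-add per segment evaluation; the approximation error either way can be partially recovered by the extra training mentioned above.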
To circumvent the non-parallelizability of recurrent neural network (RNN)-based equalizers, we propose knowledge distillation to recast the RNN into a parallelizable feedforward structure. The latter shows a 38\% decrease in latency while degrading the Q-factor by only 0.5 dB.
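A minimal sketch of the distillation step, assuming a symbol-window input of shape (batch, 41, 2); the window length, layer widths, teacher readout, and dummy data below are illustrative, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RNNTeacher(nn.Module):            # stands in for the trained biLSTM equalizer
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(2, 64, bidirectional=True, batch_first=True)
        self.out = nn.Linear(128, 2)    # readout for the center symbol (I/Q)

    def forward(self, x):               # x: (B, 41, 2)
        h, _ = self.rnn(x)
        return self.out(h[:, 20])       # estimate at the window center

teacher = RNNTeacher().eval()           # in practice: load the trained weights
student = nn.Sequential(nn.Flatten(), nn.Linear(41 * 2, 128),
                        nn.ReLU(), nn.Linear(128, 2))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

loader = [torch.randn(32, 41, 2) for _ in range(10)]   # dummy stand-in data
for x in loader:
    with torch.no_grad():
        target = teacher(x)             # teacher's soft outputs act as labels
    loss = nn.functional.mse_loss(student(x), target)
    opt.zero_grad(); loss.backward(); opt.step()
```

The key point is that the feedforward student processes all symbol windows of a block in parallel, whereas the recurrent teacher must unroll sequentially, which is where the latency saving comes from.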
In this paper, a new methodology is proposed that enables the low-complexity development of neural network (NN)-based equalizers for mitigating impairments in high-speed coherent optical transmission systems. In this work, we provide a comprehensive description and comparison of various deep model compression approaches that have been applied to feedforward and recurrent NN designs. Additionally, we evaluate the influence of these strategies on the performance of each NN equalizer. Quantization, weight clustering, pruning, and other cutting-edge strategies for model compression are considered. In this work, we propose and evaluate Bayesian-optimization-assisted compression, in which the hyperparameters of the compression are chosen to simultaneously reduce complexity and improve performance. In conclusion, the trade-off between the complexity of each compression approach and its performance is evaluated using both simulated and experimental data to complete the analysis. By utilizing the optimal compression approach, we show that it is possible to design an NN-based equalizer with better performance than the conventional digital back-propagation (DBP) equalizer with only one step per span. This is accomplished by reducing the number of multipliers used in the NN equalizer after applying weight clustering and pruning algorithms. Furthermore, we demonstrate that an NN-based equalizer can also achieve superior performance while still maintaining the same degree of complexity as the full electronic chromatic dispersion compensation block. We conclude our analysis by highlighting open questions and existing challenges, as well as future research directions.
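A hedged sketch, not the paper's pipeline, of the two compression steps named above on a single weight matrix: magnitude pruning followed by weight clustering; the matrix size, 50% sparsity, and k=16 clusters are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))

# 1) Pruning: zero out the 50% of weights with the smallest magnitude.
threshold = np.quantile(np.abs(W), 0.5)
mask = np.abs(W) >= threshold
W_pruned = W * mask

# 2) Clustering: replace the surviving weights with k=16 shared centroids,
#    so each multiplication draws its coefficient from a small codebook.
nz = W_pruned[mask].reshape(-1, 1)
km = KMeans(n_clusters=16, n_init=10, random_state=0).fit(nz)
W_clustered = np.zeros_like(W)
W_clustered[mask] = km.cluster_centers_[km.labels_, 0]

print("nonzero weights:", int(mask.sum()),
      "distinct values:", len(np.unique(W_clustered)) - 1)
```

Pruning removes multiplications outright, while clustering lets products with the same shared coefficient be reused, which is the mechanism behind the multiplier-count reduction described above; in practice each step is followed by fine-tuning to recover performance.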
In this paper, we provide a systematic approach for evaluating and comparing the computational complexity of neural network layers in digital signal processing. We provide and link four software-to-hardware complexity measures, defining how the different complexity metrics relate to the layers' hyperparameters. This paper explains how to compute these four metrics for feedforward and recurrent layers, and defines in which case a particular metric should be used, depending on whether we characterize a more software- or hardware-oriented application. One of the four metrics, newly introduced here and called the number of additions and bit shifts (NABS), targets heterogeneous quantization. NABS characterizes the impact not only of the bitwidth used in the operation but also of the type of quantization used in the arithmetic operations. We intend this work to serve as a baseline for the different levels (purposes) of complexity estimation related to the application of neural networks in real-time digital signal processing, aiming at unifying computational complexity estimation.
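A hedged illustration of layer-level complexity counting; the paper's exact NABS definition may differ, and the two-digit power-of-two weight encoding below is an assumption. For a dense layer y = Wx + b with full multipliers, each output neuron costs one multiplication per input; under shift-based (power-of-two) weight quantization, each multiplication degenerates into bit shifts plus additions.

```python
def dense_real_mults(n_in, n_out):
    return n_in * n_out                       # one multiplication per weight

def dense_real_adds(n_in, n_out, bias=True):
    return (n_in - 1 + (1 if bias else 0)) * n_out

def dense_shift_add_cost(n_in, n_out, nonzero_digits=2):
    """Toy NABS-style count: each weight is quantized to a sum of
    `nonzero_digits` signed powers of two, so each product becomes
    `nonzero_digits` shifts plus (nonzero_digits - 1) extra additions."""
    shifts = n_in * n_out * nonzero_digits
    adds = dense_real_adds(n_in, n_out) + n_in * n_out * (nonzero_digits - 1)
    return shifts + adds

print(dense_real_mults(128, 64), dense_shift_add_cost(128, 64))
```

The point of linking such metrics is that the same layer can look cheap under a multiplication count yet expensive under a shift-and-add count, so the metric must match the target (software versus hardware) of the analysis.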
Recurrent and feedforward neural network equalizers for nonlinearity compensation are implemented in an FPGA for the first time, with a complexity comparable to that of a dispersion equalizer. We demonstrate that the NN-based equalizers can outperform 1-StpS DBP.
Process monitoring and control are essential in modern industries for ensuring high quality standards and optimizing production performance. These technologies have a long history of application in production and have had numerous positive impacts, but they also hold great potential when integrated with Industry 4.0 and advanced machine learning, particularly deep learning, solutions. However, to implement these solutions in production and enable widespread adoption, the scalability and transferability of deep learning methods have become a focus of research. While transfer learning has proven successful in many cases, particularly with computer vision and homogeneous data inputs, it can be challenging to apply to heterogeneous data. Motivated by the need to transfer and standardize established processes to different, non-identical environments and by the challenge of adapting to heterogeneous data representations, this work introduces the Domain Adaptation Neural Network with Cyclic Supervision (DBACS) approach. DBACS addresses the issue of model generalization through domain adaptation, specifically for heterogeneous data, and enables the transfer and scalability of deep learning-based statistical control methods in a general manner. Additionally, the cyclic interactions between the different parts of the model enable DBACS not only to adapt to the domains but also to match them. To the best of our knowledge, DBACS is the first deep learning approach to combine adaptation and matching for heterogeneous data settings. For comparison, this work also includes subspace alignment and a multi-view learning approach that deals with heterogeneous representations by mapping data into correlated latent feature spaces. Finally, DBACS, with its ability to adapt and match, is applied to a virtual metrology use case for an etching process run on different machine types in semiconductor manufacturing.
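For concreteness, here is a hedged sketch of the classical subspace-alignment baseline mentioned above (in the style of Fernando et al.), not of DBACS itself: the top-d PCA basis of the source domain is rotated onto the target's basis so that both domains are compared in one latent space. The dimensions and synthetic data are assumptions for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
Xs = rng.normal(size=(500, 20))            # source-domain features
Xt = rng.normal(size=(400, 20)) + 0.5      # target-domain features (shifted)

d = 5
Bs = PCA(d).fit(Xs).components_.T          # (20, d) source PCA basis
Bt = PCA(d).fit(Xt).components_.T          # (20, d) target PCA basis

M = Bs.T @ Bt                              # alignment matrix between the bases
Zs = Xs @ Bs @ M                           # aligned source projection
Zt = Xt @ Bt                               # target projection
print(Zs.shape, Zt.shape)                  # both live in the same d-dim space
```

This linear baseline assumes identically shaped feature spaces, which is precisely the restriction DBACS lifts for heterogeneous data by learning the adaptation and matching end to end.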
An Anomaly Detection (AD) system for self-diagnosis has been developed for a Multiphase Flow Meter (MPFM). The system relies on machine learning algorithms for time-series forecasting: historical data were used to train a model that predicts the behavior of a sensor and, thus, detects anomalies.
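A hedged sketch of the forecasting-based self-diagnosis idea, not the deployed MPFM system: fit a simple autoregressive predictor on historical sensor data and flag samples whose forecast residual exceeds a threshold. The AR order, synthetic signal, and 3-sigma rule are assumptions for the example.

```python
import numpy as np

def fit_ar(history, p=10):
    """Least-squares fit of an order-p autoregressive predictor."""
    X = np.stack([history[i:i + p] for i in range(len(history) - p)])
    y = history[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
train = np.sin(np.arange(2000) * 0.1) + 0.05 * rng.normal(size=2000)
coef = fit_ar(train)

test = np.sin(np.arange(2000, 2300) * 0.1) + 0.05 * rng.normal(size=300)
test[150] += 1.5                            # injected anomaly
hist = np.concatenate([train[-10:], test])  # prepend context for the first windows
pred = np.array([hist[i:i + 10] @ coef for i in range(len(test))])
resid = np.abs(test - pred)
anomalies = np.where(resid > 3 * resid.std())[0]
print("anomalies at:", anomalies)
```

The design choice is that the model never sees anomalies during training, so any sample the forecaster cannot explain is, by construction, a deviation from the sensor's learned normal behavior.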
Building a quantum analog of classical deep neural networks represents a fundamental challenge in quantum computing. A key issue is how to address the inherent non-linearity of classical deep learning, a problem in the quantum domain because the composition of an arbitrary number of quantum gates, consisting of a series of sequential unitary transformations, is intrinsically linear. This problem has been variously approached in the literature, principally via the introduction of measurements between layers of unitary transformations. In this paper, we introduce the Quantum Path Kernel, a formulation of quantum machine learning capable of replicating those aspects of deep machine learning typically associated with superior generalization performance in the classical domain, specifically hierarchical feature learning. Our approach generalizes the notion of the Quantum Neural Tangent Kernel, which has been used to study the dynamics of classical and quantum machine learning models. The Quantum Path Kernel exploits the parameter trajectory, i.e., the curve delineated by the model parameters as they evolve during training, enabling the representation of differential layer-wise convergence behaviors, or the formation of hierarchical parametric dependencies, in terms of their manifestation in the gradient space of the predictor function. We evaluate our approach on variants of the classification of Gaussian XOR mixtures, an artificial but emblematic problem that intrinsically requires multilevel learning in order to achieve optimal class separation.
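To make the trajectory idea concrete, a schematic path-kernel form is sketched below; the notation is assumed here (following the classical path-kernel construction) rather than copied from the paper. The tangent kernel is evaluated at each point of the training trajectory $\theta(t)$ and accumulated along it, instead of being frozen at a single parameter value.

```latex
% Schematic form, notation assumed: f(x;\theta) is the predictor and
% \theta(t) the parameter trajectory traced out during training on [0, T].
K_{\mathrm{NTK}}^{\theta}(x, x')
  = \big\langle \nabla_{\theta} f(x;\theta),\,
                \nabla_{\theta} f(x';\theta) \big\rangle ,
\qquad
K_{\mathrm{Path}}(x, x')
  = \int_{0}^{T} K_{\mathrm{NTK}}^{\theta(t)}(x, x') \, dt .
```

Intuitively, the integral lets examples that become similar only late in training (after hierarchical features have formed) contribute to the kernel, which a fixed-parameter tangent kernel cannot capture.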
Aliasing is a highly important concept in signal processing, as careful consideration of resolution changes is essential for ensuring the transmission and processing quality of audio, images, and video. Despite this, until recently aliasing has received very little consideration in deep learning, with all common architectures carelessly sub-sampling without considering aliasing effects. In this work, we investigate the hypothesis that the existence of adversarial perturbations is due in part to aliasing in neural networks. Our ultimate goal is to increase robustness against adversarial attacks using explainable, non-trained, structural changes only, derived from aliasing first principles. Our contributions are the following. First, we establish a sufficient condition for the absence of aliasing under general image transformations. Next, we study sources of aliasing in common neural network layers and derive simple modifications from first principles to eliminate or reduce it. Lastly, our experimental results show a solid link between anti-aliasing and adversarial attacks. Simply reducing aliasing already results in more robust classifiers, and combining anti-aliasing with robust training outperforms solo robust training on $L_2$ attacks with no or minimal loss in performance on $L_{\infty}$ attacks.
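A minimal sketch of the anti-aliasing principle applied to a network layer, assuming a BlurPool-style modification: low-pass filter before subsampling instead of naive strided subsampling, so high frequencies cannot fold back into the signal. The 3x3 binomial kernel is illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPool2d(nn.Module):
    def __init__(self, channels, stride=2):
        super().__init__()
        k = torch.tensor([1., 2., 1.])
        k = torch.outer(k, k)                     # 2-D binomial low-pass kernel
        k = (k / k.sum()).view(1, 1, 3, 3).repeat(channels, 1, 1, 1)
        self.register_buffer("kernel", k)
        self.stride, self.channels = stride, channels

    def forward(self, x):
        # depthwise low-pass filtering, then subsampling by `stride`
        return F.conv2d(x, self.kernel, stride=self.stride,
                        padding=1, groups=self.channels)

x = torch.randn(1, 16, 32, 32)
print(BlurPool2d(16)(x).shape)                    # -> torch.Size([1, 16, 16, 16])
```

Note that the layer contains no trainable parameters, matching the paper's goal of robustness gains from non-trained, structural changes alone.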
The problem of generating an optimal coalition structure for a given coalition game of rational agents, i.e., finding a partition that maximizes their social welfare, is known to be NP-hard. This paper proposes GCS-Q, a novel quantum-supported solution for coalition structure generation in Induced Subgraph Games (ISGs). GCS-Q starts by considering the grand coalition as the initial coalition structure and proceeds by iteratively splitting coalitions into two non-empty subsets to obtain a coalition structure with a higher coalition value. In particular, given an $n$-agent ISG, GCS-Q solves the optimal-split problem $\mathcal{O}(n)$ times using a quantum annealing device, exploring $\mathcal{O}(2^n)$ partitions at each step. We show that GCS-Q outperforms the currently best classical solvers, with a runtime on the order of $n^2$ and an expected worst-case approximation ratio of $93\%$ on standard benchmark datasets.
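A hedged sketch of the optimal-split subproblem on an ISG: since a coalition's value is the sum of its internal edge weights, splitting it into two parts loses exactly the weight of the cut edges, so the best split minimizes the cut. The brute-force enumeration below stands in for the quantum annealer that GCS-Q dispatches this QUBO to; the toy weight matrix is an assumption.

```python
import itertools
import numpy as np

def best_split(agents, w):
    """w[i][j]: edge weight between agents i and j. Returns the two parts
    and the cut weight; an empty part means 'keep the coalition whole'."""
    best = None
    for bits in itertools.product([0, 1], repeat=len(agents)):
        # x_i + x_j - 2*x_i*x_j equals 1 exactly when edge (i, j) is cut,
        # which is the standard QUBO encoding of the cut objective
        cut = sum(w[i][j] * (bits[a] + bits[b] - 2 * bits[a] * bits[b])
                  for a, i in enumerate(agents)
                  for b, j in enumerate(agents) if a < b)
        if best is None or cut < best[0]:
            best = (cut, bits)
    cut, bits = best
    c0 = [i for a, i in enumerate(agents) if bits[a] == 0]
    c1 = [i for a, i in enumerate(agents) if bits[a] == 1]
    return c0, c1, cut

w = np.array([[0, 5, -2, 0], [5, 0, 1, -3], [-2, 1, 0, 4], [0, -3, 4, 0]])
print(best_split([0, 1, 2, 3], w))   # negative edges end up in the cut
```

Since the all-zeros assignment (cut weight 0) is always available, a split is only accepted when cutting strictly negative total weight, which mirrors the algorithm's stopping criterion of splitting only while the coalition value improves.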