We design and implement an adaptive machine learning equalizer that alternates multiple linear and nonlinear computational layers on an FPGA. On-chip training via gradient backpropagation is shown to allow for real-time adaptation to time-varying channel impairments.
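A schematic sketch of the adaptation idea described above: equalizer weights are updated online by a gradient rule so that they track a time-varying channel. The single linear tap vector, the LMS-style update, and the toy channel are assumptions made purely for illustration; the paper's FPGA design alternates linear and nonlinear layers and performs the backpropagation on chip.

```python
# Toy online gradient adaptation of a linear equalizer over a slowly varying channel.
# Not the paper's implementation; all parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_taps, mu = 7, 0.02
w = np.zeros(n_taps)
mse_trace = []

for t in range(20_000):
    h = np.array([1.0, 0.4 * np.sin(2 * np.pi * t / 5000), 0.2])  # time-varying ISI channel
    x = rng.choice([-1.0, 1.0], size=n_taps + len(h) - 1)         # training symbols
    y = np.convolve(x, h, mode="valid")                           # received window (n_taps samples)
    d = x[len(h) - 1 + n_taps // 2]                               # desired (delayed) symbol
    e = d - w @ y                                                 # output error
    w += mu * e * y                                               # gradient (LMS) update
    mse_trace.append(e ** 2)

print("MSE over first 1000 updates:", round(float(np.mean(mse_trace[:1000])), 3))
print("MSE over last 1000 updates :", round(float(np.mean(mse_trace[-1000:])), 3))
```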
In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline showing the conversion of the models from Python libraries to the FPGA chip synthesis and implementation. Then, we review the main alternatives for the hardware implementation of nonlinear activation functions. The main results are divided into three parts: a performance comparison, an analysis of how the activation functions are implemented, and a report on the hardware complexity. The performance in Q-factor is presented for the bidirectional long short-term memory coupled with convolutional NN (biLSTM + CNN) equalizer, the CNN equalizer, and standard 1-StpS digital back-propagation (DBP), for the simulated and experimental propagation of a single-channel dual-polarization (SC-DP) 16QAM signal at 34 GBd over 17x70 km of LEAF. The biLSTM+CNN equalizer provides a result similar to DBP and a 1.7 dB Q-factor gain over the chromatic dispersion compensation baseline on the experimental dataset. After that, we assess the Q-factor and the impact on hardware utilization when approximating the activation functions of the NN using Taylor series, piecewise linear, and look-up table (LUT) approximations. We also show how to mitigate the approximation errors with extra training and provide some insights into possible gradient problems in the LUT approximation. Finally, to evaluate the complexity of a hardware implementation achieving 400G throughput, fixed-point NN-based equalizers with approximated activation functions are developed and implemented in an FPGA.
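As a small illustration of the activation-function replacement discussed above, the sketch below compares a look-up-table (LUT) and a piecewise-linear (PWL) approximation of tanh. It is a floating-point NumPy sketch under assumed entry/segment counts and input range, not the paper's fixed-point FPGA design.

```python
# Compare LUT and PWL approximations of tanh; breakpoint counts are illustrative.
import numpy as np

def tanh_lut(x, n_entries=64, x_max=4.0):
    """Nearest-entry LUT approximation of tanh on [-x_max, x_max]."""
    grid = np.linspace(-x_max, x_max, n_entries)
    table = np.tanh(grid)
    idx = np.clip(np.round((x + x_max) / (2 * x_max) * (n_entries - 1)),
                  0, n_entries - 1).astype(int)
    return table[idx]

def tanh_pwl(x, n_segments=8, x_max=4.0):
    """Piecewise-linear interpolation of tanh between n_segments + 1 breakpoints."""
    xp = np.linspace(-x_max, x_max, n_segments + 1)
    return np.interp(np.clip(x, -x_max, x_max), xp, np.tanh(xp))

x = np.linspace(-5, 5, 10_001)
for name, approx in [("LUT-64", tanh_lut(x)), ("PWL-8", tanh_pwl(x))]:
    print(f"{name}: max abs error = {np.max(np.abs(approx - np.tanh(x))):.4f}")
```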
In this paper, a new methodology is proposed that allows for the low-complexity development of neural network (NN)-based equalizers for mitigating impairments in high-speed coherent optical transmission systems. We provide a comprehensive description and comparison of various deep model compression approaches that have been applied to feed-forward and recurrent NN designs, and we evaluate the influence of these strategies on the performance of each NN equalizer. Quantization, weight clustering, pruning, and other cutting-edge strategies for model compression are considered. In addition, we propose and evaluate Bayesian-optimization-assisted compression, in which the compression hyperparameters are chosen to simultaneously reduce complexity and improve performance. Finally, the trade-off between the complexity of each compression approach and its performance is evaluated using both simulated and experimental data to complete the analysis. By utilizing the optimal compression approach, we show that it is possible to design an NN-based equalizer that outperforms the conventional digital back-propagation (DBP) equalizer with only one step per span. This is achieved by reducing the number of multipliers used in the NN equalizer after applying weight clustering and pruning algorithms. We further demonstrate that an NN-based equalizer can also achieve superior performance while keeping the same complexity as the full electronic chromatic dispersion compensation block. We conclude the analysis by highlighting open questions and existing challenges, as well as future research directions.
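A minimal NumPy sketch of two of the compression steps named above, magnitude pruning followed by weight clustering, is given below. The layer size, sparsity target, and cluster count are illustrative assumptions, and the Bayesian-optimization step that tunes such hyperparameters is not shown.

```python
# Magnitude pruning followed by weight clustering of one dense layer (illustrative).
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))            # a dense NN layer's weight matrix

# 1) Magnitude pruning: zero out the smallest |w| up to the target sparsity.
sparsity = 0.6
threshold = np.quantile(np.abs(W), sparsity)
W_pruned = np.where(np.abs(W) < threshold, 0.0, W)

# 2) Weight clustering: replace surviving weights by one of k shared values
#    (a few Lloyd iterations), so multiplications can be shared or looked up.
k = 16
nonzero = W_pruned[W_pruned != 0]
centroids = np.quantile(nonzero, np.linspace(0, 1, k))
for _ in range(20):
    assign = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
    for j in range(k):
        if np.any(assign == j):
            centroids[j] = nonzero[assign == j].mean()

assign = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
W_clustered = W_pruned.copy()
W_clustered[W_pruned != 0] = centroids[assign]

print("unique multiplier values:", np.unique(W_clustered).size,
      "| sparsity:", round(float(np.mean(W_clustered == 0)), 2))
```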
We investigate the potential of adaptive blind equalizers based on variational inference for carrier recovery in optical communications. These equalizers are based on a low-complexity approximation of maximum-likelihood channel estimation. We generalize the concept of variational autoencoder (VAE) equalizers to higher-order modulation formats including probabilistic constellation shaping (PCS), which is ubiquitous in optical communications, to oversampling at the receiver, and to dual-polarization transmission. Besides a black-box equalizer based on convolutional neural networks, we propose a model-based equalizer built on a linear butterfly filter and train the filter coefficients using the variational inference paradigm. As a byproduct, the VAE also provides a reliable channel estimate. We analyze the performance and flexibility of the VAE over a classical additive white Gaussian noise (AWGN) channel with inter-symbol interference (ISI) and over a dispersive linear optical dual-polarization channel. We show that it can extend the application range of blind adaptive equalizers by outperforming the state-of-the-art constant modulus algorithm (CMA), both for fixed stationary channels and for channels that vary over time. The evaluation is accompanied by a hyperparameter analysis.
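The following toy PyTorch sketch illustrates the VAE-style blind equalization idea for real-valued BPSK over a short ISI channel: a linear equalizer acts as the encoder producing symbol posteriors, a learned FIR filter acts as the decoder/channel, and both are trained to reconstruct the received samples without knowledge of the transmitted symbols. The channel, the simplified ELBO-like loss, and all parameters are assumptions for illustration, not the paper's dual-polarization implementation.

```python
# Toy blind VAE-style equalizer for BPSK over an ISI channel (illustrative only).
import torch

torch.manual_seed(0)
N = 4096
h_true = torch.tensor([1.0, 0.5, 0.2])                      # assumed ISI channel (unknown to receiver)
x = torch.randint(0, 2, (N,)).float() * 2 - 1               # hidden BPSK symbols
y = torch.nn.functional.conv1d(x.view(1, 1, -1), h_true.view(1, 1, -1),
                               padding=1).view(-1) + 0.05 * torch.randn(N)

eq = torch.nn.Conv1d(1, 1, kernel_size=11, padding=5, bias=False)   # encoder: linear equalizer
h_est = torch.nn.Parameter(torch.tensor([1.0, 0.0, 0.0]))           # decoder: surrogate channel
opt = torch.optim.Adam(list(eq.parameters()) + [h_est], lr=1e-2)

for step in range(400):
    logits = eq(y.view(1, 1, -1)).view(-1)
    p = torch.sigmoid(logits)                   # q(x = +1 | y)
    x_mean, x_var = 2 * p - 1, 4 * p * (1 - p)  # posterior mean and variance per symbol
    y_hat = torch.nn.functional.conv1d(x_mean.view(1, 1, -1),
                                       h_est.view(1, 1, -1), padding=1).view(-1)
    # crude ELBO-style objective: reconstruction error plus a variance penalty
    # through the surrogate channel (prior/entropy terms omitted for brevity)
    loss = ((y - y_hat) ** 2).mean() + x_var.mean() * (h_est ** 2).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

# evaluate up to the sign/delay ambiguity inherent to blind equalization
xm = x_mean.detach()
best = 1.0
for d in range(-3, 4):
    for s in (1.0, -1.0):
        best = min(best, ((torch.roll(xm, d) * s * x) < 0).float().mean().item())
print(f"blind symbol error rate (best over sign/delay): {best:.3f}")
```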
Recurrent and feedforward neural network equalizers for nonlinearity compensation are implemented in an FPGA for the first time, with a complexity comparable to that of a dispersion equalizer. We demonstrate that the NN-based equalizers can outperform DBP with one step per span.
Orthogonal frequency division multiplexing (OFDM) has been widely applied in current communication systems. Artificial intelligence (AI)-aided OFDM receivers are currently being brought to the forefront to replace and improve traditional OFDM receivers. In this study, we first compare two AI-aided OFDM receivers, a data-driven fully connected deep neural network and a model-driven ComNet, through extensive simulation and real-time video transmission using a 5G rapid prototyping system for over-the-air (OTA) testing. We find a performance gap between the simulation and the OTA test caused by the discrepancy between the channel model used for offline training and the real environment. We develop a novel online training system, which we call the SwitchNet receiver, to address this issue. The receiver has a flexible and extendable architecture and can adapt to real channels by training only a few parameters online. The OTA tests show that the AI-aided OFDM receivers, especially the SwitchNet receiver, are robust to real environments and promising for future communication systems. We discuss potential challenges and future research arising from this initial study.
We propose a novel framework for designing multiplierless kernel machines that can be used on resource-constrained platforms such as intelligent edge devices. The framework uses a piecewise linear (PWL) approximation based on a margin propagation (MP) technique and requires only addition/subtraction, shift, comparison, and register underflow/overflow operations. We propose a hardware-based MP inference and online training algorithm that has been optimized for a field-programmable gate array (FPGA) platform. Our FPGA implementation eliminates the need for DSP units and reduces the number of LUTs. By reusing the same hardware for inference and training, we show that the platform can overcome the classification errors and local minima resulting from the MP approximation. On the FPGA, the proposed multiplierless MP-kernel machine achieves an estimated energy consumption of 13.4 pJ and a power consumption of 107 mW, with ~9k LUTs and FFs each for a 256 x 32 sized kernel, comparing favorably in power and area with other comparable implementations.
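A sketch of the margin propagation (MP) primitive referred to above is shown below, under the common formulation that MP computes the value z satisfying sum_i max(x_i - z, 0) = gamma, a piecewise-linear surrogate for log-sum-exp. The bisection solver here (additions, comparisons, and halvings) is an illustrative assumption and not necessarily the paper's hardware algorithm.

```python
# Margin propagation via bisection: only add/subtract, compare, and halve.
import numpy as np

def margin_propagation(x, gamma, n_iter=32):
    """Solve sum_i max(x_i - z, 0) = gamma for z (root is bracketed by construction)."""
    lo, hi = float(np.min(x) - gamma), float(np.max(x))
    for _ in range(n_iter):
        z = (lo + hi) / 2                       # halving = shift in fixed point
        if np.sum(np.maximum(x - z, 0.0)) > gamma:
            lo = z                              # constraint sum still too large -> raise z
        else:
            hi = z
    return (lo + hi) / 2

x = np.array([0.3, 1.2, -0.7, 2.0])
gamma = 1.0
z = margin_propagation(x, gamma)
print("MP output z =", round(z, 4),
      "| log-sum-exp for comparison:", round(float(np.log(np.sum(np.exp(x)))), 4))
```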
Due to the rapid development of autonomous driving, the Internet of Things, and streaming services, modern communication systems have to cope with varying channel conditions and a steadily rising number of users and devices. This, together with the still rising bandwidth demands, can only be met by intelligent network automation, which requires highly flexible and blind transceiver algorithms. To tackle these challenges, we propose a novel adaptive equalization scheme that exploits the flourishing advances in deep learning by training the equalizer with an adversarial network. The learning is based only on the statistics of the transmit signal, so it is blind with respect to the actually transmitted symbols and agnostic to the channel model. The proposed approach is independent of the equalizer topology and enables the application of powerful neural-network-based equalizers. In this work, we prove the concept in simulations of both linear and nonlinear transmission channels and demonstrate the capability of the proposed blind learning scheme to approach the performance of non-blind equalizers. Furthermore, we provide a theoretical perspective and highlight the challenges of the approach.
With an ever-growing number of parameters defining increasingly complex networks, Deep Learning has led to several breakthroughs surpassing human performance. As a result, data movement for these millions of model parameters causes a growing imbalance known as the memory wall. Neuromorphic computing is an emerging paradigm that confronts this imbalance by performing computations directly in analog memories. On the software side, the sequential Backpropagation algorithm prevents efficient parallelization and thus fast convergence. A novel method, Direct Feedback Alignment, resolves inherent layer dependencies by directly passing the error from the output to each layer. At the intersection of hardware/software co-design, there is a demand for developing algorithms that are tolerant of hardware nonidealities. Therefore, this work explores the interrelationship of implementing bio-plausible learning in situ on neuromorphic hardware, emphasizing energy, area, and latency constraints. Using the benchmarking framework DNN+NeuroSim, we investigate the impact of hardware nonidealities and quantization on algorithm performance, as well as how network topologies and algorithm-level design choices scale the latency, energy, and area consumption of a chip. To the best of our knowledge, this work is the first to compare the impact of different learning algorithms on Compute-In-Memory-based hardware and vice versa. The best accuracy results remain Backpropagation-based, notably when facing hardware imperfections. Direct Feedback Alignment, on the other hand, allows for significant speedup due to parallelization, reducing training time by a factor approaching N for N-layered networks.
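A minimal NumPy sketch of Direct Feedback Alignment as described above is given below: the output error reaches the hidden layer through a fixed random feedback matrix instead of the transposed forward weights, so no backward pass through downstream weights is needed. The toy regression task and all dimensions are assumptions, not the DNN+NeuroSim benchmark setup.

```python
# Direct Feedback Alignment (DFA) on a toy two-layer regression network.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, d_out, n = 20, 64, 5, 512
X = rng.normal(size=(n, d_in))
W_true = rng.normal(size=(d_in, d_out))
Y = np.tanh(X @ W_true)                        # targets for a toy regression task

W1 = rng.normal(scale=0.1, size=(d_in, d_h))
W2 = rng.normal(scale=0.1, size=(d_h, d_out))
B1 = rng.normal(scale=0.1, size=(d_out, d_h))  # fixed random feedback matrix

lr = 0.05
for epoch in range(200):
    h = np.tanh(X @ W1)                        # forward pass
    y_hat = h @ W2
    e = y_hat - Y                              # output error
    dh = (e @ B1) * (1 - h ** 2)               # DFA: project error straight to the hidden layer
    W2 -= lr * h.T @ e / n
    W1 -= lr * X.T @ dh / n

print("final MSE:", round(float(np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2)), 4))
```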
Machine learning algorithms have recently been considered for many tasks in the field of wireless communications. Previously, we proposed the use of a deep convolutional neural network (CNN) for receiver processing and showed that it can provide considerable performance gains. In this study, we focus on machine learning algorithms for the transmitter. In particular, we consider beamforming and propose a CNN that, given an uplink channel estimate as input, outputs the downlink channel information to be used for beamforming. The CNN is trained in a supervised manner, considering both the uplink and downlink transmissions with a loss function based on the UE receiver performance. The main task of the neural network is to predict the channel evolution between the uplink and downlink slots, but it can also learn to handle inefficiencies and errors in the whole chain, including the actual beamforming phase. The provided numerical experiments demonstrate improved beamforming performance.
Front-end electronics equipped with high-speed digitizers are being used and proposed for future nuclear detectors. Recent literature shows that deep learning models, especially one-dimensional convolutional neural networks, are promising when processing digital signals from nuclear detectors. Simulations and experiments demonstrate the satisfactory accuracy and additional benefits of neural networks in this area. However, hardware-accelerated online operation with dedicated circuits still needs to be studied. In this work, we introduce PulseDL-II, a system-on-chip (SoC) specially designed for applications of event feature (time, energy, etc.) extraction from pulses with deep learning. Based on the previous version, PulseDL-II incorporates a RISC CPU into the system structure for better functional flexibility and completeness. The neural network accelerator in the SoC adopts a three-level (arithmetic unit, processing element, neural network) hierarchical architecture and facilitates the parameter optimization of the digital design. Furthermore, we devise a quantization scheme and associated implementation methods (rescale and bit-shift) that are fully compatible with deep learning frameworks (e.g., TensorFlow) within a selected subset of layer types. With the current scheme, quantization-aware training of neural networks is supported, and network models are automatically converted into software for the RISC CPU by dedicated scripts, with nearly no loss of accuracy. We validate PulseDL-II on field-programmable gate arrays (FPGAs). Finally, the system is validated with an experimental setup composed of a direct digital synthesis (DDS) signal generator and an FPGA development board with an analog-to-digital converter (ADC). The proposed system achieves a time resolution of 60 ps and an energy resolution of 0.40% with online neural network inference at a signal-to-noise ratio (SNR) of 47.4 dB.
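The sketch below illustrates the kind of rescale-and-bit-shift quantization arithmetic mentioned above: int8 operands are multiplied in integer arithmetic and the int32 accumulator is requantized to the output scale with a fixed-point multiplier and a right shift, so no floating point is needed at inference time. Scales, bit widths, and shapes are assumptions for illustration, not the PulseDL-II design.

```python
# Int8 dot product with rescale-and-shift requantization (illustrative).
import numpy as np

def quantize(x, scale):
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

rng = np.random.default_rng(0)
x = rng.normal(size=16).astype(np.float32)
w = rng.normal(size=16).astype(np.float32)
sx, sw, sy = 0.05, 0.02, 0.1                               # input/weight/output scales
xq, wq = quantize(x, sx), quantize(w, sw)

acc = np.sum(xq.astype(np.int32) * wq.astype(np.int32))    # int32 accumulator

# requantize acc to the output scale: y = acc * (sx*sw/sy), realized as
# (acc * M) >> shift with an integer multiplier M
shift = 16
M = int(round(sx * sw / sy * (1 << shift)))
yq = np.clip((acc * M) >> shift, -128, 127)

print("float reference  :", round(float(np.dot(x, w)), 3))
print("dequantized int8 :", float(yq) * sy)
```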
Spiking neural networks (SNNs) offer a new computational paradigm capable of highly parallelized, real-time processing. Photonic devices are ideal for designing high-bandwidth, parallel architectures that match the SNN computational paradigm. The co-integration of CMOS and photonic elements allows low-loss photonic devices to be combined with analog electronics, giving greater flexibility for nonlinear computational elements. We therefore designed and simulated an optoelectronic spiking neuron circuit on a monolithic silicon photonics (SiPh) process that replicates useful spiking behaviors beyond the leaky integrate-and-fire (LIF) model. In addition, we explored two learning algorithms with the potential for on-chip learning using Mach-Zehnder interferometer (MZI) meshes as synaptic interconnects. A variant of random backpropagation (RPB) was demonstrated experimentally and matched the performance of standard linear regression on a simple classification task. Meanwhile, a contrastive Hebbian learning (CHL) rule was applied to a simulated neural network composed of MZI meshes for a random input-output mapping task. The CHL-trained MZI network performed better than random guessing but did not reach the performance of an ideal neural network (without the constraints imposed by the MZI meshes). Through these efforts, we demonstrate that co-integrated CMOS and SiPh technologies are well suited to the design of scalable SNN computing architectures.
We consider the application of the factor graph framework for symbol detection on linear inter-symbol interference channels. Based on the Ungerboeck observation model, a detection algorithm with appealing complexity properties can be derived. However, since the underlying factor graph contains cycles, the sum-product algorithm (SPA) yields a suboptimal algorithm. In this paper, we develop and evaluate efficient strategies to improve the performance of factor-graph-based symbol detection by means of neural enhancement. In particular, we consider neural belief propagation and generalizations of the factor nodes as effective ways to mitigate the effect of cycles within the factor graph. By applying a generic preprocessor to the channel output, we propose a simple technique to vary the underlying factor graph in every SPA iteration. With this dynamic factor graph transition, we intend to preserve the extrinsic nature of the SPA messages, which is otherwise impaired due to the cycles. Simulation results show that the proposed methods can massively improve the detection performance and even approach maximum a posteriori performance for various transmission scenarios, while preserving a complexity that is linear in both the block length and the channel memory.
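A compact sketch of sum-product symbol detection on the Ungerboeck model is given below for a toy BPSK/ISI channel, with a fixed per-message weight standing in for the trainable weights of neural belief propagation (learned in the paper). The channel, noise level, and weight value are illustrative assumptions.

```python
# Loopy sum-product detection on the pairwise Ungerboeck factor graph (BPSK, toy ISI channel).
import numpy as np

rng = np.random.default_rng(1)
h = np.array([1.0, 0.6, 0.3])                  # ISI channel impulse response
N, sigma = 256, 0.4
x = rng.choice([-1.0, 1.0], size=N)
y = np.convolve(x, h)[:N] + sigma * rng.normal(size=N)

# Ungerboeck statistics: matched-filter output z and channel autocorrelation G
H = np.zeros((N, N))
for k, hk in enumerate(h):
    H += np.diag(np.full(N - k, hk), -k)
z = H.T @ y
G = H.T @ H

L_mem = len(h) - 1
neighbors = [[m for m in range(max(0, n - L_mem), min(N, n + L_mem + 1)) if m != n]
             for n in range(N)]
msg = np.zeros((N, N))                         # msg[m, n]: LLR message from m to n
unary = 2 * z / sigma**2                       # LLR contribution of the unary factors
w_msg = 0.7                                    # message weight (learned in neural BP)

for _ in range(10):
    new_msg = np.zeros_like(msg)
    for m in range(N):
        for n in neighbors[m]:
            # aggregate incoming information at m, excluding what came from n
            L_m = unary[m] + sum(msg[k, m] for k in neighbors[m] if k != n)
            J = -G[m, n] / sigma**2            # pairwise coupling of the Ungerboeck factor
            # exact binary SPA message through the pairwise factor, in LLR form
            new_msg[m, n] = w_msg * (np.logaddexp(J + L_m, -J) - np.logaddexp(-J + L_m, J))
    msg = new_msg

llr = unary + msg.sum(axis=0)
x_hat = np.where(llr >= 0, 1.0, -1.0)
print("SPA detector BER:", float(np.mean(x_hat != x)))
```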
Nonlinear models are known to provide excellent performance in real-world applications that often operate under non-ideal conditions. However, such applications often require online processing with limited computational resources. To address this problem, we propose a new class of efficient nonlinear models for online applications. The proposed algorithms are based on linear-in-the-parameters (LIP) nonlinear filters using functional link expansions. To make this class of functional link adaptive filters (FLAFs) efficient, we propose low-complexity expansions and frequency-domain adaptation of the parameters. Within this family of algorithms, we also define the partitioned frequency-domain FLAF, whose implementation is particularly suitable for online nonlinear modeling problems. We assess and compare frequency-domain FLAFs with different expansions, thus providing the best possible trade-off between performance and computational complexity. Experimental results prove that the proposed algorithms can be considered an efficient solution for online applications, such as acoustic echo cancellation, even in the presence of adverse nonlinear conditions and limited availability of computational resources.
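A minimal time-domain sketch of a functional link adaptive filter is shown below: the input frame is expanded with trigonometric functional links and the resulting linear-in-the-parameters filter is adapted with NLMS on a toy nonlinear echo path. The paper's low-complexity frequency-domain adaptation is not reproduced here; the expansion order, step size, and echo model are assumptions.

```python
# Time-domain FLAF with trigonometric functional links, adapted by NLMS.
import numpy as np

rng = np.random.default_rng(0)
n, M = 20_000, 8                               # samples, filter memory
x = rng.normal(size=n)
echo_path = rng.normal(scale=0.3, size=M)
d = np.convolve(np.tanh(1.5 * x), echo_path)[:n] + 0.01 * rng.normal(size=n)  # nonlinear echo

def flink(u):
    """Functional link expansion of one input frame: [u, sin(pi u), cos(pi u)]."""
    return np.concatenate([u, np.sin(np.pi * u), np.cos(np.pi * u)])

w = np.zeros(3 * M)
mu, eps = 0.5, 1e-6
err = np.zeros(n)
for i in range(M, n):
    u = flink(x[i - M:i][::-1])                # expanded regressor
    e = d[i] - w @ u
    w += mu * e * u / (u @ u + eps)            # NLMS update
    err[i] = e

print("echo power (last 1000 samples)    : %.2e" % np.mean(d[-1000:] ** 2))
print("residual power (last 1000 samples): %.2e" % np.mean(err[-1000:] ** 2))
```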
In conventional multi-user multiple-input multiple-output (MU-MIMO) systems with frequency-division duplexing (FDD), the channel acquisition and precoder optimization processes have been designed separately, although they are highly coupled. This paper studies an end-to-end design of downlink MU-MIMO systems that encompasses pilot sequences, limited feedback, and precoding. To address this problem, we propose a novel deep learning (DL) framework that jointly optimizes the feedback information generation at the users and the precoder design at the base station (BS). Each procedure in the MU-MIMO system is replaced by intelligently designed multiple deep neural network (DNN) units. At the BS, a neural network generates the pilot sequences and helps the users obtain accurate channel state information. At each user, the channel feedback operation is carried out in a distributed manner by an individual user DNN. Another BS DNN then collects the feedback information from the users and determines the MIMO precoding matrices. A joint training algorithm is proposed to optimize all DNN units in an end-to-end manner. In addition, a training strategy that avoids retraining for different network sizes is proposed for a scalable design. Numerical results demonstrate the effectiveness of the proposed DL framework compared with classical optimization techniques and other conventional DNN schemes.
Machine learning (ML) has attracted great research interest for physical layer design problems, such as channel estimation, thanks to its low complexity and robustness. Channel estimation via ML requires model training on a dataset, which usually includes the received pilot signals as input and the channel data as output. In previous works, model training has mainly been performed via centralized learning (CL), in which the whole training dataset is collected from the users at the base station (BS). This approach introduces a huge communication overhead for data collection. In this paper, to address this challenge, we propose a federated learning (FL) framework for channel estimation. We design a convolutional neural network (CNN) trained on the local datasets of the users without sending them to the BS. We develop FL-based channel estimation schemes for both conventional and RIS (intelligent reflecting surface)-assisted massive MIMO (multiple-input multiple-output) systems, where a single CNN is trained on two different datasets for the two scenarios. We evaluate the performance for noisy and quantized model transmission and show that the proposed approach provides approximately 16 times lower overhead than CL, while maintaining satisfactory performance close to CL. Furthermore, the proposed architecture exhibits lower estimation error than state-of-the-art ML-based schemes.
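A schematic federated-averaging sketch of the idea above is given below: each user trains the shared channel estimator on its local (pilot, channel) pairs and only model updates travel to the BS, which averages them. A linear estimator and synthetic data stand in for the paper's CNN and channel models; all dimensions and hyperparameters are assumptions.

```python
# FedAvg-style training of a shared channel estimator on user-local data (illustrative).
import numpy as np

rng = np.random.default_rng(0)
n_users, n_local, d_pilot, d_channel = 8, 200, 32, 16
W_true = rng.normal(size=(d_pilot, d_channel))

# each user's local dataset: received pilots X_k and corresponding channels Y_k
local_data = []
for _ in range(n_users):
    X = rng.normal(size=(n_local, d_pilot))
    Y = X @ W_true + 0.05 * rng.normal(size=(n_local, d_channel))
    local_data.append((X, Y))

W_global = np.zeros((d_pilot, d_channel))
lr, local_epochs = 0.05, 5
for rnd in range(30):                          # communication rounds
    updates = []
    for X, Y in local_data:                    # local training at each user
        W = W_global.copy()
        for _ in range(local_epochs):
            grad = X.T @ (X @ W - Y) / n_local # MSE gradient on local data
            W -= lr * grad
        updates.append(W)
    W_global = np.mean(updates, axis=0)        # BS aggregates the user models (FedAvg)

test_X = rng.normal(size=(1000, d_pilot))
nmse = np.mean((test_X @ W_global - test_X @ W_true) ** 2) / np.mean((test_X @ W_true) ** 2)
print(f"NMSE of federated channel estimator: {nmse:.4f}")
```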
Deep neural networks (DNNs) recently emerged as a promising tool for analyzing and solving complex differential equations arising in science and engineering applications. Alternative to traditional numerical schemes, learning-based solvers utilize the representation power of DNNs to approximate the input-output relations in an automated manner. However, the lack of physics-in-the-loop often makes it difficult to construct a neural network solver that simultaneously achieves high accuracy, low computational burden, and interpretability. In this work, focusing on a class of evolutionary PDEs characterized by having decomposable operators, we show that the classical ``operator splitting'' numerical scheme of solving these equations can be exploited to design neural network architectures. This gives rise to a learning-based PDE solver, which we name Deep Operator-Splitting Network (DOSnet). Such non-black-box network design is constructed from the physical rules and operators governing the underlying dynamics, contains learnable parameters, and is thus more flexible than the standard operator splitting scheme. Once trained, it enables the fast solution of the same type of PDEs. To validate the special structure inside DOSnet, we take the linear PDEs as the benchmark and give the mathematical explanation for the weight behavior. Furthermore, to demonstrate the advantages of our new AI-enhanced PDE solver, we train and validate it on several types of operator-decomposable differential equations. We also apply DOSnet to nonlinear Schr\"odinger equations (NLSE) which have important applications in the signal processing for modern optical fiber transmission systems, and experimental results show that our model has better accuracy and lower computational complexity than numerical schemes and the baseline DNNs.
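Since DOSnet mirrors the classical operator-splitting scheme, the sketch below shows a split-step (symmetric Strang splitting) solver for a simple NLSE of the form dA/dz = -i(beta2/2) d^2A/dt^2 + i*gamma*|A|^2*A; in DOSnet these alternating linear/nonlinear steps become layers with learnable parameters. The fiber parameters and input pulse are illustrative assumptions.

```python
# Classical split-step Fourier (operator splitting) solver for a toy NLSE.
import numpy as np

def split_step_nlse(A0, dt, dz, n_steps, beta2=-21e-27, gamma=1.3e-3):
    """Symmetric splitting: half linear step, full nonlinear step, half linear step."""
    n = A0.size
    w = 2 * np.pi * np.fft.fftfreq(n, d=dt)             # angular frequencies
    half_linear = np.exp(1j * (beta2 / 2) * w**2 * dz / 2)
    A = A0.astype(complex)
    for _ in range(n_steps):
        A = np.fft.ifft(half_linear * np.fft.fft(A))    # linear (dispersion) half-step
        A = A * np.exp(1j * gamma * np.abs(A)**2 * dz)  # nonlinear (Kerr) full step
        A = np.fft.ifft(half_linear * np.fft.fft(A))    # linear half-step
    return A

# toy Gaussian pulse propagated over 10 km in 100 steps (illustrative parameters)
t = np.linspace(-100e-12, 100e-12, 2**12)
A0 = np.sqrt(1e-3) * np.exp(-t**2 / (2 * (10e-12)**2))
A_out = split_step_nlse(A0, dt=t[1] - t[0], dz=100.0, n_steps=100)
print("input / output peak power (W): %.2e / %.2e"
      % (np.max(np.abs(A0)**2), np.max(np.abs(A_out)**2)))
```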
The ever-growing deep learning technologies are making revolutionary changes for modern life. However, conventional computing architectures are designed to process sequential and digital programs, being extremely burdened with performing massive parallel and adaptive deep learning applications. Photonic integrated circuits provide an efficient approach to mitigate bandwidth limitations and power-wall brought by its electronic counterparts, showing great potential in ultrafast and energy-free high-performance computing. Here, we propose an optical computing architecture enabled by on-chip diffraction to implement convolutional acceleration, termed optical convolution unit (OCU). We demonstrate that any real-valued convolution kernels can be exploited by OCU with a prominent computational throughput boosting via the concept of structural re-parameterization. With OCU as the fundamental unit, we build an optical convolutional neural network (oCNN) to implement two popular deep learning tasks: classification and regression. For classification, Fashion-MNIST and CIFAR-4 datasets are tested with accuracy of 91.63% and 86.25%, respectively. For regression, we build an optical denoising convolutional neural network (oDnCNN) to handle Gaussian noise in grayscale images with noise level {\sigma} = 10, 15, 20, resulting in clean images with average PSNR of 31.70dB, 29.39dB and 27.72dB, respectively. The proposed OCU presents remarkable performance of low energy consumption and high information density due to its fully passive nature and compact footprint, providing a highly parallel while lightweight solution for future computing architecture to handle high dimensional tensors in deep learning.
To mitigate the effects of shadow fading and obstacle blocking, reconfigurable intelligent surfaces (RIS) have become a promising technology for improving the signal transmission quality of wireless communications by controlling reconfigurable passive elements with less hardware cost and lower power consumption. However, accurate, low-latency, and low-pilot-overhead acquisition of channel state information (CSI) remains a considerable challenge in RIS-assisted systems because of the large number of RIS passive elements. In this paper, we propose a three-stage joint channel decomposition and prediction framework to acquire the CSI. The proposed framework exploits the two-timescale property that the base station (BS)-RIS channel is quasi-static while the RIS-user equipment (UE) channel is fast time-varying. Specifically, in the first stage, we use full-duplex technology to estimate the channel between a specific antenna of the BS and the RIS, which addresses the critical scaling ambiguity problem in channel decomposition. We then design a novel deep neural network, namely the sparsely connected long short-term memory (SCLSTM) network, and propose SCLSTM-based algorithms for the second and third stages, respectively. The algorithms can simultaneously decompose the BS-RIS and RIS-UE channels from the cascaded channel and capture the temporal relationship of the RIS-UE channel for prediction. Simulation results show that the proposed framework has lower pilot overhead than traditional channel estimation algorithms, and that the proposed SCLSTM-based algorithms achieve more accurate CSI acquisition robustly and effectively.
Machine learning methods have revolutionized the discovery process of new molecules and materials. However, the intensive training process of neural networks for molecules with ever-increasing complexity has resulted in exponential growth in computation cost, leading to long simulation time and high energy consumption. Photonic chip technology offers an alternative platform for implementing neural networks with faster data processing and lower energy usage compared to digital computers. Photonics technology is naturally capable of implementing complex-valued neural networks at no additional hardware cost. Here, we demonstrate the capability of photonic neural networks for predicting the quantum mechanical properties of molecules. To the best of our knowledge, this work is the first to harness photonic technology for machine learning applications in computational chemistry and molecular sciences, such as drug discovery and materials design. We further show that multiple properties can be learned simultaneously in a photonic chip via a multi-task regression learning algorithm, which is also a first of its kind, as most previous works focus on implementing networks for classification tasks.