Mask-based lensless cameras can be flat, thin, and lightweight, which makes them suitable for novel designs of computational imaging systems with large surface areas and arbitrary shapes. Despite recent progress in lensless cameras, the quality of images recovered from a lensless camera is often poor due to the ill-conditioning of the underlying measurement system. In this paper, we propose to use coded illumination to improve the quality of images reconstructed with lensless cameras. In our imaging model, the scene/object is illuminated by multiple coded illumination patterns while the lensless camera records the sensor measurements. We designed and tested a number of illumination patterns and observed that shifted (and related orthogonal) patterns provide the best overall performance. We propose a fast and low-complexity recovery algorithm that exploits the separability and block-diagonal structure in our system. We present simulation results and hardware experiment results to demonstrate that the proposed method can significantly improve reconstruction quality.
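The separable, block-diagonal structure mentioned above can be caricatured with a minimal sketch: if the forward model factors as Y = P X Qᵀ, working in the SVD bases of P and Q reduces 2D recovery to elementwise (block-diagonal) division. Everything here (sizes, matrices, the Tikhonov solver) is an illustrative assumption, not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical separable forward model: Y = P @ X @ Q.T + noise, where P and Q
# are 1D transfer matrices combining mask and illumination effects.
n = 32
P = rng.standard_normal((n, n))
Q = rng.standard_normal((n, n))
X = rng.random((n, n))                      # unknown scene
Y = P @ X @ Q.T + 1e-3 * rng.standard_normal((n, n))

# Tikhonov-regularized inversion exploiting separability: in the SVD bases of
# P and Q the 2D operator is diagonal, so recovery is elementwise division.
Up, sp, Vpt = np.linalg.svd(P)
Uq, sq, Vqt = np.linalg.svd(Q)
lam = 1e-6
B = Up.T @ Y @ Uq                           # measurements in the transformed domain
S = np.outer(sp, sq)                        # singular values of the 2D operator
X_hat = Vpt.T @ (S * B / (S**2 + lam)) @ Vqt

print(np.linalg.norm(X_hat - X) / np.linalg.norm(X))
```

Because the inversion never forms the full n²-by-n² system matrix, it runs in O(n³) instead of O(n⁶), which is the kind of saving a fast separable recovery algorithm relies on.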
Lensless cameras are a class of imaging devices that shrink the physical dimensions to the very close vicinity of the image sensor by replacing conventional compound lenses with integrated flat optics and computational algorithms. Here we report a diffractive lensless camera with spatially-coded Voronoi-Fresnel phase to achieve superior image quality. We propose a design principle of maximizing the acquired information in optics to facilitate the computational reconstruction. By introducing an easy-to-optimize Fourier domain metric, Modulation Transfer Function volume (MTFv), which is related to the Strehl ratio, we devise an optimization framework to guide the optimization of the diffractive optical element. The resulting Voronoi-Fresnel phase features an irregular array of quasi-Centroidal Voronoi cells containing a base first-order Fresnel phase function. We demonstrate and verify the imaging performance for photography applications with a prototype Voronoi-Fresnel lensless camera on a 1.6-megapixel image sensor in various illumination conditions. Results show that the proposed design outperforms existing lensless cameras, and could benefit the development of compact imaging systems that work in extreme physical conditions.
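The Fourier-domain metric above can be illustrated numerically: one simple proxy for an MTF "volume" integrates the normalized MTF (the magnitude of the PSF's Fourier transform) over all sampled frequencies, so tighter PSFs score higher. This is only a sketch in the spirit of the paper's MTFv; the exact definition may differ, and the Gaussian PSFs and grid size are assumptions.

```python
import numpy as np

# Toy Fourier-domain focusing metric: integrate the normalized MTF of a PSF.
def mtf_volume(psf):
    otf = np.fft.fft2(psf / psf.sum())   # normalize so MTF(0) = 1
    return np.abs(otf).sum()

yy, xx = np.mgrid[-16:16, -16:16]

def gaussian_psf(sigma):
    return np.exp(-(xx**2 + yy**2) / (2 * sigma**2))

# A tighter PSF passes more contrast at high spatial frequencies, so it has a
# larger MTF volume -- the property an optimizer can maximize over the optic.
print(mtf_volume(gaussian_psf(1.0)), mtf_volume(gaussian_psf(4.0)))
```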
Spatially varying spectral modulation can be implemented using a liquid crystal spatial light modulator (SLM) since it provides an array of liquid crystal cells, each of which can be purposed to act as a programmable spectral filter array. However, such an optical setup suffers from strong optical aberrations due to the unintended phase modulation, precluding spectral modulation at high spatial resolutions. In this work, we propose a novel computational approach for the practical implementation of phase SLMs for implementing spatially varying spectral filters. We provide a careful and systematic analysis of the aberrations arising out of phase SLMs for the purposes of spatially varying spectral modulation. The analysis naturally leads us to a set of "good patterns" that minimize the optical aberrations. We then train a deep network that overcomes any residual aberrations, thereby achieving ideal spectral modulation at high spatial resolution. We show a number of unique operating points with our prototype including dynamic spectral filtering, material classification, and single- and multi-image hyperspectral imaging.
Flash illumination is widely used for imaging in low-light environments. However, illumination intensity falls off quadratically with propagation distance, which poses significant challenges for long-distance flash imaging. We propose a new flash technique, named "patterned flash", for long-distance flash imaging. Patterned flash concentrates optical power into a dot array. In contrast to the conventional uniform flash, where the signal is drowned by noise everywhere, patterned flash provides stronger signals at sparsely distributed points across the field of view, ensuring that the signals at these points stand out from the sensor noise. This enables post-processing to resolve important objects and details. Additionally, patterned flash projects texture onto the scene, so it can also be treated as a structured-light system for depth perception. Given the novel system, we develop a joint image reconstruction and depth estimation algorithm using a convolutional neural network. We build a hardware prototype and test the proposed flash technique on various scenes. Experimental results demonstrate that our patterned flash performs significantly better at long distances in low-light environments.
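The quantitative intuition behind patterned flash follows directly from the inverse-square falloff: for a fixed optical power budget, concentrating the light into fewer points multiplies the per-point signal, and hence the per-point SNR, by the concentration factor. A toy calculation with entirely hypothetical numbers:

```python
# Toy SNR model for flash photography at distance d (all numbers hypothetical).
P_total = 1.0          # total optical power budget (arbitrary units)
d = 10.0               # propagation distance
noise_sigma = 1e-3     # fixed sensor noise floor
n_pixels = 10_000      # points illuminated by a uniform flash
n_dots = 100           # patterned flash concentrates power into sparse dots

# Irradiance falls off as 1/d^2 in both cases; only the per-point power differs.
signal_uniform = (P_total / n_pixels) / d**2
signal_dot = (P_total / n_dots) / d**2     # 100x more power per illuminated dot

snr_uniform = signal_uniform / noise_sigma
snr_dot = signal_dot / noise_sigma
print(snr_uniform, snr_dot)                # per-point SNR gain = n_pixels / n_dots
```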
We present a compact snapshot monocular depth estimation technique that relies on an engineered point spread function (PSF). Traditional approaches used in microscopic super-resolution imaging, such as the double-helix PSF (DHPSF), are ill-suited for scenes more complex than a sparse set of point light sources. We show, using the Cramér-Rao lower bound (CRLB), that separating the two lobes of the DHPSF, and thereby capturing two separate images, leads to a dramatic increase in depth accuracy. A unique property of the phase mask used to generate the DHPSF is that dividing the phase mask into two halves results in the spatial separation of the two lobes. We leverage this property to build a compact polarization-based optical setup, where we place two orthogonal linear polarizers on each half of the DHPSF phase mask and then capture the resulting image with a polarization-sensitive camera. Results from simulations and a lab prototype demonstrate that our technique achieves up to 50% lower depth error compared to state-of-the-art designs, including the DHPSF and the Tetrapod PSF, with little to no loss in spatial resolution.
Video reconstruction from a single motion-blurred image is a challenging problem, which can enhance the capabilities of existing cameras. Recently, several works have addressed this task using conventional imaging and deep learning. Yet, such purely digital methods are inherently limited, due to direction ambiguity and noise sensitivity. Some works proposed to address these limitations using non-conventional image sensors; however, such sensors are extremely rare and expensive. To circumvent these limitations with a simpler approach, we propose a hybrid optical-digital method for video reconstruction that requires only simple modifications to an existing optical system. During image acquisition, a learned dynamic phase coding in the lens aperture is used to encode the motion trajectories, which serve as prior information for the video reconstruction process. Using an image-to-video convolutional neural network, the proposed computational camera generates a sharp frame burst of the scene at various frame rates from the coded motion-blurred image. We present advantages and improved performance compared to existing methods, using both simulations and a real-world camera prototype.
Computational optical imaging (COI) systems leverage optical coding elements (CEs) in their setup to encode a high-dimensional scene in a single or in multiple snapshots, and then decode it using computational algorithms. The performance of a COI system depends largely on the design of its main components: the CE pattern and the computational method used to perform a given task. Conventional approaches rely on random patterns or analytical designs to set the distribution of the CE. However, the available data and algorithmic capabilities of deep neural networks (DNNs) have opened new horizons in the data-driven design of CEs, which jointly considers the optical encoder and the computational decoder. Specifically, by modeling the COI measurements through a fully differentiable image formation model that accounts for the physics-based propagation of light and its interaction with the CEs, the parameters that define the CE and the computational decoder can be optimized in an end-to-end (E2E) manner. Moreover, by optimizing only the CE within the same framework, inference tasks can be performed from pure optics. This work surveys recent advances in the data-driven design of CEs and provides guidelines on how to parameterize different optical elements so as to include them in the E2E framework. Since the E2E framework can handle different inference applications by changing the loss function and the DNN, we present low-level tasks such as spectral imaging reconstruction, and high-level tasks such as privacy-preserving pose estimation with task-based optical architectures. Finally, we illustrate classification and 3D object recognition applications performed at the speed of light using all-optical DNNs.
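The E2E idea can be caricatured with a fully linear toy: descend jointly on an "optical encoder" matrix and a "computational decoder" matrix through a differentiable measurement model, on data with hidden low-dimensional structure. Everything here (sizes, data model, learning rate, the purely linear stand-ins for the optics and the DNN) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(4)

# Scenes lie in a hypothetical k-dimensional subspace of an n-pixel space.
n, m, k, N = 32, 8, 4, 256
B = rng.standard_normal((n, k))          # hidden structure of the scene class
X = B @ rng.standard_normal((k, N))      # training scenes (columns)

W = rng.standard_normal((m, n)) * 0.1    # encoder: the optical coding element
D = rng.standard_normal((n, m)) * 0.1    # decoder: the computational stage

lr = 1e-3
for _ in range(3000):
    R = D @ (W @ X) - X                  # reconstruction residual
    gD = 2 * R @ (W @ X).T / N           # dL/dD for L = ||R||_F^2 / N
    gW = 2 * D.T @ R @ X.T / N           # dL/dW: gradient through the "optics"
    D -= lr * gD
    W -= lr * gW

X_test = B @ rng.standard_normal((k, 64))
err = np.linalg.norm(D @ W @ X_test - X_test) / np.linalg.norm(X_test)
print(err)
```

Even though m < n (compressive optics), the jointly learned encoder/decoder pair adapts to the scene class, which is the essential mechanism the E2E framework exploits with real optics and DNN decoders.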
Foveated imaging provides a better tradeoff between situational awareness (field of view) and resolution and is critical in long-wavelength infrared regimes because of the size, weight, power, and cost of thermal sensors. We demonstrate computational foveated imaging by exploiting the ability of a meta-optical frontend to discriminate between different polarization states and a computational backend to reconstruct the captured image/video. The frontend is a three-element optic: the first element, which we call the "foveal" element, is a metalens that focuses s-polarized light at a distance of $f_1$ without affecting the p-polarized light; the second element, which we call the "perifoveal" element, is another metalens that focuses p-polarized light at a distance of $f_2$ without affecting the s-polarized light. The third element is a freely rotating polarizer that dynamically changes the mixing ratios between the two polarization states. Both the foveal element (focal length = 150mm; diameter = 75mm) and the perifoveal element (focal length = 25mm; diameter = 25mm) were fabricated as polarization-sensitive, all-silicon metasurfaces, resulting in a large-aperture, 1:6 foveal expansion, thermal imaging capability. A computational backend then utilizes a deep image prior to separate the resultant multiplexed image or video into a foveated image consisting of a high-resolution center and a lower-resolution large field of view context. We build a first-of-its-kind prototype system and demonstrate 12 frames per second real-time, thermal, foveated image and video capture in the wild.
Time-resolved image sensors that capture light at pico-to-nanosecond timescales were once limited to niche applications but are now rapidly becoming mainstream in consumer devices. We propose low-cost and low-power imaging modalities that capture scene information from minimal time-resolved image sensors with as few as one pixel. The key idea is to flood illuminate large scene patches (or the entire scene) with a pulsed light source and measure the time-resolved reflected light by integrating over the entire illuminated area. The one-dimensional measured temporal waveform, called \emph{transient}, encodes both distances and albedos at all visible scene points and as such is an aggregate proxy for the scene's 3D geometry. We explore the viability and limitations of the transient waveforms by themselves for recovering scene information, and also when combined with traditional RGB cameras. We show that plane estimation can be performed from a single transient and that using only a few more it is possible to recover a depth map of the whole scene. We also show two proof-of-concept hardware prototypes that demonstrate the feasibility of our approach for compact, mobile, and budget-limited applications.
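A single-pixel transient of the kind described above can be simulated as the aggregate of pulse returns from all visible points: each point contributes its (attenuated) albedo at its round-trip time. The bin width, the scene points, and the simple 1/d² attenuation model are all illustrative assumptions.

```python
import numpy as np

C = 3e8                          # speed of light (m/s)
bin_width = 1e-10                # 100 ps time bins (hypothetical sensor)
n_bins = 2000

# Hypothetical visible scene points: distances (m) and albedos.
dists = np.array([1.0, 1.5, 2.5])
albedos = np.array([0.8, 0.5, 0.9])

# The transient aggregates all returns: each point adds its albedo (attenuated
# here by a toy 1/d^2 factor) at round-trip time 2d/C.
transient = np.zeros(n_bins)
for d, a in zip(dists, albedos):
    t_bin = int(round(2 * d / C / bin_width))
    transient[t_bin] += a / d**2

# The nearest-surface distance falls out of the first nonzero bin.
first_bin = np.flatnonzero(transient)[0]
d_est = first_bin * bin_width * C / 2
print(d_est)
```

The depth resolution of this toy is set by the bin width (C * bin_width / 2 = 1.5 cm here), which is why the recovered nearest distance is only accurate to within a bin.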
Single-pixel imaging (SPI) is a novel imaging technique whose working principle is based on compressive sensing (CS) theory. In SPI, data is obtained through a series of compressive measurements, and the corresponding image is reconstructed. Typically, the reconstruction algorithms, such as basis pursuit, rely on a sparsity assumption in the images. However, recent advances in deep learning have found its use in reconstructing CS images. Despite showing promising results in simulations, it is often unclear how such algorithms can be implemented in an actual SPI setup. In this paper, we demonstrate the reconstruction of SPI images, as well as block compressive sensing (BCS) reconstruction. We also propose a novel reconstruction model based on convolutional neural networks that outperforms other competing CS reconstruction algorithms. Moreover, by incorporating BCS into our deep learning model, we are able to reconstruct images of any size beyond the training image size. In addition, we show that our model is able to reconstruct images obtained from an SPI setup while being trained on natural images, which can be quite different from SPI images. This opens up opportunities for the feasibility of deep learning models for CS reconstruction of images from various domains.
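For concreteness, here is a minimal sketch of SPI measurements plus a classical sparsity-based recovery (ISTA), i.e., the kind of baseline the proposed CNN replaces. The scene size, measurement count, sparsity level, and ISTA parameters are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Single-pixel imaging: each measurement y_i = <pattern_i, scene>.
n = 64                     # scene as a flattened 8x8 patch
m = 32                     # compressive: m < n measurements
A = rng.standard_normal((m, n)) / np.sqrt(m)

x_true = np.zeros(n)       # scene assumed sparse (4 bright pixels)
x_true[rng.choice(n, 4, replace=False)] = 1.0
y = A @ x_true

# ISTA: proximal gradient descent on 0.5*||Ax - y||^2 + lam*||x||_1.
def ista(A, y, lam=0.01, steps=500):
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        g = x - (A.T @ (A @ x - y)) / L    # gradient step on the data term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

x_hat = ista(A, y)
print(np.linalg.norm(x_hat - x_true))
```

Block compressive sensing, as used in the abstract, would simply apply such a recovery independently to fixed-size blocks, which is what decouples the reconstruction from the overall image size.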
Fourier ptychographic microscopy (FPM) is an imaging procedure that overcomes the traditional space-bandwidth product (SBP) limit of conventional microscopes through computational means. It utilizes multiple images captured using a low numerical aperture (NA) objective and achieves high-resolution phase imaging through frequency-domain stitching. Existing FPM reconstruction methods can be broadly categorized into two approaches: iterative optimization-based methods, which are grounded in the physics of the forward imaging model, and data-driven methods, which commonly employ a feed-forward deep learning framework. We propose a hybrid model-driven residual network that combines knowledge of the forward imaging system with a deep data-driven network. Our proposed architecture, LWGNet, unrolls the traditional Wirtinger flow optimization algorithm into a novel neural network design that enhances the gradient images through complex convolutional blocks. Unlike other conventional unrolling techniques, LWGNet uses fewer stages while performing at par with, or even better than, existing traditional and deep learning techniques, particularly for low-cost and low-dynamic-range CMOS sensors. This improvement in performance for low-bit-depth and low-cost sensors has the potential to bring down the cost of FPM imaging setups significantly. Finally, we show consistently improved performance on our collected real data.
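The iteration that LWGNet unrolls can be sketched on a tiny synthetic phase-retrieval problem: plain Wirtinger flow descends on the squared intensity residual, and the unrolled network replaces the plain gradient step with learned complex-convolutional blocks. All parameters here are illustrative, and the initialization near the solution sidesteps the spectral initialization used in practice.

```python
import numpy as np

rng = np.random.default_rng(2)

# Phase retrieval toy: recover complex z from intensities y_i = |a_i^H z|^2.
n, m = 16, 128
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
z_true = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = np.abs(A @ z_true) ** 2

z = z_true + 0.3 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
mu = 0.005                                 # step size (hand-tuned for this toy)
for _ in range(800):
    r = np.abs(A @ z) ** 2 - y             # intensity residual
    grad = A.conj().T @ (r * (A @ z)) / m  # Wirtinger gradient of the loss
    z = z - mu * grad

# Phase retrieval succeeds only up to a global phase: align before comparing.
phase = np.vdot(z, z_true) / abs(np.vdot(z, z_true))
err = np.linalg.norm(z * phase - z_true) / np.linalg.norm(z_true)
print(err)
```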
We present a novel single-shot interferometric ToF camera targeted for precise 3D measurements of dynamic objects. The camera concept is based on Synthetic Wavelength Interferometry, a technique that allows retrieval of depth maps of objects with optically rough surfaces at submillimeter depth precision. In contrast to conventional ToF cameras, our device uses only off-the-shelf CCD/CMOS detectors and works at their native chip resolution (as of today, theoretically up to 20 Mp and beyond). Moreover, we can obtain a full 3D model of the object in single-shot, meaning that no temporal sequence of exposures or temporal illumination modulation (such as amplitude or frequency modulation) is necessary, which makes our camera robust against object motion. In this paper, we introduce the novel camera concept and show first measurements that demonstrate the capabilities of our system. We present 3D measurements of small (cm-sized) objects with > 2 Mp point cloud resolution (the resolution of our used detector) and up to sub-mm depth precision. We also report a "single-shot 3D video" acquisition and a first single-shot "Non-Line-of-Sight" measurement. Our technique has great potential for high-precision applications with dynamic object movement, e.g., in AR/VR, industrial inspection, medical imaging, and imaging through scattering media like fog or human tissue.
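The core relation behind synthetic wavelength interferometry is that two close optical wavelengths produce a beat ("synthetic") wavelength Λ = λ₁λ₂/|λ₁ − λ₂| that is large enough to range optically rough surfaces without 2π ambiguity over Λ/2. A numeric sketch with hypothetical wavelengths:

```python
import numpy as np

lam1 = 854.0e-9            # wavelength 1 (m), hypothetical
lam2 = 855.0e-9            # wavelength 2 (m), hypothetical
LAM = lam1 * lam2 / abs(lam1 - lam2)   # synthetic wavelength (~0.73 mm)

depth_true = 0.12e-3       # surface height (m), within the unambiguous Lam/2 range

# Each wavelength's round-trip interferometric phase; their difference is the
# phase at the synthetic wavelength, from which depth is recovered.
phi1 = (4 * np.pi * depth_true / lam1) % (2 * np.pi)
phi2 = (4 * np.pi * depth_true / lam2) % (2 * np.pi)
dphi = (phi1 - phi2) % (2 * np.pi)
depth_est = dphi * LAM / (4 * np.pi)
print(LAM, depth_est)
```

With these numbers the individual optical phases wrap hundreds of times, but their difference varies slowly, which is exactly why sub-millimeter precision is attainable on rough surfaces.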
Indirect time-of-flight (iToF) cameras are a promising depth-sensing technology. However, they are prone to errors caused by multi-path interference (MPI) and low signal-to-noise ratio (SNR). Traditional methods, after denoising, mitigate MPI by estimating a transient image that encodes depths. Recently, data-driven methods that jointly denoise and mitigate MPI, without using an intermediate transient representation, have become state-of-the-art. In this paper, we propose to revisit the transient representation. Using data-driven priors, we interpolate/extrapolate iToF frequencies and use them to estimate the transient image. Given that direct ToF (dToF) sensors capture transient images, we name our method iToF2dToF. The transient representation is flexible. It can be integrated with rule-based depth-sensing algorithms, is robust to low SNR, and can deal with ambiguous scenarios that arise in practice (e.g., specular MPI, optical cross-talk). We demonstrate the benefits of iToF2dToF over previous methods in real depth-sensing scenarios.
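To ground the discussion of iToF frequencies: a standard single-frequency iToF pixel correlates the returning light against phase-shifted references (4-tap scheme), and depth follows from the recovered phase via d = cφ/(4πf). The modulation frequency and depth below are hypothetical; per-frequency measurements like these are what iToF2dToF interpolates/extrapolates.

```python
import numpy as np

C = 3e8
f_mod = 50e6                        # modulation frequency (Hz), hypothetical
depth_true = 2.0                    # meters, inside the unambiguous range

# Ideal noise-free 4-tap correlation measurements encode the round-trip phase.
phi = 4 * np.pi * f_mod * depth_true / C
taps = [np.cos(phi - psi) for psi in (0, np.pi / 2, np.pi, 3 * np.pi / 2)]

# Recover phase from the taps, then depth from phase.
phi_est = np.arctan2(taps[1] - taps[3], taps[0] - taps[2]) % (2 * np.pi)
depth_est = C * phi_est / (4 * np.pi * f_mod)

print(depth_est, C / (2 * f_mod))   # estimated depth and the unambiguous range (3 m)
```

Depths beyond c/(2f) wrap back to the same phase, which is one of the ambiguities (alongside MPI) that multi-frequency processing and transient representations are meant to resolve.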
There is growing interest in deploying non-line-of-sight (NLOS) imaging systems to recover objects behind an occluder. Existing solutions generally pre-calibrate the system before scanning the hidden objects; on-site adjustments of the occluder, object, or scanning pattern require re-calibration. We present an online calibration technique that directly decouples the acquired transients into LOS and hidden components. We use the former to directly (re)calibrate the system upon changes in the scene/occluder configuration, scanning region, and scanning pattern, while the latter is used to recover the hidden objects via spatial-, frequency-, or learning-based techniques. Our technique avoids using auxiliary calibration apparatus such as mirrors or checkerboards, and supports both laboratory validation and real-world deployment.
Time-of-flight (ToF) sensors provide an imaging modality that fuels applications including LiDAR in autonomous driving, robotics, and augmented reality. Conventional ToF imaging methods estimate depth by sending pulses of light into a scene and measuring the ToF of the first-arriving photons that are directly reflected from the scene surfaces without any temporal delay. As such, all photons following this first response are typically considered unwanted noise. In this paper, we depart from the principle of using first-arriving photons and propose an all-photon ToF imaging method by incorporating a temporal-polarimetric analysis of first- and late-arriving photons, which possess rich scene information about geometry and material. To this end, we propose a novel temporal-polarimetric reflectance model, an efficient capture method, and a reconstruction method that exploits the temporal-polarimetric changes of light reflected by surface and subsurface reflections. The proposed all-photon polarimetric ToF imaging method allows for acquiring depth, surface normals, and material parameters of a scene by utilizing all photons captured by the system, whereas conventional ToF imaging obtains only coarse depth from the first-arriving photons. We validate our method in simulation and experimentally with a prototype.
The ability to image high-dynamic-range (HDR) scenes is crucial in many computer vision applications. The dynamic range of conventional sensors, however, is fundamentally limited by their well capacity, resulting in saturation of bright scene parts. To overcome this limitation, emerging sensors offer in-pixel processing capabilities to encode the incident irradiance. Among the most promising encoding schemes is modulo wrapping, which results in a computational photography problem where the HDR scene is computed by an irradiance unwrapping algorithm from the wrapped low-dynamic-range (LDR) sensor image. Here, we design a neural-network-based algorithm that outperforms previous irradiance unwrapping methods and, more importantly, we design a perceptually inspired "mantissa" encoding scheme that more efficiently wraps an HDR scene into an LDR sensor. Combined with our reconstruction framework, MantissaCam achieves state-of-the-art results among modulo-type snapshot HDR imaging methods. We demonstrate the efficacy of our method in simulation and show preliminary results of a prototype MantissaCam implemented with a programmable sensor.
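The modulo-wrapping problem described above can be demonstrated in one dimension: a pixel whose accumulator resets at the well capacity records irradiance modulo the well depth, and a smooth signal can be unwrapped classically by re-accumulating the wrapped sample-to-sample differences. The signal, well depth, and sampling density are illustrative assumptions; the paper's neural unwrapper targets the cases where this classical approach fails.

```python
import numpy as np

# Modulo sensing: the pixel stores irradiance modulo the well capacity.
well = 1.0
x = np.linspace(0, 1, 200)
irradiance = 3.5 * np.sin(2 * np.pi * x) ** 2 + 0.2   # HDR signal, peak ~3.7
wrapped = irradiance % well                           # LDR modulo-sensor reading

# Classical unwrapping for a smooth signal: accumulate wrapped differences
# (valid while true sample-to-sample jumps stay below well/2).
d = np.diff(wrapped)
d -= well * np.round(d / well)        # map each difference back to (-well/2, well/2]
recovered = wrapped[0] + np.concatenate(([0.0], np.cumsum(d)))

print(np.max(np.abs(recovered - irradiance)))
```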
As an emerging technology that has attracted tremendous attention, non-line-of-sight (NLOS) imaging, which reconstructs hidden objects by analyzing the diffuse reflections on a relay surface, has broad application prospects in the fields of autonomous driving, medical imaging, and defense. Despite the challenges of low signal-to-noise ratio (SNR) and high ill-posedness, NLOS imaging has developed rapidly in recent years. Most current NLOS imaging techniques use conventional physical models, constructing imaging models with active or passive illumination and using reconstruction algorithms to recover the hidden scenes. In addition, deep learning algorithms for NLOS imaging have also received much attention recently. This paper presents a comprehensive overview of both conventional and deep-learning-based NLOS imaging techniques. Moreover, we survey newly proposed NLOS scenarios and discuss the challenges and prospects of existing techniques. Such a survey can help readers obtain an overview of the different types of NLOS imaging, thereby accelerating the development of seeing around corners.
Structured light (SL) systems acquire high-fidelity 3D geometry with active illumination projection. Conventional systems encounter challenges when working in environments with strong ambient illumination, global illumination, and cross-device interference. This paper proposes a general technique to improve the robustness of SL by projecting redundant optical signals in addition to the native SL patterns. In this way, the projected signals become more distinguishable from errors, so the geometric information can be recovered more easily using simple signal processing, yielding a "coding gain" in performance. We propose three applications using the redundant codes: (1) self error-correction for SL imaging under strong ambient light, (2) error detection for adaptive reconstruction under global illumination, and (3) interference filtering with device-specific projection sequence encoding, especially for event-camera-based SL and light-curtain devices. We systematically analyze the design rules and signal processing algorithms for these applications. Corresponding hardware prototypes are built for evaluation on real-world complex scenes. Experimental results on both synthetic and real data demonstrate the significant performance improvements of SL systems with redundant codes.
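A minimal stand-in for the redundant-code idea: alongside the n binary patterns that assign each projector column an n-bit code word, project one extra parity pattern so the decoder can flag pixels whose code word was corrupted (by ambient light, global illumination, or interference). The code construction and corruption model below are simplified assumptions, not the paper's actual scheme.

```python
import numpy as np

rng = np.random.default_rng(3)

# Binary structured light: column c receives the n-bit code word of integer c.
n_bits, n_cols = 6, 64
codes = ((np.arange(n_cols)[:, None] >> np.arange(n_bits)) & 1).astype(np.uint8)
parity = codes.sum(axis=1) % 2             # one redundant parity pattern

# Simulate decoding with a single corrupted bit at a few random pixels.
received = codes.copy()
bad = rng.choice(n_cols, 5, replace=False)
received[bad, rng.integers(0, n_bits, 5)] ^= 1

# The parity check flags exactly the pixels with a single-bit decoding error,
# so the reconstruction can discard or re-estimate them.
flagged = (received.sum(axis=1) % 2) != parity
print(np.flatnonzero(flagged))
```

One parity bit only detects odd numbers of bit flips; stronger redundant codes (more parity patterns) would also localize and correct errors, at the cost of more projected frames.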
Neural networks can represent and accurately reconstruct radiance fields for static 3D scenes (e.g., NeRF). Several works extend these capabilities to dynamic scenes captured with monocular video, with promising performance. However, the monocular setting is known to be an under-constrained problem, and so methods rely on data-driven priors for reconstructing dynamic content. We replace these priors with measurements from a time-of-flight (ToF) camera, and introduce a neural representation based on an image formation model for continuous-wave ToF cameras. Instead of working with processed depth maps, we model the raw ToF sensor measurements to improve reconstruction quality and avoid issues with low-reflectance regions, multi-path interference, and a sensor's limited unambiguous depth range. We show that this approach improves the robustness of dynamic scene reconstruction to erroneous calibration and large motions, and discuss the benefits and limitations of integrating RGB+ToF sensors that are now available on modern smartphones.
Multispectral imaging has been used for numerous applications in e.g., environmental monitoring, aerospace, defense, and biomedicine. Here, we present a diffractive optical network-based multispectral imaging system trained using deep learning to create a virtual spectral filter array at the output image field-of-view. This diffractive multispectral imager performs spatially-coherent imaging over a large spectrum, and at the same time, routes a pre-determined set of spectral channels onto an array of pixels at the output plane, converting a monochrome focal plane array or image sensor into a multispectral imaging device without any spectral filters or image recovery algorithms. Furthermore, the spectral responsivity of this diffractive multispectral imager is not sensitive to input polarization states. Through numerical simulations, we present different diffractive network designs that achieve snapshot multispectral imaging with 4, 9 and 16 unique spectral bands within the visible spectrum, based on passive spatially-structured diffractive surfaces, with a compact design that axially spans ~72 times the mean wavelength of the spectral band of interest. Moreover, we experimentally demonstrate a diffractive multispectral imager based on a 3D-printed diffractive network that creates at its output image plane a spatially-repeating virtual spectral filter array with 2x2=4 unique bands at terahertz spectrum. Due to their compact form factor and computation-free, power-efficient and polarization-insensitive forward operation, diffractive multispectral imagers can be transformative for various imaging and sensing applications and be used at different parts of the electromagnetic spectrum where high-density and wide-area multispectral pixel arrays are not widely available.