CNN-based surrogates have become prevalent in scientific applications as replacements for conventional, time-consuming physical approaches. Although these surrogates can yield satisfactory results at significantly lower computational cost on small training datasets, our benchmarking results show that data-loading overhead becomes the major performance bottleneck when training surrogates on large datasets. In practice, surrogates are usually trained on high-resolution scientific data, which can easily reach the terabyte scale. Several state-of-the-art data loaders have been proposed to improve loading throughput in general CNN training; however, they are sub-optimal when applied to surrogate training. In this work, we propose SOLAR, a surrogate data loader that increases loading throughput during training. It builds on three key observations from our benchmarking and contains three novel designs. Specifically, SOLAR first generates a pre-determined shuffled index list and accordingly optimizes the global access order and the buffer eviction scheme to maximize data reuse and the buffer hit rate. It then trades a lightweight computational imbalance against a heavyweight loading-workload imbalance to speed up overall training. Finally, it optimizes its data access pattern with HDF5 to achieve better parallel I/O throughput. Our evaluation with three scientific surrogates and 32 GPUs shows that SOLAR achieves up to a 24.4X speedup over the PyTorch Data Loader and a 3.52X speedup over state-of-the-art data loaders.
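The idea of fixing the shuffled index list ahead of time and using it to guide buffer eviction can be illustrated with a small simulation. This is only a sketch under stated assumptions, not SOLAR's implementation: the function names `epoch_orders` and `count_hits` are hypothetical, and the eviction rule shown (drop the sample whose next use is farthest in the future) is the classic offline-optimal policy, which becomes available once the full access order is known in advance.

```python
import random


def epoch_orders(num_samples, num_epochs, seed=0):
    """Pre-generate the shuffled sample order for every epoch (hypothetical helper)."""
    rng = random.Random(seed)
    orders = []
    for _ in range(num_epochs):
        order = list(range(num_samples))
        rng.shuffle(order)
        orders.append(order)
    return orders


def count_hits(access_sequence, buffer_size):
    """Count buffer hits when evicting the sample whose next use is farthest away."""
    if buffer_size <= 0:
        return 0
    # next_use[pos] = position of the next access to the same sample, or infinity.
    next_use = [float("inf")] * len(access_sequence)
    last_seen = {}
    for pos in range(len(access_sequence) - 1, -1, -1):
        sample = access_sequence[pos]
        next_use[pos] = last_seen.get(sample, float("inf"))
        last_seen[sample] = pos

    buffer = {}  # sample id -> position of its next use
    hits = 0
    for pos, sample in enumerate(access_sequence):
        if sample in buffer:
            hits += 1
        elif len(buffer) >= buffer_size:
            # Buffer is full: evict the sample that is needed last.
            victim = max(buffer, key=buffer.get)
            del buffer[victim]
        buffer[sample] = next_use[pos]
    return hits


if __name__ == "__main__":
    orders = epoch_orders(num_samples=1000, num_epochs=3, seed=0)
    sequence = [idx for order in orders for idx in order]
    print("hits with a 200-sample buffer:", count_hits(sequence, buffer_size=200))
```

Because the whole access sequence is known before training starts, the simulation can also be used to compare candidate epoch orderings by their hit counts.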
Translated by Google Translate
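Source code is essential for researchers to reproduce methods and replicate the results of artificial intelligence (AI) papers. Some organizations and researchers manually collect AI papers with available source code to contribute to the AI community. However, manual collection is a labor-intensive and time-consuming task. To address this issue, we propose a method that automatically identifies papers with available source code and extracts their source code repository URLs. With this method, we find that 20.5% of the regular papers published at 10 top AI conferences from 2010 to 2019 are identified as papers with available source code, and that 8.1% of these source code repositories are no longer accessible. We also create the XMU NLP Lab README Dataset, the largest dataset of labeled README files for source code documentation research. Through this dataset, we find that many README files provide no installation instructions or usage tutorials. In addition, a large-scale comprehensive statistical analysis is conducted to give a general picture of the source code of AI conference papers. The proposed solution can also go beyond AI conference papers to analyze other scientific papers from journals and conferences, shedding light on more domains.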
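The abstract does not describe how repository URLs are extracted, so the following is only an illustrative sketch of the general idea. The regular expression, the list of hosting sites, and the function name `extract_repo_urls` are all assumptions made for this example and are not taken from the paper.

```python
import re

# Hypothetical pattern: matches owner/repository URLs on a few common code hosts.
REPO_URL = re.compile(
    r"https?://(?:www\.)?(?:github\.com|gitlab\.com|bitbucket\.org)/[\w.-]+/[\w.-]+"
)


def extract_repo_urls(text):
    """Return the distinct repository URLs found in text, in order of appearance."""
    urls = []
    for match in REPO_URL.finditer(text):
        # Drop a sentence-ending period that the character class may have captured.
        url = match.group(0).rstrip(".")
        if url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    sample = "Data and code are available here: https://github.com/cogcomp/event-linking."
    print(extract_repo_urls(sample))
```

A real pipeline would also need to check whether each extracted URL still resolves, since the abstract reports that some repositories are no longer accessible.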
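Many payment platforms hold large-scale marketing campaigns that allocate incentives to encourage users to pay through their applications. To maximize the return on investment, incentive allocation is commonly solved in a two-stage procedure. After training a response estimation model to estimate users' mobile payment probabilities (MPP), a linear programming process is applied to obtain the optimal incentive allocation. However, the large amount of biased data in the training set, generated by the previous biased allocation policy, causes biased estimation. This bias deteriorates the performance of the response model and misleads the linear programming process, significantly degrading the performance of the resulting allocation policy. To overcome this obstacle, we propose a bias correction adversarial network. Our method leverages a small set of unbiased data, obtained under a fully randomized allocation policy, to train an unbiased model and then uses it to reduce the bias through adversarial learning. Offline and online experimental results show that our method outperforms state-of-the-art approaches and significantly improves the performance of the resulting allocation policy in a real-world marketing campaign.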
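This paper presents an end-to-end instance segmentation framework, termed SOIT, that Segments Objects with Instance-aware Transformers. Inspired by DETR~\cite{carion2020end}, our method views instance segmentation as a direct set prediction problem and effectively removes the need for many hand-crafted components such as RoI cropping, one-to-many label assignment, and non-maximum suppression (NMS). In SOIT, multiple queries are learned to directly reason about a set of object embeddings covering semantic category, bounding-box location, and pixel-wise mask under the global image context. The class and bounding box can be easily embedded by a fixed-length vector. The pixel-wise mask, in particular, is embedded by a group of parameters used to construct a lightweight instance-aware transformer. Afterward, the instance-aware transformer produces a full-resolution mask without involving any RoI-based operation. Overall, SOIT introduces a simple single-stage instance segmentation framework that is both RoI-free and NMS-free. Experimental results on the MS COCO dataset show that SOIT outperforms state-of-the-art instance segmentation approaches by a significant margin. Moreover, the joint learning of multiple tasks in a unified query embedding can also substantially improve detection performance. Code is available at \url{https://github.com/yuxiaodonghri/soit}.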
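Understanding an article requires understanding its constituent events. However, the context in which an event is mentioned often lacks the details of that event. Where, then, can we obtain more knowledge about this particular event beyond its context? This work defines Event Linking, a new natural language understanding task at the event level. Event linking tries to link an event mention, appearing in a news article for example, to the most appropriate Wikipedia page. That page is expected to provide rich knowledge about what the event refers to. To standardize research on this new problem, we make three contributions. First, this is the first work in the community that formally defines the event linking task. Second, we collect a dataset for this new task. Specifically, we first gather the training set automatically from Wikipedia and then create two evaluation sets: one from the Wikipedia domain, reporting in-domain performance, and the other from the real-world news domain, testing out-of-domain performance. Third, we propose Evelink, the first event linking approach. Overall, event linking is a considerably challenging task that requires more effort from the community. Data and code are available here: https://github.com/cogcomp/event-linking.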
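Motivation: A perennial challenge for biomedical researchers and clinical practitioners is to stay abreast of the rapid growth of publications and medical notes. Natural language processing (NLP) has emerged as a promising direction for taming information overload. In particular, large neural language models facilitate transfer learning by pretraining on unlabeled text, as exemplified by the success of BERT models in various NLP applications. However, fine-tuning such models for an end task remains challenging, especially with small labeled datasets, which are common in biomedical NLP. Results: We conduct a systematic study of fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains. Large models have the potential to attain better performance, but increasing model size also exacerbates fine-tuning instability. We therefore conduct a comprehensive exploration of techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing lower layers is helpful for standard BERT-BASE models, while layerwise decay is more effective for BERT-LARGE and ELECTRA models. For low-resource text similarity tasks such as BIOSSES, reinitializing the top layer is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate more robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications. Availability and implementation: To facilitate progress in biomedical NLP, we release our state-of-the-art pretrained and fine-tuned models: https://aka.ms/blurb.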
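Two of the stabilization techniques named above, layerwise learning-rate decay and freezing lower layers, can be sketched as per-layer learning-rate schedules. The abstract gives no formula, so the geometric decay used here is a common formulation assumed for illustration, and the function names and example values are hypothetical.

```python
def layerwise_learning_rates(num_layers, top_lr, decay):
    """Return one learning rate per layer; index 0 is the lowest layer.

    Assumed formulation: the top layer uses top_lr, and each layer below it
    uses the rate of the layer above multiplied by decay.
    """
    return [top_lr * decay ** (num_layers - 1 - layer) for layer in range(num_layers)]


def freeze_lower_layers(rates, num_frozen):
    """Set the learning rate of the lowest num_frozen layers to zero."""
    return [0.0 if layer < num_frozen else rate for layer, rate in enumerate(rates)]


if __name__ == "__main__":
    rates = layerwise_learning_rates(num_layers=12, top_lr=2e-5, decay=0.9)
    print("lowest layer:", rates[0], "top layer:", rates[-1])
    print("with 4 frozen layers:", freeze_lower_layers(rates, num_frozen=4)[:6])
```

In a training framework, these values would typically be passed as per-parameter-group learning rates to the optimizer.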
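Distributed Deep Learning (DDL) is essential for large-scale Deep Learning (DL) training. Synchronous Stochastic Gradient Descent (SSGD) is the de facto DDL optimization method. Using a sufficiently large batch size is critical to achieving DDL runtime speedup. In a large-batch setting, the learning rate must be increased to compensate for the reduced number of parameter updates. However, a large learning rate may harm convergence in SSGD, and training can easily diverge. Recently, Decentralized Parallel SGD (DPSGD) has been proposed to improve distributed training speed. In this paper, we find that DPSGD not only has a system-wise runtime benefit but also a significant convergence benefit over SSGD in the large-batch setting. Based on a detailed analysis of the DPSGD learning dynamics, we find that DPSGD introduces additional landscape-dependent noise that automatically adjusts the effective learning rate to improve convergence. In addition, we show theoretically that this noise smooths the loss landscape, hence allowing a larger learning rate. We conduct extensive studies over 18 state-of-the-art DL models/tasks and demonstrate that DPSGD often converges in cases where SSGD diverges for large learning rates in the large-batch setting. Our findings are consistent across two different application domains, Computer Vision (CIFAR10 and ImageNet-1K) and Automatic Speech Recognition (SWB300 and SWB2000), and across two different types of neural network models: Convolutional Neural Networks and Long Short-Term Memory Recurrent Neural Networks.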
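The difference between the two update rules can be sketched on scalar weights. This is a toy illustration under assumptions, not the paper's experimental setup: the ring topology with equal 1/3 mixing weights, the function names, and the example numbers are all chosen for the sketch. In SSGD every worker applies the averaged gradient to one shared weight; in DPSGD each worker keeps its own weight, averages it with its neighbors, and applies only its local gradient.

```python
def ssgd_step(weight, grads, lr):
    """Synchronous SGD: one shared weight, updated with the mean gradient."""
    return weight - lr * sum(grads) / len(grads)


def dpsgd_step(weights, grads, lr):
    """Decentralized parallel SGD on a ring (assumed topology).

    Each worker averages its weight with its left and right neighbors,
    then subtracts its own local gradient step.
    """
    n = len(weights)
    new_weights = []
    for i in range(n):
        mixed = (weights[i - 1] + weights[i] + weights[(i + 1) % n]) / 3.0
        new_weights.append(mixed - lr * grads[i])
    return new_weights


if __name__ == "__main__":
    weights = [1.0, 2.0, 3.0, 6.0]   # per-worker weights (toy values)
    grads = [0.5, -1.0, 2.0, 0.5]    # per-worker local gradients (toy values)
    print(dpsgd_step(weights, grads, lr=0.1))
    print(ssgd_step(sum(weights) / len(weights), grads, lr=0.1))
```

With this symmetric mixing, the average of the DPSGD weights moves exactly as an SSGD step would for the same gradients; the difference is that in DPSGD each gradient is evaluated at a worker's own local weight, and those local weights differ across workers.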
The lack of efficient segmentation methods and fully labeled datasets limits the comprehensive assessment of optical coherence tomography angiography (OCTA) microstructures such as the retinal vessel network (RVN) and the foveal avascular zone (FAZ), which are of great value in the evaluation of ophthalmic and systemic diseases. Here, we introduce an OCTA microstructure segmentation network (OMSN) that combines an encoder-decoder architecture with multi-scale skip connections and the split-attention-based residual network ResNeSt, paying specific attention to OCTA microstructural features while facilitating better model convergence and feature representations. The proposed OMSN achieves excellent single-task and multi-task performance for RVN and/or FAZ segmentation. In particular, multi-task models outperform single-task models on the evaluation metrics for the same dataset. On this basis, a fully annotated retinal OCTA segmentation (FAROS) dataset is constructed semi-automatically, filling the gap left by the absence of a pixel-level, fully labeled OCTA dataset. The OMSN multi-task segmentation model retrained with FAROS further confirms its accuracy for simultaneous RVN and FAZ segmentation.
We propose Monte Carlo Nonlocal physics-informed neural networks (MC-Nonlocal-PINNs), a generalization of MC-fPINNs \cite{guo2022monte}, for solving general nonlocal models such as integral equations and nonlocal PDEs. As in MC-fPINNs, our MC-Nonlocal-PINNs handle the nonlocal operators in a Monte Carlo way, resulting in a very stable approach for high-dimensional problems. We present a variety of test problems, including high-dimensional Volterra-type integral equations, hypersingular integral equations, and nonlocal PDEs, to demonstrate the effectiveness of our approach.
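Handling a nonlocal operator "in a Monte Carlo way" can be illustrated with the simplest case: estimating a one-dimensional Volterra-type integral of the form \int_0^x k(x, t) u(t) dt by uniform sampling. This is only a sketch of plain Monte Carlo integration under assumptions, not the MC-Nonlocal-PINNs method itself; the function name, the uniform sampling scheme, and the test kernel are hypothetical.

```python
import random


def mc_volterra_integral(kernel, u, x, num_samples=10000, seed=0):
    """Estimate the integral of kernel(x, t) * u(t) for t in [0, x].

    Uses uniform samples on [0, x]; the estimate is the interval length
    times the sample mean of the integrand.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        t = rng.uniform(0.0, x)
        total += kernel(x, t) * u(t)
    return x * total / num_samples


if __name__ == "__main__":
    # With k(x, t) = 1 and u(t) = t, the exact value at x = 1 is 0.5.
    estimate = mc_volterra_integral(lambda x, t: 1.0, lambda t: t, x=1.0)
    print("estimate:", estimate, "exact:", 0.5)
```

In a physics-informed setting, an estimate of this kind would replace the exact nonlocal term inside the residual that the network is trained to minimize.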
Blind watermarking provides powerful evidence for copyright protection, image authentication, and tampering identification. However, it remains a challenge to design a watermarking model with high imperceptibility and robustness against strong noise attacks. To resolve this issue, we present a framework Combining the Invertible and Non-invertible (CIN) mechanisms. CIN is composed of an invertible part to achieve high imperceptibility and a non-invertible part to strengthen robustness against strong noise attacks. For the invertible part, we develop a diffusion and extraction module (DEM) and a fusion and split module (FSM) to embed and extract watermarks symmetrically in an invertible way. For the non-invertible part, we introduce a non-invertible attention-based module (NIAM) and a noise-specific selection module (NSM) to handle asymmetric extraction under a strong noise attack. Extensive experiments demonstrate that our framework significantly outperforms current state-of-the-art methods in imperceptibility and robustness. Our framework achieves an average of 99.99% accuracy and 67.66 dB PSNR under noise-free conditions, and 96.64% accuracy and 39.28 dB PSNR under combined strong noise attacks. The code will be available at https://github.com/rmpku/CIN.
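The two reported metrics can be made concrete with a short sketch. PSNR is computed with its standard definition, 10 * log10(MAX^2 / MSE); treating the reported accuracy as the fraction of correctly recovered watermark bits is an assumption of this example, as are the function names and the 8-bit pixel range.

```python
import math


def psnr(original, watermarked, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def bit_accuracy(embedded_bits, extracted_bits):
    """Fraction of watermark bits recovered correctly (assumed accuracy metric)."""
    matches = sum(1 for a, b in zip(embedded_bits, extracted_bits) if a == b)
    return matches / len(embedded_bits)


if __name__ == "__main__":
    cover = [10, 20, 30, 40]
    stego = [11, 21, 31, 41]  # every pixel changed by 1
    print("PSNR (dB):", round(psnr(cover, stego), 2))
    print("bit accuracy:", bit_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Higher PSNR means the watermarked image is closer to the original, which is how imperceptibility is usually quantified.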