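Climate change poses new challenges to crop-related concerns, including food insecurity, supply stability, and economic planning. As one of the central challenges, crop yield prediction has become a pressing task in machine learning. Despite its importance, the prediction task is exceptionally complicated because crop yields depend on many factors, such as weather, land surface, and soil quality, as well as their interactions. In recent years, machine learning models have been applied successfully in this domain. However, these models either restrict their task to a relatively small region or study only a single year or a few years, which makes them hard to generalize across space and time. In this paper, we introduce a novel graph-based recurrent neural network for crop yield prediction that incorporates both geographical and temporal knowledge into the model to further boost predictive power. Our method is trained, validated, and tested on more than 2,000 counties from 41 states in the US mainland, covering the years 1981 to 2019. To our knowledge, this is the first machine learning method that embeds geographical knowledge in crop yield prediction and predicts county-level crop yields nationwide. We also lay a solid foundation for comparison with other machine learning baselines by applying well-known linear models, tree-based models, and deep learning methods and comparing their performance. Experiments show that our proposed method consistently outperforms existing state-of-the-art methods on various metrics, validating the effectiveness of geospatial and temporal information.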
translated by Google Translate
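Accurate prediction of crop yield before harvest is of great importance for crop logistics, market planning, and food distribution around the world. Yield prediction requires monitoring phenological and climatic characteristics over extended time periods to model the complex relations involved in crop development. Remote sensing imagery provided by the various satellites orbiting the Earth is a cheap and reliable way to obtain data for yield prediction. The field of yield prediction is currently dominated by deep learning approaches. Although the accuracies reached with these approaches are promising, the amount of data they need and their "black-box" nature can restrict their application. These limitations can be overcome by a pipeline that processes remote sensing images into feature-based representations, which allow yield prediction with Extreme Gradient Boosting (XGBoost). A comparative evaluation of soybean yield prediction in the United States shows promising prediction accuracy compared with state-of-the-art deep-learning-based yield prediction systems. Feature importance analysis identifies the near-infrared spectrum as an important feature in our model. The reported results hint at the capability of XGBoost for yield prediction and encourage future experiments with XGBoost on other crops in regions around the world.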
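Population-level societal events, such as civil unrest and crime, often have a significant impact on our daily lives. Forecasting such events is of great importance for decision-making and resource allocation. Event prediction has traditionally been challenging because of the lack of knowledge about the true causes and underlying mechanisms of event occurrence. In recent years, research on event forecasting has made significant progress for two main reasons: (1) the development of machine learning and deep learning algorithms and (2) the accessibility of public data such as social media, news sources, blogs, economic indicators, and other metadata sources. The explosive growth of data, together with advances in software and hardware technologies, has led to applications of deep learning techniques in societal event studies. This paper provides a systematic and comprehensive overview of deep learning technologies for societal event prediction. We focus on two domains of societal events: civil unrest and crime. We first introduce how event forecasting problems are formulated as a machine learning prediction task. Then, we summarize data resources, traditional methods, and recent developments of deep learning models for these problems. Finally, we discuss the challenges in societal event forecasting and put forward some promising directions for future research.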
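Monitoring vegetation productivity at extremely fine resolutions is valuable for real-world agricultural applications, such as detecting crop stress and providing early warning of food insecurity. Solar-Induced Chlorophyll Fluorescence (SIF) provides a promising way to measure plant productivity directly from space. However, satellite SIF observations are only available at a coarse spatial resolution, which makes it impossible to monitor how individual crop types or farms are doing. This poses a challenging coarsely-supervised regression (or downscaling) task: at training time, we only have SIF labels at a coarse resolution (3 km), yet we want to predict SIF at much finer spatial resolutions (e.g. 30 m, a 100x increase). We also have additional fine-resolution input features, but the relationship between these features and SIF is unknown. To address this, we propose Coarsely-Supervised Smooth U-Net (CS-SUNet), a novel method for this coarse supervision setting. CS-SUNet combines the expressive power of deep convolutional networks with novel regularization methods based on prior knowledge (such as a smoothness loss) that are crucial for preventing overfitting. Experiments show that CS-SUNet resolves fine-grained variations in SIF more accurately than existing methods.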
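The coarse supervision setting described above can be illustrated with a small loss function. The sketch below is not the CS-SUNet implementation. It assumes, hypothetically, that a coarse label should equal the mean of the fine-resolution predictions inside its tile, and it uses a simple neighbour-difference penalty as a stand-in for a smoothness loss. The names `coarse_supervised_loss` and `smooth_weight` are illustrative only.

```python
from typing import List

Grid = List[List[float]]


def coarse_supervised_loss(fine_pred: Grid, coarse_label: float,
                           smooth_weight: float = 0.1) -> float:
    """Loss for one coarse tile given fine-resolution predictions (a sketch)."""
    rows, cols = len(fine_pred), len(fine_pred[0])

    # Coarse term (assumption): the tile mean of the fine predictions
    # should match the single coarse-resolution label.
    tile_mean = sum(sum(row) for row in fine_pred) / (rows * cols)
    coarse_term = (tile_mean - coarse_label) ** 2

    # Smoothness term (assumption): mean squared difference between
    # horizontally and vertically adjacent fine pixels.
    smooth_sum = 0.0
    pairs = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                smooth_sum += (fine_pred[r][c] - fine_pred[r][c + 1]) ** 2
                pairs += 1
            if r + 1 < rows:
                smooth_sum += (fine_pred[r][c] - fine_pred[r + 1][c]) ** 2
                pairs += 1
    smooth_term = smooth_sum / pairs if pairs else 0.0

    return coarse_term + smooth_weight * smooth_term


flat = [[0.5, 0.5], [0.5, 0.5]]    # matches the label and is perfectly smooth
rough = [[0.0, 1.0], [1.0, 0.0]]   # matches the label on average but is not smooth
print(coarse_supervised_loss(flat, 0.5))
print(coarse_supervised_loss(rough, 0.5))
```

Both tiles reproduce the coarse label on average, so only the smoothness penalty separates them. This shows why some form of prior knowledge is needed when the labels alone cannot distinguish between fine-resolution predictions.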
Machine learning methods have seen increased application to geospatial environmental problems, such as precipitation nowcasting, haze forecasting, and crop yield prediction. However, many of the machine learning methods applied to mosquito population and disease forecasting do not inherently take into account the underlying spatial structure of the given data. In our work, we apply a spatially aware graph neural network model consisting of GraphSAGE layers to forecast the presence of West Nile virus in Illinois, to aid mosquito surveillance and abatement efforts within the state. More generally, we show that graph neural networks applied to irregularly sampled geospatial data can exceed the performance of a range of baseline methods including logistic regression, XGBoost, and fully-connected neural networks.
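Reliable and efficient representation of multivariate time series is crucial in various downstream machine learning tasks. In multivariate time series forecasting, each variable depends on its historical values, and there are inter-dependencies among variables as well. Models have to be designed to capture both intra- and inter-relationships among the time series. To move towards this goal, we propose the Time Series Attention Transformer (TSAT) for multivariate time series representation learning. Using TSAT, we represent both the temporal information and the inter-dependencies of multivariate time series in terms of edge-enhanced dynamic graphs. The intra-series correlations are represented by nodes in a dynamic graph. A self-attention mechanism is modified to capture the inter-series correlations by using the super-empirical mode decomposition (SMD) module. We apply the embedded dynamic graphs to time series forecasting problems, including two real-world datasets and two benchmark datasets. Extensive experiments show that TSAT clearly outperforms six state-of-the-art baseline methods across various forecasting horizons. We further visualize the embedded dynamic graphs to illustrate the graph representation power of TSAT. The code at https://github.com/radiantresearch/tsat is shared publicly.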
Stock market prediction is a traditional yet complex problem, studied across diverse research areas and application domains because of its non-linear, highly volatile and complex nature. Existing surveys on stock market prediction often focus on traditional machine learning methods instead of deep learning methods. Deep learning has dominated many domains and, in recent years, has gained much success and popularity in stock market prediction. This motivates us to provide a structured and comprehensive overview of the research on stock market prediction focusing on deep learning techniques. We present four elaborated subtasks of stock market prediction and propose a novel taxonomy to summarize the state-of-the-art models based on deep neural networks from 2011 to 2022. In addition, we provide detailed statistics on the datasets and evaluation metrics commonly used in the stock market. Finally, we highlight some open issues and point out several future directions by sharing some new perspectives on stock market prediction.
Wind power forecasting helps with the planning for the power systems by contributing to having a higher level of certainty in decision-making. Due to the randomness inherent to meteorological events (e.g., wind speeds), making highly accurate long-term predictions for wind power can be extremely difficult. One approach to remedy this challenge is to utilize weather information from multiple points across a geographical grid to obtain a holistic view of the wind patterns, along with temporal information from the previous power outputs of the wind farms. Our proposed CNN-RNN architecture combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to extract spatial and temporal information from multi-dimensional input data to make day-ahead predictions. In this regard, our method incorporates an ultra-wide learning view, combining data from multiple numerical weather prediction models, wind farms, and geographical locations. Additionally, we experiment with global forecasting approaches to understand the impact of training the same model over the datasets obtained from multiple different wind farms, and we employ a method where spatial information extracted from convolutional layers is passed to a tree ensemble (e.g., Light Gradient Boosting Machine (LGBM)) instead of fully connected layers. The results show that our proposed CNN-RNN architecture outperforms other models such as LGBM, Extra Tree regressor and linear regression when trained globally, but fails to replicate such performance when trained individually on each farm. We also observe that passing the spatial information from CNN to LGBM improves its performance, providing further evidence of CNN's spatial feature extraction capabilities.
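For power grid operation, accurate solar power forecasts with fine temporal and spatial resolution are essential. However, state-of-the-art approaches that combine machine learning with numerical weather predictions (NWP) have coarse resolution. In this paper, we take a graph signal processing perspective and model multi-site photovoltaic (PV) production time series as signals on a graph to capture their spatio-temporal dependencies and achieve forecasts with higher spatial and temporal resolution. We present two novel graph neural network models for deterministic multi-site PV forecasting, dubbed the graph-convolutional long short-term memory (GCLSTM) and the graph-convolutional transformer (GCTrafo) models. These methods rely solely on production data and exploit the intuition that PV systems provide a dense network of virtual weather stations. The proposed methods are evaluated on two data sets covering an entire year: 1) production data from 304 real PV systems, and 2) simulated production of 1000 PV systems, both distributed over Switzerland. The proposed models outperform state-of-the-art multi-site forecasting methods for prediction horizons of six hours ahead. Furthermore, the proposed models outperform state-of-the-art single-site methods with NWP as inputs on horizons up to four hours ahead.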
As ride-hailing services become increasingly popular, accurately predicting demand for such services can help operators allocate drivers to customers efficiently, reduce idle time, ease congestion, and enhance the passenger experience. This paper proposes UberNet, a deep learning Convolutional Neural Network for short-term prediction of demand for ride-hailing services. UberNet employs a multivariate framework that utilises a number of temporal and spatial features that have been found in the literature to explain demand for ride-hailing services. The proposed model includes two sub-networks that aim to encode the source series of various features and decode the predicting series, respectively. To assess the performance and effectiveness of UberNet, we use 9 months of Uber pickup data in 2014 and 28 spatial and temporal features from New York City. By comparing the performance of UberNet with several other approaches, we show that the prediction quality of the model is highly competitive. Further, UberNet's prediction performance is better when using economic, social and built environment features. This suggests that UberNet is more naturally suited to including complex motivators in making real-time passenger demand predictions for ride-hailing services.
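Traffic forecasting is a problem in intelligent transportation systems (ITS) and is crucial for individuals and public agencies. Research has therefore paid great attention to modeling the complex spatio-temporal dependencies of traffic systems for accurate forecasting. However, two challenges remain: 1) most traffic forecasting studies focus on modeling the correlations of neighboring sensors and ignore the correlations of remote sensors, e.g. business districts with similar spatio-temporal patterns; 2) existing methods that use a static adjacency matrix in graph convolutional networks (GCNs) are not sufficient to reflect the dynamic spatial dependence in traffic systems. Moreover, fine-grained methods that use self-attention to model the dynamic correlations of all sensors ignore the hierarchical information in road networks and have quadratic computational complexity. In this paper, we propose a novel dynamic multi-graph convolution recurrent network (DMGCRN) to tackle these issues; it models the spatial correlations of distance, the spatial correlations of structure, and the temporal correlations simultaneously. We not only use a distance-based graph to capture spatial information from nodes that are close in distance, but also construct a novel latent graph, which encodes the structural correlations among roads, to capture spatial information from nodes that are similar in structure. Furthermore, we divide the neighbors of each sensor into coarse-grained regions and dynamically assign different weights to each region at different times. Meanwhile, we integrate the dynamic multi-graph convolution network into the gated recurrent unit (GRU) to capture temporal dependence. Extensive experiments on three real-world traffic datasets demonstrate that our proposed algorithm outperforms state-of-the-art baselines.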
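The distance-based graph mentioned above can be made concrete with a generic example. The sketch below is not DMGCRN: it has no learned weights, no latent or dynamic graph, and no GRU. It only shows one common way to turn pairwise sensor distances into a static adjacency matrix (a thresholded Gaussian kernel, assumed here) and to apply a single neighbourhood-averaging step. All function names and the values of `sigma` and `threshold` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def gaussian_adjacency(dist: Matrix, sigma: float, threshold: float) -> Matrix:
    """Edge weight exp(-d^2 / sigma^2), set to zero when below the threshold."""
    n = len(dist)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            weight = math.exp(-(dist[i][j] ** 2) / (sigma ** 2))
            adj[i][j] = weight if weight >= threshold else 0.0
    return adj


def row_normalize(adj: Matrix) -> Matrix:
    """Scale each row to sum to one so that propagation is a weighted average."""
    normalized = []
    for row in adj:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def graph_conv(adj_norm: Matrix, features: Matrix) -> Matrix:
    """One propagation step: each node averages its neighbours' features."""
    n, d = len(adj_norm), len(features[0])
    return [[sum(adj_norm[i][k] * features[k][j] for k in range(n))
             for j in range(d)] for i in range(n)]


# Sensors 0 and 1 are close together; sensor 2 is far from both.
distances = [[0.0, 1.0, 10.0],
             [1.0, 0.0, 10.0],
             [10.0, 10.0, 0.0]]
speeds = [[1.0], [3.0], [10.0]]  # one feature per sensor

adj_norm = row_normalize(gaussian_adjacency(distances, sigma=1.0, threshold=0.1))
print(graph_conv(adj_norm, speeds))
```

In this toy graph the remote sensor exchanges no information with the other two, however similar its traffic pattern may be. That is the limitation of purely distance-based graphs that the abstract points out.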
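Multivariate time series (MTS) forecasting, which analyzes historical time series to predict future trends, can effectively support decision-making. The complex relations among variables in MTS, including static, dynamic, predictable, and latent relations, make it possible to mine more features of MTS. Modeling these complex relations is essential for characterizing latent dependencies and temporal dependence, but it also brings great challenges to the MTS forecasting task. However, existing methods mainly focus on modeling certain relations among MTS variables. In this paper, we propose a novel end-to-end deep learning model, termed Multivariate Time Series Forecasting via Heterogeneous Graph Neural Networks (MTHetGNN). To characterize the complex relations among variables, a relation embedding module is designed in MTHetGNN, where each variable is regarded as a graph node and each type of edge represents a specific static or dynamic relationship. Meanwhile, a temporal embedding module is introduced for time series feature extraction, involving convolutional neural network (CNN) filters with different perception scales. Finally, a heterogeneous graph embedding module is adopted to handle the complex structural information generated by the two modules. Three real-world benchmark datasets are used to evaluate the proposed MTHetGNN. Comprehensive experiments show that MTHetGNN achieves state-of-the-art results in the MTS forecasting task.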
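Progress toward the United Nations Sustainable Development Goals (SDGs) has been hindered by a lack of data on key environmental and socioeconomic indicators, which historically have come from ground surveys with sparse temporal and spatial coverage. Recent advances in machine learning have made it possible to use abundant, frequently updated, and globally available data, such as from satellites or social media, to provide insights into progress toward the SDGs. Despite promising early results, approaches that use such data for SDG measurement have so far largely been evaluated on different datasets or with inconsistent evaluation metrics, making it hard to understand whether performance is improving and where additional research would be most fruitful. Furthermore, processing satellite and ground survey data requires domain knowledge that many in the machine learning community lack. In this paper, we introduce SustainBench, a collection of 15 benchmark tasks across 7 SDGs, including tasks related to economic development, agriculture, health, education, water and sanitation, climate action, and life on land. Datasets for 11 of the 15 tasks are released publicly for the first time. Our goals for SustainBench are to (1) lower the barriers to entry for the machine learning community to contribute to measuring and achieving the SDGs; (2) provide standard benchmarks for evaluating machine learning models on tasks across a variety of SDGs; and (3) encourage the development of novel machine learning methods where improved model performance facilitates progress towards the SDGs.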
We introduce a machine-learning (ML)-based weather simulator--called "GraphCast"--which outperforms the most accurate deterministic operational medium-range weather forecasting system in the world, as well as all previous ML baselines. GraphCast is an autoregressive model, based on graph neural networks and a novel high-resolution multi-scale mesh representation, which we trained on historical weather data from the European Centre for Medium-Range Weather Forecasts (ECMWF)'s ERA5 reanalysis archive. It can make 10-day forecasts, at 6-hour time intervals, of five surface variables and six atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree latitude-longitude grid, which corresponds to roughly 25 x 25 kilometer resolution at the equator. Our results show GraphCast is more accurate than ECMWF's deterministic operational forecasting system, HRES, on 90.0% of the 2760 variable and lead time combinations we evaluated. GraphCast also outperforms the most accurate previous ML-based weather forecasting model on 99.2% of the 252 targets it reported. GraphCast can generate a 10-day forecast (35 gigabytes of data) in under 60 seconds on Cloud TPU v4 hardware. Unlike traditional forecasting methods, ML-based forecasting scales well with data: by training on bigger, higher quality, and more recent data, the skill of the forecasts can improve. Together these results represent a key step forward in complementing and improving weather modeling with ML, open new opportunities for fast, accurate forecasting, and help realize the promise of ML-based simulation in the physical sciences.
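The figures quoted above imply a few derived quantities that are easy to check. The short calculation below is a sketch under two assumptions that are not stated in the abstract: that "each at 37 vertical pressure levels" applies to the six atmospheric variables only, and that the Earth's equatorial circumference is about 40,075 km. The function names are illustrative.

```python
def forecast_steps(days: int, step_hours: int) -> int:
    """Number of autoregressive steps needed to cover the forecast window."""
    return days * 24 // step_hours


def variables_per_grid_point(surface: int, atmospheric: int, levels: int) -> int:
    """Surface variables plus one value per atmospheric variable and level."""
    return surface + atmospheric * levels


def cell_width_km_at_equator(step_degrees: float,
                             equator_km: float = 40075.0) -> float:
    """East-west width of one grid cell at the equator (assumed circumference)."""
    return equator_km / 360.0 * step_degrees


print(forecast_steps(10, 6))                      # 40
print(variables_per_grid_point(5, 6, 37))         # 227
print(round(cell_width_km_at_equator(0.25), 1))   # 27.8
```

Under these assumptions a 10-day forecast takes 40 model steps, each grid point carries 227 values, and a 0.25-degree cell is a little under 28 km wide at the equator, which is in line with the abstract's rough figure.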
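Weather forecasting is an attractive yet challenging task because of its influence on human life and the complexity of atmospheric motion. Supported by massive amounts of historical observed time series data, the task is well suited to data-driven approaches, especially deep neural networks. Recently, methods based on graph neural networks (GNNs) have achieved excellent performance in spatio-temporal forecasting. However, canonical GNN-based methods model either the local graph of meteorological variables at each station or the global graph of all stations, and thus lack information interaction between meteorological variables at different stations. In this paper, we propose a novel Hierarchical Spatio-Temporal Graph Neural Network (HiSTGNN) to model cross-regional spatio-temporal correlations among meteorological variables at multiple stations. An adaptive graph learning layer and spatial graph convolution are employed to construct a self-learning graph and to study hidden dependencies among the nodes of the variable-level and station-level graphs. To capture temporal patterns, a dilated inception module, serving as the backbone of the gated temporal convolution, is designed to model long and varied meteorological trends. Moreover, dynamic interaction learning is proposed to build bidirectional information passing in the hierarchical graph. Experimental results on three real-world meteorological datasets demonstrate that HiSTGNN outperforms 7 baselines and reduces errors by 4.2% to 11.6%, especially compared with the state-of-the-art weather forecasting method.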
Climate change, population growth, and water scarcity present unprecedented challenges for agriculture. This project aims to forecast soil moisture using domain knowledge and machine learning for crop management decisions that enable sustainable farming. Traditional methods for predicting hydrological response features require significant computational time and expertise. Recent work has implemented machine learning models as a tool for forecasting hydrological response features, but these models neglect a crucial component of traditional hydrological modeling that spatially close units can have vastly different hydrological responses. In traditional hydrological modeling, units with similar hydrological properties are grouped together and share model parameters regardless of their spatial proximity. Inspired by this domain knowledge, we have constructed a novel domain-inspired temporal graph convolution neural network. Our approach involves clustering units based on time-varying hydrological properties, constructing graph topologies for each cluster, and forecasting soil moisture using graph convolutions and a gated recurrent neural network. We have trained, validated, and tested our method on field-scale time series data consisting of approximately 99,000 hydrological response units spanning 40 years in a case study in northeastern United States. Comparison with existing models illustrates the effectiveness of using domain-inspired clustering with time series graph neural networks. The framework is being deployed as part of a pro bono social impact program. The trained models are being deployed on small-holding farms in central Texas.
Modeling multivariate time series has long been a subject that has attracted researchers from a diverse range of fields including economics, finance, and traffic. A basic assumption behind multivariate time series forecasting is that its variables depend on one another but, upon looking closely, it's fair to say that existing methods fail to fully exploit latent spatial dependencies between pairs of variables. In recent years, meanwhile, graph neural networks (GNNs) have shown high capability in handling relational dependencies. GNNs require well-defined graph structures for information propagation which means they cannot be applied directly for multivariate time series where the dependencies are not known in advance. In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. Our approach automatically extracts the uni-directed relations among variables through a graph learning module, into which external knowledge like variable attributes can be easily integrated. A novel mix-hop propagation layer and a dilated inception layer are further proposed to capture the spatial and temporal dependencies within the time series. The graph learning, graph convolution, and temporal convolution modules are jointly learned in an end-to-end framework. Experimental results show that our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets and achieves on-par performance with other approaches on two traffic datasets which provide extra structural information.
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
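Predicting the number of infections during an epidemic response is extremely beneficial to governments in developing anti-epidemic strategies, especially for fine-grained geographic units. Previous work focuses on prediction at low spatial resolution, e.g. the county level, and preprocesses all data to the same geographic level, which loses some useful information. In this paper, we propose a fine-grained population mobility data-based model (FGC-COVID) that uses data at two geographic levels for community-level COVID-19 prediction. We use population mobility data between Census Block Groups (CBGs), a finer-grained geographic level than the community, to build a graph, and we capture the dependencies between CBGs with graph neural networks (GNNs). To mine patterns that are as fine-grained as possible for prediction, a spatial weighted aggregation module is introduced to aggregate the CBG embeddings to the community level based on their geographic affiliation and spatial autocorrelation. Extensive experiments on 300 days of LA COVID-19 data show that our model outperforms existing forecasting models for community-level COVID-19 prediction.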
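The idea of aggregating embeddings from a finer geographic level to a coarser one can be sketched as a weighted mean. The example below is not the FGC-COVID module: the abstract says its weights depend on geographic affiliation and spatial autocorrelation but gives no formula, so the fixed per-CBG weights here are a hypothetical stand-in, as are all names.

```python
from typing import Dict, List


def aggregate_to_communities(embeddings: Dict[str, List[float]],
                             membership: Dict[str, str],
                             weights: Dict[str, float]) -> Dict[str, List[float]]:
    """Weighted mean of CBG embeddings within each community (a sketch)."""
    sums: Dict[str, List[float]] = {}
    totals: Dict[str, float] = {}
    for cbg, vector in embeddings.items():
        community = membership[cbg]   # geographic affiliation of this CBG
        weight = weights[cbg]         # hypothetical importance of this CBG
        acc = sums.setdefault(community, [0.0] * len(vector))
        for i, value in enumerate(vector):
            acc[i] += weight * value
        totals[community] = totals.get(community, 0.0) + weight
    # Divide by the total weight; skip communities whose weights sum to zero.
    return {community: [v / totals[community] for v in acc]
            for community, acc in sums.items() if totals[community] > 0}


cbg_embeddings = {"cbg_a": [1.0, 0.0], "cbg_b": [3.0, 2.0], "cbg_c": [5.0, 5.0]}
cbg_to_community = {"cbg_a": "X", "cbg_b": "X", "cbg_c": "Y"}
cbg_weights = {"cbg_a": 1.0, "cbg_b": 3.0, "cbg_c": 2.0}

print(aggregate_to_communities(cbg_embeddings, cbg_to_community, cbg_weights))
# {'X': [2.5, 1.5], 'Y': [5.0, 5.0]}
```

Because the aggregation happens after the graph network has run on the fine-grained units, the community-level representation can still reflect differences between CBGs, rather than having them averaged away during preprocessing.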
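In recent years, graph neural networks (GNNs) combined with variants of recurrent neural networks (RNNs) have reached state-of-the-art performance in spatiotemporal forecasting tasks. In traffic forecasting, GNN models use the graph structure of road networks to account for the spatial correlation between links and nodes. Recent solutions are either based on complex graph operations or avoid predefined graphs. This paper proposes a new sequence-to-sequence architecture that extracts spatiotemporal correlations at multiple levels of abstraction using GNN-RNN cells with a sparse architecture, which decreases training time compared with more complex designs. Encoding the same input sequence through multiple encoders, with an incremental increase in encoder layers, enables the network to learn general and detailed information through multilevel abstraction. We further present a new benchmark dataset of street-level segment traffic data from Montreal, Canada. Unlike highways, urban road segments are cyclic and characterized by complicated spatial dependencies. Experimental results on our MSLTD street-level segment dataset show that our model improves one-hour prediction by more than 7% over the baseline methods while reducing computing resource requirements by more than half compared with competing methods.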