Smart Paper Notes

Understanding electricity prices beyond the merit order principle using explainable AI

Julius Trebbien , Leonardo Rydin Gorjão , Aaron Praktiknjo , Benjamin Schäfer , Dirk Witthaut

Category: Machine Learning

2022-12-09

Electricity prices in liberalized markets are determined by the supply and demand for electric power, which are in turn driven by various external influences that vary strongly in time. In perfect competition, the merit order principle states that dispatchable power plants enter the market in the order of their marginal costs to meet the residual load, i.e. the difference of load and renewable generation. Many market models implement this principle to predict electricity prices but typically require certain assumptions and simplifications. In this article, we present an explainable machine learning model for the prices on the German day-ahead market, which substantially outperforms a benchmark model based on the merit order principle. Our model is designed for the ex-post analysis of prices and thus builds on various external features. Using Shapley Additive exPlanation (SHAP) values, we can disentangle the role of the different features and quantify their importance from empirical data. Load, wind and solar generation are most important, as expected, but wind power appears to affect prices more strongly than solar power does. Fuel prices also rank highly and show nontrivial dependencies, including strong interactions with other features revealed by a SHAP interaction analysis. Large generation ramps are correlated with high prices, again with strong feature interactions, due to the limited flexibility of nuclear and lignite plants. Our results further contribute to model development by providing quantitative insights directly from data.
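
The merit order principle described in this abstract can be illustrated with a minimal sketch. This is not the benchmark model from the paper: the function name, the plant list, and the convention of returning a price of zero when renewables cover the whole load are all assumptions made for illustration.

```python
def merit_order_price(plants, load, renewables):
    """Return the marginal cost of the last plant needed to cover the residual load.

    plants     -- list of (marginal_cost, capacity) tuples (hypothetical data)
    load       -- total demand, in the same unit as the capacities
    renewables -- renewable generation, in the same unit as the capacities
    """
    # Residual load: the part of the load not covered by renewable generation.
    residual = max(load - renewables, 0.0)
    if residual == 0.0:
        # Assumption for this sketch: no dispatchable plant is needed, price is zero.
        return 0.0

    served = 0.0
    # Plants enter the market in ascending order of marginal cost.
    for marginal_cost, capacity in sorted(plants):
        served += capacity
        if served >= residual:
            # The last plant dispatched sets the price.
            return marginal_cost

    raise ValueError("dispatchable capacity cannot cover the residual load")
```

With the hypothetical plants `[(30.0, 10.0), (5.0, 20.0), (60.0, 15.0)]`, a load of 40 and renewable generation of 15, the residual load is 25, so the cheapest plant (capacity 20) is not enough and the second-cheapest plant sets the price.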

An Interpretable Probabilistic Model for Short-Term Solar Power Forecasting Using Natural Gradient Boosting

Georgios Mitrentsis , Hendrik Lens

Category: Machine Learning

2021-08-05

PV power forecasting models are predominantly based on machine learning algorithms which do not provide any insight into or explanation about their predictions (black boxes). Therefore, their direct implementation in environments where transparency is required, as well as the trust associated with their predictions, may be questioned. To this end, we propose a two-stage probabilistic forecasting framework able to generate highly accurate, reliable, and sharp forecasts while offering full transparency on both the point forecasts and the prediction intervals (PIs). In the first stage, we exploit natural gradient boosting (NGBoost) to yield probabilistic forecasts, while in the second stage, we calculate the Shapley additive explanation (SHAP) values in order to fully comprehend why a prediction was made. To highlight the performance and the applicability of the proposed framework, real data from two PV parks located in Southern Germany are employed. Comparative results with two state-of-the-art algorithms, namely Gaussian process and lower upper bound estimation, show a significant increase in the point forecast accuracy and in the overall probabilistic performance. Most importantly, a detailed analysis of the model's complex nonlinear relationships and interaction effects between the various features is presented. This allows interpreting the model, identifying some learned physical properties, explaining individual predictions, reducing the computational requirements for the training without jeopardizing the model accuracy, detecting possible bugs, and gaining trust in the model. Finally, we conclude that the model was able to develop complex nonlinear relationships which follow known physical properties as well as human logic and intuition.

Multivariate Probabilistic Forecasting of Intraday Electricity Prices using Normalizing Flows

Eike Cramer , Dirk Witthaut , Alexander Mitsos , Manuel Dahmen

Category: Machine Learning

2022-05-27

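Electricity is traded on various markets with different time horizons and regulations. Short-term trading is becoming increasingly important due to the higher penetration of renewables. In Germany, the intraday electricity price typically fluctuates around the day-ahead price of the EPEX spot market in a distinct hourly pattern. This work proposes a probabilistic modeling approach that models the difference between the intraday price and the day-ahead contracts. The model captures the emerging hourly pattern by treating the four 15-minute intervals within each day-ahead price interval as a four-dimensional joint distribution. The resulting multivariate price difference distribution is learned using a normalizing flow, i.e., a deep generative model that combines conditional multivariate density estimation and probabilistic regression. The normalizing flow is compared to a selection of historical data, a Gaussian copula, and a Gaussian regression model. Among the different models, the normalizing flow identifies the trends most accurately and has the narrowest prediction intervals. Notably, the normalizing flow is the only approach that identifies rare price peaks. Finally, this work discusses the influence of different external impact factors and finds that, individually, most of these factors have a negligible impact. Only the immediate history of the price difference realization and the combination of all input factors lead to notable improvements in the forecasts.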
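
The data arrangement described in this abstract, where the four 15-minute intervals of each day-ahead hour form one four-dimensional sample, might be sketched as follows. This only illustrates how the target vectors could be built; it does not implement the normalizing flow, and the function name and input layout are assumptions.

```python
def price_difference_vectors(intraday_15min, day_ahead_hourly):
    """Build one 4-dimensional price-difference vector per day-ahead hour.

    intraday_15min   -- flat list of quarter-hourly intraday prices (hypothetical input)
    day_ahead_hourly -- list of hourly day-ahead prices covering the same period
    """
    if len(intraday_15min) != 4 * len(day_ahead_hourly):
        raise ValueError("expected exactly four intraday prices per day-ahead hour")

    # For hour h, subtract the day-ahead price from each of its four quarter-hour prices.
    return [
        [intraday_15min[4 * h + q] - day_ahead for q in range(4)]
        for h, day_ahead in enumerate(day_ahead_hourly)
    ]
```

Each returned row is one sample of the four-dimensional joint distribution that a multivariate density model would then be fitted to.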

Artificial Intelligence and Design of Experiments for Assessing Security of Electricity Supply: A Review and Strategic Outlook

Jan Priesmann , Justin Münch , Elias Ridha , Thomas Spiegel , Marius Reich , Mario Adam , Lars Nolting , Aaron Praktiknjo

Category: Artificial Intelligence

2021-12-07

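Assessing the effects of the energy transition and the liberalization of energy markets on resource adequacy is an increasingly important and demanding task. The rising complexity of energy systems requires adequate methods for energy system modeling, which leads to increased computational requirements. Furthermore, uncertainty increases along with complexity, calling for probabilistic assessments and scenario analyses. To address these various requirements adequately and efficiently, new methods from the field of data science are needed to accelerate current methods. With our systematic literature review, we want to close the gap between three disciplines: (1) assessment of security of electricity supply, (2) artificial intelligence, and (3) design of experiments. To this end, we conduct a large-scale quantitative review of selected fields of application and produce a synthesis that relates the different disciplines to each other. Among other findings, we identify metamodeling of complex security of electricity supply models using AI methods, and applications of AI-based methods for forecasting storage dispatch and (non-)availabilities, as fields of application that have not yet been sufficiently covered. We conclude by deriving a new methodological pipeline for addressing the present and upcoming challenges in the assessment of security of electricity supply adequately and efficiently.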

Model Transparency and Interpretability : Survey and Application to the Insurance Industry

Dimitri Delcaillau , Antoine Ly , Alize Papp , Franck Vermet

Category: Machine Learning (Statistics) | Machine Learning

2022-09-01

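The use of models, even if efficient, must be accompanied by an understanding at all levels of the process that transforms data (upstream and downstream). Thus, there is a growing need to define the relationships between individual data and the choices that an algorithm could make based on its analysis (e.g. the recommendation of one product or one promotional offer, or an insurance rate representative of the risk). Model users must ensure that models do not discriminate and that their results can be explained. This paper introduces the importance of model interpretation and tackles the notion of model transparency. Within an insurance context, it specifically illustrates how some tools can be used to enforce the control of actuarial models, which can nowadays leverage machine learning. On a simple example of loss frequency estimation in car insurance, we show the value of some interpretability methods for adapting explanations to the target audience.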

Wholesale Electricity Price Forecasting using Integrated Long-term Recurrent Convolutional Network Model

Vasudharini Sridharan , Mingjian Tuo , Xingpeng Li

Category: Machine Learning

2021-12-23

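Electricity price is a key factor affecting the decision-making of all market participants. Accurate forecasting of electricity prices is very important, and it is also very challenging because electricity prices are highly volatile due to various factors. This paper proposes an integrated long-term recurrent convolutional network (ILRCN) model to predict electricity prices, taking into account the attributes that contribute most to the market price. The proposed ILRCN model combines the functionalities of a convolutional neural network and the long short-term memory (LSTM) algorithm with a proposed novel conditional error correction term. The combined ILRCN model can identify linear and nonlinear behavior within the input data. We use ERCOT wholesale market price data along with load profiles, temperature, and other factors to illustrate the proposed model. The performance of the proposed ILRCN electricity price forecasting model is verified using performance/evaluation metrics such as mean absolute error and accuracy. Case studies show that the proposed ILRCN model is accurate and efficient in electricity price forecasting compared with the support vector machine (SVM) model, the fully connected neural network model, the LSTM model, and the LRCN model.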

A Reinforcement Learning Approach for the Continuous Electricity Market of Germany: Trading from the Perspective of a Wind Park Operator

Malte Lehna , Björn Hoppmann , René Heinrich , Christoph Scholz

Category: Machine Learning

2021-11-26

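With the rising expansion of renewable energies, the intraday electricity markets have recorded a growing popularity among traders and electric utilities as a way to cope with the induced volatility of the energy supply. Through their short trading horizon and continuous nature, the intraday markets offer the ability to adjust trading decisions from the day-ahead market or to reduce trading risk at short notice. Producers of renewable energies use the intraday market to lower their forecast risk by modifying their provided capacities based on current forecasts. However, the market dynamics are complex because the power grids have to remain stable and electricity is only partly storable. Consequently, robust and intelligent trading strategies are required that are capable of operating in the intraday market. In this work, we propose a novel autonomous trading approach based on Deep Reinforcement Learning (DRL) algorithms as a possible solution. For this purpose, we model intraday trading as a Markov Decision Problem (MDP) and employ the Proximal Policy Optimization (PPO) algorithm as our DRL approach. A simulation framework is introduced that enables trading of the continuous intraday price at a resolution of one-minute steps. We test our framework in a case study from the perspective of a wind park operator. In addition to general trade information, we include both price and wind forecasts. On a test scenario of German intraday trading results from 2018, we are able to outperform multiple baselines with an improvement of at least 45.24%, showing the advantage of the DRL algorithm. However, we also discuss limitations and enhancements of the DRL agent in order to increase performance in future work.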

Comparison and Evaluation of Methods for a Predict+Optimize Problem in Renewable Energy

Christoph Bergmeir , Frits de Nijs , Abishek Sriramulu , Mahdi Abolghasemi , Richard Bean , John Betts , Quang Bui , Nam Trong Dinh , Nils Einecke , Rasul Esmaeilbeigi

Category: Artificial Intelligence

2022-12-21

Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.

Macroeconomic Predictions using Payments Data and Machine Learning

James T. E. Chapman , Ajit Desai

Category: Machine Learning | Machine Learning (Statistics)

2022-09-02

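Predicting the economy's short-term dynamics, a vital input to economic agents' decision-making, often relies on lagged indicators in linear models. This is typically sufficient during normal times but could prove inadequate during crisis periods. This paper aims to demonstrate that non-traditional and timely data such as retail and wholesale payments, with the aid of nonlinear machine learning approaches, can provide policymakers with sophisticated models to accurately estimate key macroeconomic indicators in near real time. Moreover, we provide a set of econometric tools to mitigate overfitting and interpretability challenges in machine learning models and thereby improve their effectiveness for policy use. Our models with payments data, nonlinear methods, and tailored cross-validation approaches help improve macroeconomic nowcasting accuracy by up to 40%, with higher gains during the COVID-19 period. We observe that the contribution of payments data to economic predictions is small and linear during low and normal growth periods. However, the contribution of payments data is large, asymmetrical, and nonlinear during periods of strong negative or positive growth.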

Causal Effect Estimation with Global Probabilistic Forecasting: A Case Study of the Impact of Covid-19 Lockdowns on Energy Demand

Ankitha Nandipura Prasanna , Priscila Grecov , Angela Dieyu Weng , Christoph Bergmeir

Category: Machine Learning | Artificial Intelligence

2022-09-19

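The electricity industry is heavily implementing smart grid technologies to improve reliability, availability, security, and efficiency. This implementation requires technological advancements, the development of standards and regulations, as well as testing and planning. Smart grid load forecasting and management are critical for reducing demand volatility and improving the market mechanism that connects generators, distributors, and retailers. During policy implementations or external interventions, it is necessary to analyse the uncertainty of their impact on electricity demand so that the system can respond more accurately to fluctuating demand. This paper analyses the uncertainty of the impact of external interventions on electricity demand. It implements a framework that combines probabilistic and global forecasting models, using a deep learning approach to estimate the distribution of the causal impact of an intervention. The causal effect is assessed by predicting the counterfactual outcome distribution for the affected instances and then contrasting it with the real outcomes. We consider the impact of COVID-19 lockdowns on energy usage as a case study to evaluate the non-uniform effect of this intervention on the electricity demand distribution. We show that during the initial lockdowns in Australia and some European countries, the troughs often decreased more than the peaks, while the mean remained almost unaffected.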
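
The contrast between a counterfactual forecast distribution and the real outcome, as described in this abstract, could be sketched as below. This is an illustrative assumption and not the authors' framework: the probabilistic forecasting model is replaced by a plain list of counterfactual samples, and the function name and the chosen summary quantiles are hypothetical.

```python
from statistics import mean, quantiles


def causal_effect_distribution(actual, counterfactual_samples):
    """Summarize the effect distribution: actual outcome minus counterfactual samples.

    actual                 -- observed demand during the intervention
    counterfactual_samples -- sampled forecasts of demand had there been no
                              intervention (hypothetical model output)
    """
    # One effect estimate per counterfactual sample.
    effects = [actual - sample for sample in counterfactual_samples]

    # quantiles(..., n=20) returns 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(effects, n=20)
    return {
        "mean": mean(effects),
        "p05": cuts[0],
        "p50": cuts[9],
        "p95": cuts[18],
    }
```

Applying the same function separately to daily troughs, daily peaks, and daily means would give the kind of non-uniform comparison the abstract reports.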

A Hybrid Statistical-Machine Learning Approach for Analysing Online Customer Behavior: An Empirical Study

Saed Alizami , Kasun Bandara , Ali Eshragh , Foaad Iravani

Category: Machine Learning

2022-12-01

We apply classical statistical methods in conjunction with state-of-the-art machine learning techniques to develop a hybrid interpretable model to analyse the behavior of 454,897 online customers for a particular product category at the largest online retailer in China, namely JD. While most pure machine learning methods are plagued by a lack of interpretability in practice, our novel hybrid approach addresses this practical issue by generating explainable output. This analysis involves identifying which features and characteristics have the most significant impact on customers' purchase behavior, thereby enabling us to predict future sales with a high level of accuracy and identify the most impactful variables. Our results reveal that customers' product choice is insensitive to the promised delivery time, but this factor significantly impacts customers' order quantity. We also show that the effectiveness of various discounting methods depends on the specific product and the discount size. We identify product classes for which certain discounting approaches are more effective and provide recommendations on better use of different discounting tools. Customers' choice behavior across different product classes is mostly driven by price and, to a lesser extent, by customer demographics. The former finding calls for care in deciding when and how much discount should be offered, whereas the latter identifies opportunities for personalized ads and targeted marketing. Further, to curb customers' batch ordering behavior and avoid the undesirable Bullwhip effect, JD should improve its logistics to ensure faster delivery of orders.

Explainable AI for clinical and remote health applications: a survey on tabular and time series data

Flavio Di Martino , Franca Delmastro

Category: Machine Learning | Artificial Intelligence

2022-09-14

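Nowadays, Artificial Intelligence (AI) has become a fundamental component of clinical and remote healthcare applications, but the best-performing AI systems are often too complex to be self-explaining. Explainable AI (XAI) techniques are designed to unveil the reasoning behind a system's predictions and decisions, and they become even more critical when dealing with sensitive and personal health data. It is worth noting that XAI has not received the same attention across different research areas and data types, especially in healthcare. In particular, many clinical and remote health applications are based on tabular and time series data, respectively, yet XAI is not commonly analysed on these data types, while computer vision and Natural Language Processing (NLP) are the reference applications. To provide an overview of the XAI methods that are most suitable for tabular and time series data in the healthcare domain, this paper reviews the literature of the last 5 years, illustrating the types of explanations generated and the efforts made to evaluate their relevance and quality. Specifically, we identify clinical validation, consistency assessment, objective and standardised quality evaluation, and human-centered quality assessment as key features for ensuring effective explanations for end users. Finally, we highlight the main research challenges in the field as well as the limitations of existing XAI methods.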

A Meta-Analysis of Solar Forecasting Based on Skill Score

Thi Ngoc Nguyen , Felix Müsgens

Category: Machine Learning

2022-08-22

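We conduct the first comprehensive meta-analysis of deterministic solar forecasting based on skill score, screening 1,447 papers from Google Scholar and reviewing the full texts of 320 papers for data extraction. A database of 4,758 points was built and analyzed with multivariate adaptive regression spline modelling, partial dependence plots, and linear regression. Notably, the analysis accounts for the most important nonlinear relationships and interaction terms in the data. We quantify the impact on forecast accuracy of important variables such as forecast horizon, resolution, climate conditions, a region's annual solar irradiance level, power system size and capacity, forecast models, train and test sets, and the use of different techniques and inputs. By controlling for the key differences between forecasts, including location variables, the findings of the analysis can be applied globally. An overview of scientific progress in the field is also provided.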

Choice modelling in the age of machine learning - discussion paper

S. Van Cranenburgh , S. Wang , A. Vij , F. Pereira , J. Walker

Category: Machine Learning

2021-01-28

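Since its inception, the choice modelling field has been dominated by theory-driven modelling approaches. Machine learning offers an alternative, data-driven approach for modelling behaviour and is increasingly drawing interest in our field. Cross-pollination of machine learning models, techniques, and practices could help overcome problems and limitations encountered in the current theory-driven modelling paradigm, such as subjective, labour-intensive search processes for model selection and the inability to work with text and image data. However, despite the potential benefits of using advances in machine learning to improve choice modelling practices, the choice modelling field has been hesitant to embrace machine learning. This discussion paper aims to consolidate knowledge on the use of machine learning models, techniques, and practices for choice modelling, and to discuss their potential. In doing so, we hope not only to make the case that further integration of machine learning into choice modelling is beneficial, but also to further facilitate it. To this end, we clarify the similarities and differences between the two modelling paradigms; we review the use of machine learning for choice modelling; and we explore areas of opportunity for embracing machine learning models and techniques to improve our practices. To conclude this discussion paper, we put forward a set of research questions that must be addressed to better understand how machine learning can benefit choice modelling.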

A Concurrent CNN-RNN Approach for Multi-Step Wind Power Forecasting

Syed Kazmi , Berk Gorgulu , Mucahit Cevik , Mustafa Gokce Baydogan

Category: Machine Learning

2023-01-02

Wind power forecasting helps with the planning for the power systems by contributing to having a higher level of certainty in decision-making. Due to the randomness inherent to meteorological events (e.g., wind speeds), making highly accurate long-term predictions for wind power can be extremely difficult. One approach to remedy this challenge is to utilize weather information from multiple points across a geographical grid to obtain a holistic view of the wind patterns, along with temporal information from the previous power outputs of the wind farms. Our proposed CNN-RNN architecture combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to extract spatial and temporal information from multi-dimensional input data to make day-ahead predictions. In this regard, our method incorporates an ultra-wide learning view, combining data from multiple numerical weather prediction models, wind farms, and geographical locations. Additionally, we experiment with global forecasting approaches to understand the impact of training the same model over the datasets obtained from multiple different wind farms, and we employ a method where spatial information extracted from convolutional layers is passed to a tree ensemble (e.g., Light Gradient Boosting Machine (LGBM)) instead of fully connected layers. The results show that our proposed CNN-RNN architecture outperforms other models such as LGBM, Extra Tree regressor and linear regression when trained globally, but fails to replicate such performance when trained individually on each farm. We also observe that passing the spatial information from CNN to LGBM improves its performance, providing further evidence of CNN's spatial feature extraction capabilities.

Explainability Is in the Mind of the Beholder: Establishing the Foundations of Explainable Artificial Intelligence

Kacper Sokol , Peter Flach

Category: Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2021-12-29

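Explainable artificial intelligence and interpretable machine learning are research fields of growing importance. Yet, the underlying concepts remain somewhat elusive and lack generally agreed definitions. While recent inspiration from the social sciences has refocused the work on the needs and expectations of human recipients, the field still lacks a concrete conceptualisation. We take steps towards addressing this challenge by reviewing the philosophical and social foundations of human explainability, which we then translate into the technological realm. In particular, we scrutinise the notion of algorithmic black boxes and the spectrum of understanding determined by explanatory processes and explainees' background knowledge. This approach allows us to define explainability as (logical) reasoning applied to transparent insights (into black boxes) interpreted under certain background knowledge, a process that engenders understanding in explainees. We then employ this conceptualisation to revisit the much-disputed trade-off between transparency and predictive power, its implications for ante-hoc and post-hoc explainers, and the fairness and accountability engendered by explainability. We furthermore discuss components of the machine learning workflow that may need interpretability, building on a range of ideas from human-centred explainability, with a focus on explainees, contrastive statements, and explanatory processes. Our discussion reconciles and complements current research to help better navigate open questions, rather than attempting to address any individual issue, thus laying a solid foundation for a grounded discussion and future progress of explainable artificial intelligence and interpretable machine learning. We conclude with a summary of our findings, revisiting the human-centred explanatory process needed to achieve the desired level of algorithmic transparency.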

A Time Series Approach to Explainability for Neural Nets with Applications to Risk-Management and Fraud Detection

Marc Wildi , Branka Hadji Misheva

Category: Machine Learning

2022-12-06

Artificial intelligence is creating one of the biggest revolutions across technology-driven application fields. For the finance sector, it offers many opportunities for significant market innovation, and yet broad adoption of AI systems heavily relies on our trust in their outputs. Trust in technology is enabled by understanding the rationale behind the predictions made. To this end, the concept of eXplainable AI emerged, introducing a suite of techniques attempting to explain to users how complex models arrived at a certain decision. For cross-sectional data, classical XAI approaches can lead to valuable insights about the models' inner workings, but these techniques generally cannot cope well with longitudinal data (time series) in the presence of dependence structure and non-stationarity. We here propose a novel XAI technique for deep learning methods which preserves and exploits the natural time ordering of the data.

A Generic Methodology for the Statistically Uniform & Comparable Evaluation of Automated Trading Platform Components

Artur Sokolovsky , Luca Arnaboldi

Category: Machine Learning

2020-09-21

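Although machine learning approaches have been widely used in the field of finance, to very successful degrees, these approaches remain bespoke to specific investigations and opaque in terms of explainability, comparability, and reproducibility. The primary objective of this research is to shed light on this field by providing a generic methodology that is investigation-agnostic and interpretable to a financial markets practitioner, thus enhancing their efficiency, reducing barriers to entry, and increasing the reproducibility of experiments. The proposed methodology is showcased on two automated trading platform components, namely price levels, a well-known trading pattern, and a novel 2-step feature extraction method. The methodology relies on hypothesis testing, which is widely applied in other social and scientific disciplines, to effectively evaluate concrete results beyond simple classification accuracy. The main hypothesis was formulated to evaluate whether the selected trading pattern is suitable for use in a machine learning setting. Throughout the experiments, we found that the use of the considered trading pattern in a machine learning setting is only partially supported by statistics, resulting in insignificant effect sizes (Rebound 7 - $0.64 \pm 1.02$, Rebound 11 $0.38 \pm 0.98$, and Rebound 15 - $1.05 \pm 1.16$), but allowing the rejection of the null hypothesis. We showcase the generic methodology on a US futures market instrument and provide evidence that with this methodology we can easily obtain informative metrics beyond the more traditional performance and profitability metrics. This work is one of the first to apply this rigorous, statistically backed approach to the field of financial markets, and we hope it may be a springboard for more research.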

Automatic Identification and Classification of Share Buybacks and their Effect on Short-, Mid- and Long-Term Returns

Thilo Reintjes

Category: Artificial Intelligence | Machine Learning

2022-09-26

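This thesis investigates share buybacks, specifically share buyback announcements. It addresses how to recognize such announcements, the excess return of share buybacks, and the prediction of returns after a share buyback announcement. We illustrate two NLP approaches for the automated detection of share buyback announcements. Even with very small amounts of training data, we can achieve an accuracy of up to 90%. The thesis uses these NLP methods to generate a large dataset consisting of 57,155 share buyback announcements. By analyzing this dataset, the thesis aims to show that most companies that announce a share buyback underperform the MSCI World. A minority of companies, however, significantly outperform the MSCI World. This significant overperformance leads to a net gain when looking at the averages of all companies. If the benchmark index is adjusted for the respective size of the companies, the average overperformance disappears, and the majority underperforms even more. However, companies that announce a share buyback with a volume of at least 1% of their market capitalization deliver, on average, a significant overperformance, even when an adjusted benchmark is used. It was also found that companies that announce share buybacks in times of crisis fare better than the overall market. Additionally, the generated dataset was used to train 72 machine learning models. Through this, it was possible to find many strategies that achieve an accuracy of up to 77% and generate large excess returns. A variety of performance indicators could be improved across six different time frames, and a significant overperformance was identified. This was achieved by training several models for different tasks and time frames and by combining these different models, generating a significant improvement by fusing weak learners to create one strong learner.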

Predicting Swarm Equatorial Plasma Bubbles Via Supervised Machine Learning

S. Reddy , C. Forsyth , A. Aruliah , D. Kataria , A. Smith , J. Bortnik , E. Aa , G. Lewis

Category: Machine Learning

2022-09-27

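Equatorial Plasma Bubbles (EPBs) are plumes of low-density plasma that rise from the bottom of the F layer towards the exosphere. EPBs are a known cause of radio wave scintillations, which can degrade communications with spacecraft. We build a random forest regressor to predict and forecast the probability of an EPB [0-1] detected by the IBI processor on board the Swarm spacecraft. We use 8 years of Swarm data from 2014 to 2021 and transform the data from a time series into a 5-dimensional space consisting of latitude, longitude, MLT, year, and day of the year. We also add Kp, F10.7 cm, and solar wind speed. The observation of EPBs with respect to geolocation, local time, season, and solar activity mostly agrees with existing work, whilst the link to geomagnetic activity is less clear. The prediction has an accuracy of 88% and performs well across the EPB-specific spatiotemporal scales. This shows that the XGBoost method is able to successfully capture the climatological and daily variability of Swarm EPBs. Capturing the daily variance has long evaded researchers because of local and stochastic features within the ionosphere. We take advantage of Shapley values to explain the model and to gain insight into the physics of EPBs. We find that the probability of an EPB decreases as the solar wind speed increases. We also identify a spike in EPB probability. Both of these insights were derived directly from the XGBoost and Shapley technique.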
