We present an Equation/Variable-Free Machine Learning (EVFML) framework for the control of the collective dynamics of complex/multiscale systems modelled via microscopic/agent-based simulators. The approach obviates the need for constructing surrogate, reduced-order models. The proposed implementation consists of three steps: (a) from high-dimensional agent-based simulations, machine learning (in particular, nonlinear manifold learning via Diffusion Maps (DMs)) helps identify a set of coarse-grained variables that parametrize the low-dimensional manifold on which the emergent/collective dynamics evolve; the mappings from the high-dimensional input space to the low-dimensional manifold and back are constructed by coupling DMs with the Nystrom extension and Geometric Harmonics; (b) having identified the manifold and its coordinates, we exploit the Equation-Free approach to perform numerical bifurcation analysis of the emergent dynamics; (c) based on the previous steps, we design data-driven, embedded wash-out controllers that drive the agent-based simulators to their intrinsic, imprecisely known, emergent open-loop unstable steady states, thus demonstrating that the scheme is robust against numerical approximation errors and modelling uncertainty. The framework is illustrated by controlling the emergent dynamics of (i) an agent-based model of traffic dynamics and (ii) the equilibria of a stochastic financial-market agent-based model with mimesis.
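As a rough illustration of step (a), the sketch below applies plain Diffusion Maps to synthetic high-dimensional snapshots that secretly lie on a one-dimensional curve. It is not the authors' implementation; the toy data, the bandwidth heuristic, and the omission of the Nystrom extension / Geometric Harmonics mappings are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "agent-based" snapshots: 400 high-dimensional states that in fact lie on a 1-D curve.
theta = np.sort(rng.uniform(0.0, 2.0 * np.pi, 400))
X = np.cos(np.outer(theta, np.arange(1, 201) / 50.0)) + 0.01 * rng.normal(size=(400, 200))

def diffusion_maps(X, eps, n_coords=2):
    """Return the leading non-trivial diffusion coordinates of the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T      # pairwise squared distances
    K = np.exp(-np.maximum(d2, 0.0) / eps)               # Gaussian kernel
    q = K.sum(axis=1)
    K = K / np.outer(q, q)                                # alpha = 1 density normalization
    P = K / K.sum(axis=1, keepdims=True)                  # row-stochastic Markov matrix
    evals, evecs = np.linalg.eig(P)
    order = np.argsort(-evals.real)
    evals, evecs = evals.real[order], evecs.real[:, order]
    return evecs[:, 1:n_coords + 1] * evals[1:n_coords + 1]   # skip the trivial eigenvector

eps = np.median(np.sum((X - X.mean(axis=0)) ** 2, axis=1))    # crude bandwidth heuristic
psi = diffusion_maps(X, eps)
print(psi.shape)   # (400, 2): one pair of coarse coordinates per simulation snapshot
```

In the full framework, the retained diffusion coordinates would serve as the coarse variables on which bifurcation analysis and the wash-out controllers of steps (b) and (c) operate.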
Graph Neural Networks (GNNs) achieve state-of-the-art performance on graph-structured data across numerous domains. Their underlying ability to represent nodes as summaries of their vicinities has proven effective for homophilous graphs in particular, in which same-type nodes tend to connect. On heterophilous graphs, in which different-type nodes are likely connected, GNNs perform less consistently, as neighborhood information might be less representative or even misleading. On the other hand, GNN performance is not inferior on all heterophilous graphs, and there is a lack of understanding of what other graph properties affect GNN performance. In this work, we highlight the limitations of the widely used homophily ratio and the recent Cross-Class Neighborhood Similarity (CCNS) metric in estimating GNN performance. To overcome these limitations, we introduce 2-hop Neighbor Class Similarity (2NCS), a new quantitative graph structural property that correlates with GNN performance more strongly and consistently than alternative metrics. 2NCS considers two-hop neighborhoods as a theoretically derived consequence of the two-step label propagation process governing GCN's training-inference process. Experiments on one synthetic and eight real-world graph datasets confirm consistent improvements over existing metrics in estimating the accuracy of GCN- and GAT-based architectures on the node classification task.
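The exact definition of 2NCS is given in the paper; the sketch below only illustrates the general flavor of a neighborhood-label-similarity statistic computed over two-hop neighborhoods (CCNS-style cosine similarity of class histograms). The aggregation, normalization, and handling of one-hop versus exactly-two-hop neighbors are assumptions and may differ from the paper's metric.

```python
import numpy as np

def two_hop_class_similarity(A, labels, n_classes):
    """Mean cosine similarity of 2-hop-neighborhood class histograms, per class pair."""
    A = (A > 0).astype(float)
    reach = ((A + A @ A) > 0).astype(float)   # nodes reachable within two hops
    np.fill_diagonal(reach, 0.0)
    onehot = np.eye(n_classes)[labels]
    hist = reach @ onehot                      # per-node class counts in the 2-hop ball
    hist = hist / np.clip(np.linalg.norm(hist, axis=1, keepdims=True), 1e-12, None)
    S = np.zeros((n_classes, n_classes))
    for c in range(n_classes):
        for cp in range(n_classes):
            S[c, cp] = (hist[labels == c] @ hist[labels == cp].T).mean()
    return S

# Tiny example: a 6-node graph with two classes.
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]])
labels = np.array([0, 0, 0, 1, 1, 1])
print(two_hop_class_similarity(A, labels, n_classes=2))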
The proliferation of radical online communities and their violent offshoots has sparked great societal concern. However, the current practice of banning such communities from mainstream platforms has unintended consequences: (i) the further radicalization of their members on the fringe platforms to which they migrate; and (ii) the spillover of harmful content from fringe platforms back onto mainstream ones. Here, in a large observational study on two banned subreddits, r/The\_Donald and r/fatpeoplehate, we examine how factors associated with the RECRO radicalization framework relate to users' migration decisions. Specifically, we quantify how these factors affect users' decisions to post on fringe platforms and, for those who do, whether they continue posting on the mainstream platform. Our results show that individual-level factors, those relating to the behavior of users, are associated with the decision to post on the fringe platform, whereas social-level factors, users' connection with the radical community, only affect the propensity to be coactive on both platforms. Overall, our findings pave the way for evidence-based moderation policies, as the decisions to migrate and remain coactive amplify unintended consequences of community bans.
One of the major challenges in Deep Reinforcement Learning for control is the need for extensive training to learn the policy. Motivated by this, we present the design of the Control-Tutored Deep Q-Networks (CT-DQN) algorithm, a Deep Reinforcement Learning algorithm that leverages a control tutor, i.e., an exogenous control law, to reduce learning time. The tutor can be designed using an approximate model of the system, without any assumption about the knowledge of the system's dynamics. There is no expectation that it will be able to achieve the control objective if used stand-alone. During learning, the tutor occasionally suggests an action, thus partially guiding exploration. We validate our approach on three scenarios from OpenAI Gym: the inverted pendulum, lunar lander, and car racing. We demonstrate that CT-DQN is able to achieve better or equivalent data efficiency with respect to the classic function approximation solutions.
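A minimal sketch of the tutor-guided exploration idea is given below: with some probability the exogenous control law proposes the action, otherwise standard epsilon-greedy selection over the learned Q-values is used. The trigger rule, probabilities, and the toy pendulum-style tutor are illustrative assumptions, not the CT-DQN algorithm as specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def control_tutor(state):
    """Hypothetical approximate control law for an inverted pendulum: push against the tilt."""
    angle = state[0]
    return 1 if angle > 0 else 0

def select_action(q_values, state, eps=0.1, p_tutor=0.2):
    if rng.random() < p_tutor:        # occasionally defer to the tutor
        return control_tutor(state)
    if rng.random() < eps:            # usual epsilon-greedy exploration
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Usage with dummy Q-values and a dummy state (angle, angular velocity).
print(select_action(q_values=np.array([0.3, 0.7]), state=np.array([0.05, -0.1])))
```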
We investigate the sample complexity of learning the optimal arm for multi-task bandit problems. Arms consist of two components: one that is shared across tasks (that we call representation) and one that is task-specific (that we call predictor). The objective is to learn the optimal (representation, predictor)-pair for each task, under the assumption that the optimal representation is common to all tasks. Within this framework, efficient learning algorithms should transfer knowledge across tasks. We consider the best-arm identification problem for a fixed confidence, where, in each round, the learner actively selects both a task, and an arm, and observes the corresponding reward. We derive instance-specific sample complexity lower bounds satisfied by any $(\delta_G,\delta_H)$-PAC algorithm (such an algorithm identifies the best representation with probability at least $1-\delta_G$, and the best predictor for a task with probability at least $1-\delta_H$). We devise an algorithm OSRL-SC whose sample complexity approaches the lower bound, and scales at most as $H(G\log(1/\delta_G)+ X\log(1/\delta_H))$, with $X,G,H$ being, respectively, the number of tasks, representations and predictors. By comparison, this scaling is significantly better than the classical best-arm identification algorithm that scales as $HGX\log(1/\delta)$.
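To make the scaling comparison concrete, the back-of-the-envelope calculation below plugs illustrative (assumed) values of $X$, $G$, $H$, and the confidence parameters into the two expressions. These are only the orders appearing in the bounds, not actual sample counts.

```python
import math

X, G, H = 50, 20, 10                 # tasks, representations, predictors (assumed values)
delta = delta_G = delta_H = 0.05

osrl_sc = H * (G * math.log(1 / delta_G) + X * math.log(1 / delta_H))
classic = H * G * X * math.log(1 / delta)
print(f"OSRL-SC scaling ~ {osrl_sc:,.0f}   classic ~ {classic:,.0f}")
```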
In recent years, there has been a growing interest in the effects of data poisoning attacks on data-driven control methods. Poisoning attacks are well known to the Machine Learning community; however, that work relies on assumptions, such as cross-sample independence, that in general do not hold for linear dynamical systems. Consequently, these systems require different attack and detection methods than those developed for supervised learning problems in the i.i.d. setting. Since most data-driven control algorithms make use of the least-squares estimator, we study how poisoning impacts the least-squares estimate through the lens of statistical testing, and investigate in what way data poisoning attacks can be detected. We establish under which conditions the set of models compatible with the data includes the true model of the system, and we analyze different poisoning strategies for the attacker. On the basis of these arguments, we propose a stealthy data poisoning attack on the least-squares estimator that can escape classical statistical tests, and conclude by showing the efficiency of the proposed attack.
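The sketch below is not the paper's stealthy attack; it only shows the basic sensitivity being exploited, namely how the least-squares estimate of a scalar system $x_{t+1} = a x_t + w_t$ shifts when a few samples are perturbed. The noise level, horizon, and naive corruption are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a_true, T = 0.8, 200
x = np.zeros(T + 1)
for t in range(T):
    x[t + 1] = a_true * x[t] + 0.1 * rng.normal()

def ls_estimate(x):
    """Ordinary least-squares estimate of a from pairs (x_t, x_{t+1})."""
    X, Y = x[:-1], x[1:]
    return (X @ Y) / (X @ X)

x_poisoned = x.copy()
x_poisoned[50:55] += 0.5          # naive corruption of a short window

print("clean estimate   :", round(ls_estimate(x), 3))
print("poisoned estimate:", round(ls_estimate(x_poisoned), 3))
```

A stealthy attack, as studied in the paper, would instead shape such perturbations so that standard residual-based statistical tests do not flag them.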
Covid-19 has had disproportionate mental health consequences for the public at different stages of the pandemic. We use a computational approach to capture the specific aspects that trigger an online community's anxiety about the pandemic and to study how these aspects change over time. First, we used topic analysis to identify nine sources of anxiety (SOAs) in a sample of Reddit posts from r/COVID19\_support ($n$ = 86). We then automatically labelled SOAs in a larger chronological sample ($n$ = 6,535) with a classifier trained on a manually annotated sample of Reddit users' anxieties ($n$ = 793). The nine SOAs align with items in a recently developed pandemic anxiety measurement scale. We observe that Reddit users' concerns about health risks remained high during the first eight months of the pandemic; these concerns then diminished considerably, even though surges in cases occurred later. In general, as the pandemic progressed, users' language disclosed SOAs with decreasing intensity; however, concerns about mental health and the future grew steadily throughout the period covered by this study. People also tended to use more intense language to describe mental health concerns than health risks or death concerns. Our results suggest that, even as Covid-19 gradually receded as a health threat thanks to appropriate countermeasures, the mental health conditions of this online group did not necessarily improve. Our system lays a foundation for population health and epidemiology scholars to examine, in a timely manner, the aspects that trigger pandemic anxiety.
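The authors' annotation scheme and classifier are their own; the sketch below only shows a generic way such manually annotated posts could be used to propagate SOA labels to a larger sample (TF-IDF features plus a linear classifier). The texts and label names are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder annotated posts and SOA labels (hypothetical examples).
annotated_texts = [
    "worried I caught it at work, my chest feels tight",
    "can't stop thinking about losing my job next month",
    "my grandmother is in the hospital and visits are banned",
    "I feel so isolated, haven't seen friends in weeks",
]
annotated_soas = ["health_risk", "financial", "loved_ones", "mental_health"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(annotated_texts, annotated_soas)

# Propagate labels to the (much larger) unannotated chronological sample.
print(clf.predict(["scared of getting sick on the bus"]))
```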
Online platforms face pressure to keep their communities civil and respectful. Bans of problematic online communities from mainstream platforms such as Reddit and Facebook are therefore usually met with enthusiastic public reactions. However, this strategy can lead users to migrate to alternative fringe platforms with lower moderation standards, where antisocial behaviors such as trolling and harassment are widely accepted. As users of these communities often remain coactive on both mainstream and fringe platforms, antisocial behavior may spill over onto the mainstream platform. We study this possible spillover by analyzing 70,000 users from three banned communities that migrated to fringe platforms: r/The\_Donald, r/GenderCritical, and r/Incels. Using a difference-in-differences design, we contrast coactive users with matched counterparts to estimate the causal effect of fringe-platform participation on users' antisocial behavior on Reddit. Our results show that participating in fringe communities increases users' toxicity on Reddit (as measured by the Perspective API) and their engagement with subreddits similar to the banned community, which often also violate platform norms. The effect intensifies over time and with exposure to the fringe platform. In short, we find evidence that antisocial behavior spills over from fringe platforms onto Reddit through co-participation.
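For readers unfamiliar with the design, the sketch below shows the basic difference-in-differences regression on synthetic data: the coefficient on the treatment-by-period interaction is the DiD estimate. The variables, effect sizes, and lack of matching or user fixed effects are simplifying assumptions, not the study's specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
treated = rng.integers(0, 2, n)   # 1 = user became active on the fringe platform
post = rng.integers(0, 2, n)      # 1 = observation after the migration date
toxicity = 0.10 + 0.02 * treated + 0.01 * post + 0.05 * treated * post + 0.05 * rng.normal(size=n)

df = pd.DataFrame({"toxicity": toxicity, "treated": treated, "post": post})
model = smf.ols("toxicity ~ treated * post", data=df).fit()
print(model.params["treated:post"])   # ~0.05, the simulated spillover effect
```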
Since their introduction, graph attention networks have achieved outstanding results in graph representation learning tasks. However, these networks consider only pairwise relationships between nodes and therefore cannot fully exploit the higher-order interactions present in many real-world datasets. In this paper, we introduce Cell Attention Networks (CANs), a neural architecture operating on data defined over the nodes of a graph, representing the graph as the 1-skeleton of a cell complex introduced to capture higher-order interactions. In particular, we exploit the lower and upper neighborhoods in the cell complex to design two independent masked self-attention mechanisms, generalizing the conventional graph attention strategy. The approach used in CANs is hierarchical and incorporates the following steps: (i) a lifting algorithm that learns edge features from node features; (ii) a cell attention mechanism that finds the optimal combination of edge features over both lower and upper neighbors; (iii) a hierarchical edge pooling mechanism to extract a compact and meaningful set of features. Experimental results show that CAN is a low-complexity strategy that compares favorably with state-of-the-art results on graph-based learning tasks.
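The toy sketch below illustrates only one ingredient of the above, heavily simplified: a single masked self-attention step over edge features, where each edge attends to its "lower neighbors" (edges sharing an endpoint). The lifting from node features, the upper (cell) neighborhoods, the pooling hierarchy, and the learned attention parametrization are all omitted, so this is not the CAN architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # a triangle plus a pendant edge
E, d = len(edges), 4
H = rng.normal(size=(E, d))                # edge features (random placeholders)

# Lower-neighborhood mask: edges i and j are neighbors if they share a node.
mask = np.array([[len(set(ei) & set(ej)) > 0 and i != j for j, ej in enumerate(edges)]
                 for i, ei in enumerate(edges)])

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
scores = (H @ Wq) @ (H @ Wk).T / np.sqrt(d)
scores = np.where(mask, scores, -np.inf)   # attend only to lower neighbors
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn = attn / attn.sum(axis=1, keepdims=True)
H_out = attn @ (H @ Wv)
print(H_out.shape)                         # (4, 4): updated edge features
```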
Accurate prognosis for traumatic brain injury (TBI) patients is crucial yet difficult, and is needed to inform treatment, patient management, and long-term care. Patient characteristics such as age, motor and pupil responsiveness, hypoxia and hypotension, and radiological findings on computed tomography (CT) have been identified as important variables for TBI outcome prediction. CT is the acute imaging modality of choice in clinical practice because of its acquisition speed and widespread availability. However, this modality is mainly used for qualitative and semi-quantitative assessment, such as the Marshall scoring system, which is prone to subjectivity and human error. This work explores the predictive power of imaging biomarkers extracted from routinely acquired hospital admission CT scans using a state-of-the-art, deep-learning TBI lesion segmentation method. We use lesion volumes and corresponding lesion statistics as inputs to an extended TBI outcome prediction model. We compare the predictive power of our proposed features with the Marshall score, alone and paired with classical TBI biomarkers. We find that automatically extracted quantitative CT features perform similarly to or better than the Marshall score in predicting unfavorable TBI outcomes. Leveraging automatic atlas alignment, we also identify frontal extra-axial lesions as important indicators of poor prognosis. Our work may contribute to a better understanding of TBI and provide new insights into how automated neuroimaging analysis can be used to improve prognostication after TBI.
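A schematic sketch of the modeling idea is shown below: automatically extracted lesion volumes are combined with classical admission variables in a simple classifier of unfavorable versus favorable outcome. The synthetic data, variable choices, effect sizes, and use of plain logistic regression are assumptions for illustration, not the study's model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 300
age           = rng.uniform(18, 85, n)
motor_score   = rng.integers(1, 7, n)     # GCS motor component, 1..6
lesion_volume = rng.gamma(2.0, 5.0, n)    # mL, as if from an automatic lesion segmentation

# Synthetic outcome: larger lesions, older age, and lower motor scores raise the risk.
logit = -2.0 + 0.03 * age - 0.5 * motor_score + 0.08 * lesion_volume
unfavorable = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X = np.column_stack([age, motor_score, lesion_volume])
model = LogisticRegression(max_iter=1000).fit(X, unfavorable)
print(dict(zip(["age", "motor_score", "lesion_volume"], model.coef_[0].round(3))))
```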