通常,通过比较使用不同算法获得的社区的评估度量值来评估社区检测算法。用于衡量社区质量的评估指标结合了实体的拓扑信息,例如社区内部或外部节点的连接性。但是,在比较度量值的同时,它失去了社区拓扑信息在比较过程中的直接参与。在本文中,提出了一种直接比较方法,直接比较了两种算法获得的社区的拓扑信息。质量度量是基于社区拓扑信息的直接比较而设计的。考虑到新设计的质量度量,开发了两个排名方案。研究了八种广泛使用的现实世界数据集和六种社区检测算法的拟议质量指标以及排名方案的功效。
translated by 谷歌翻译
Community detection in Social Networks is associated with finding and grouping the most similar nodes inherent in the network. These similar nodes are identified by computing tie strength. Stronger ties indicates higher proximity shared by connected node pairs. This work is motivated by Granovetter's argument that suggests that strong ties lies within densely connected nodes and the theory that community cores in real-world networks are densely connected. In this paper, we have introduced a novel method called \emph{Disjoint Community detection using Cascades (DCC)} which demonstrates the effectiveness of a new local density based tie strength measure on detecting communities. Here, tie strength is utilized to decide the paths followed for propagating information. The idea is to crawl through the tuple information of cascades towards the community core guided by increasing tie strength. Considering the cascade generation step, a novel preferential membership method has been developed to assign community labels to unassigned nodes. The efficacy of $DCC$ has been analyzed based on quality and accuracy on several real-world datasets and baseline community detection algorithms.
translated by 谷歌翻译
Online Social Networks have embarked on the importance of connection strength measures which has a broad array of applications such as, analyzing diffusion behaviors, community detection, link predictions, recommender systems. Though there are some existing connection strength measures, the density that a connection shares with it's neighbors and the directionality aspect has not received much attention. In this paper, we have proposed an asymmetric edge similarity measure namely, Neighborhood Density-based Edge Similarity (NDES) which provides a fundamental support to derive the strength of connection. The time complexity of NDES is $O(nk^2)$. An application of NDES for community detection in social network is shown. We have considered a similarity based community detection technique and substituted its similarity measure with NDES. The performance of NDES is evaluated on several small real-world datasets in terms of the effectiveness in detecting communities and compared with three widely used similarity measures. Empirical results show NDES enables detecting comparatively better communities both in terms of accuracy and quality.
translated by 谷歌翻译
Nature-inspired optimization Algorithms (NIOAs) are nowadays a popular choice for community detection in social networks. Community detection problem in social network is treated as optimization problem, where the objective is to either maximize the connection within the community or minimize connections between the communities. To apply NIOAs, either of the two, or both objectives are explored. Since NIOAs mostly exploit randomness in their strategies, it is necessary to analyze their performance for specific applications. In this paper, NIOAs are analyzed on the community detection problem. A direct comparison approach is followed to perform pairwise comparison of NIOAs. The performance is measured in terms of five scores designed based on prasatul matrix and also with average isolability. Three widely used real-world social networks and four NIOAs are considered for analyzing the quality of communities generated by NIOAs.
translated by 谷歌翻译
The performance of individual evolutionary optimization algorithms is mostly measured in terms of statistics such as mean, median and standard deviation etc., computed over the best solutions obtained with few trails of the algorithm. To compare the performance of two algorithms, the values of these statistics are compared instead of comparing the solutions directly. This kind of comparison lacks direct comparison of solutions obtained with different algorithms. For instance, the comparison of best solutions (or worst solution) of two algorithms simply not possible. Moreover, ranking of algorithms is mostly done in terms of solution quality only, despite the fact that the convergence of algorithm is also an important factor. In this paper, a direct comparison approach is proposed to analyze the performance of evolutionary optimization algorithms. A direct comparison matrix called \emph{Prasatul Matrix} is prepared, which accounts direct comparison outcome of best solutions obtained with two algorithms for a specific number of trials. Five different performance measures are designed based on the prasatul matrix to evaluate the performance of algorithms in terms of Optimality and Comparability of solutions. These scores are utilized to develop a score-driven approach for comparing performance of multiple algorithms as well as for ranking both in the grounds of solution quality and convergence analysis. Proposed approach is analyzed with six evolutionary optimization algorithms on 25 benchmark functions. A non-parametric statistical analysis, namely Wilcoxon paired sum-rank test is also performed to verify the outcomes of proposed direct comparison approach.
translated by 谷歌翻译
Information diffusion in Online Social Networks is a new and crucial problem in social network analysis field and requires significant research attention. Efficient diffusion of information are of critical importance in diverse situations such as; pandemic prevention, advertising, marketing etc. Although several mathematical models have been developed till date, but previous works lacked systematic analysis and exploration of the influence of neighborhood for information diffusion. In this paper, we have proposed Common Neighborhood Strategy (CNS) algorithm for information diffusion that demonstrates the role of common neighborhood in information propagation throughout the network. The performance of CNS algorithm is evaluated on several real-world datasets in terms of diffusion speed and diffusion outspread and compared with several widely used information diffusion models. Empirical results show CNS algorithm enables better information diffusion both in terms of diffusion speed and diffusion outspread.
translated by 谷歌翻译
复杂的网络是代表现实生活系统的图形,这些系统表现出独特的特征,这些特征在纯粹的常规或完全随机的图中未发现。由于基础过程的复杂性,对此类系统的研究至关重要,但具有挑战性。然而,由于大量网络数据的可用性,近几十年来,这项任务变得更加容易。复杂网络中的链接预测旨在估计网络中缺少两个节点之间的链接的可能性。由于数据收集的不完美或仅仅是因为它们尚未出现,因此可能会缺少链接。发现网络数据中实体之间的新关系吸引了研究人员在社会学,计算机科学,物理学和生物学等各个领域的关注。大多数现有研究的重点是无向复杂网络中的链接预测。但是,并非所有现实生活中的系统都可以忠实地表示为无向网络。当使用链接预测算法时,通常会做出这种简化的假设,但不可避免地会导致有关节点之间关系和预测性能中降解的信息的丢失。本文介绍了针对有向网络的明确设计的链接预测方法。它基于相似性范式,该范式最近已证明在无向网络中成功。提出的算法通过在相似性和受欢迎程度上将其建模为不对称性来处理节点关系中的不对称性。鉴于观察到的网络拓扑结构,该算法将隐藏的相似性近似为最短路径距离,并使用边缘权重捕获并取消链接的不对称性和节点的受欢迎程度。在现实生活中评估了所提出的方法,实验结果证明了其在预测各种网络数据类型和大小的丢失链接方面的有效性。
translated by 谷歌翻译
隐藏的社区是最近提出的一个有用的概念,用于社交网络分析。为了处理网络规模的快速增长,在这项工作中,我们从本地角度探讨了隐藏社区的检测,并提出了一种在从原始网络采样的子程目上迭代地检测和提升每个层的新方法。我们首先将根据我们修改的本地频谱方法从单个种子节点展开种子集,并检测初始占主导地位的本地社区。然后,我们暂时删除该社区的成员以及它们与其他节点的连接,并检测剩余子图中的所有邻居社区,包括一些“破坏社区”,该部分仅包含原始网络中的一部分成员。当地社区和邻里社区形成了一个主导层,通过减少这些社区内的边缘权重,我们削弱了这一层的结构来揭示隐藏的层。最终,我们重复整个过程,并且可以迭代地检测并升级包含种子节点的所有社区。理论上我们展示了我们的方法可以避免破碎的社区和当地社区被认为是子图中的一个社区的某些情况,导致对全球隐藏的社区检测方法可能引起的检测不准确。广泛的实验表明,我们的方法可以显着优于为全球隐藏社区检测或多个本地社区检测设计的最先进的基线。
translated by 谷歌翻译
Clustering is a fundamental problem in network analysis that finds closely connected groups of nodes and separates them from other nodes in the graph, while link prediction is to predict whether two nodes in a network are likely to have a link. The definition of both naturally determines that clustering must play a positive role in obtaining accurate link prediction tasks. Yet researchers have long ignored or used inappropriate ways to undermine this positive relationship. In this article, We construct a simple but efficient clustering-driven link prediction framework(ClusterLP), with the goal of directly exploiting the cluster structures to obtain connections between nodes as accurately as possible in both undirected graphs and directed graphs. Specifically, we propose that it is easier to establish links between nodes with similar representation vectors and cluster tendencies in undirected graphs, while nodes in a directed graphs can more easily point to nodes similar to their representation vectors and have greater influence in their own cluster. We customized the implementation of ClusterLP for undirected and directed graphs, respectively, and the experimental results using multiple real-world networks on the link prediction task showed that our models is highly competitive with existing baseline models. The code implementation of ClusterLP and baselines we use are available at https://github.com/ZINUX1998/ClusterLP.
translated by 谷歌翻译
能够捕获与特征向量的时间序列的特征是具有多种应用的非常重要的任务,例如分类,聚类或预测。通常,该特征是从线性和非线性时间序列测量获得的特征,其可能存在若干数据相关的缺点。在这项工作中,我们将NetF介绍作为替代特征,包括时间序列的不同复杂网络映射的几种代表性拓扑测量。我们的方法不需要数据预处理,并且无论任何数据特征如何,都适用。探索我们的新颖特征向量,我们能够将映射的网络功能连接到多样化的时间序列模型中固有的属性,显示NetF可以有用的时间数据。此外,我们还展示了我们在聚类合成和基准时间序列组中的方法的适用性,比较其具有更多传统功能的性能,展示了Netf如何实现高精度集群。我们的结果非常有前途,具有来自不同映射方法的网络特征,捕获时间序列的不同属性,将不同且丰富的功能设置为文献。
translated by 谷歌翻译
临床记录经常包括对患者特征的评估,其中可能包括完成各种问卷。这些问卷提供了有关患者当前健康状况的各种观点。捕获这些观点给出的异质性不仅至关重要,而且对开发具有成本效益的技术的临床表型技术的需求增长。填写许多问卷可能是患者的压力,因此昂贵。在这项工作中,我们提出了钴 - 一种基于成本的层选择器模型,用于使用社区检测方法检测表型。我们的目标是最大程度地减少用于构建这些表型的功能的数量,同时保持其质量。我们使用来自慢性耳鸣患者的问卷数据测试我们的模型,并在多层网络结构中代表数据。然后,通过使用基线特征(年龄,性别和治疗前数据)以及确定的表型作为特征来评估该模型。对于某些治疗后变量,使用来自钴的表型作为特征的预测因素优于使用传统聚类方法检测到的表型的预测因素。此外,与仅接受基线特征训练的预测因子相比,使用表型数据预测治疗后数据被证明是有益的。
translated by 谷歌翻译
由能够连接和交换消息的越来越多的移动设备而激励,我们提出了一种旨在模拟和分析网络中节点移动性的方法。我们注意到文献中的许多现有解决方案依赖于直接在节点联系人图表上计算的拓扑测量,旨在捕获节点在有利于原型设计,设计和部署移动网络的连接和移动模式方面的重要性。但是,每个措施都具有其特异性,并且无法概括最终随时间变化的节点重要性概念。与以前的方法不同,我们的方法基于节点嵌入方法,该方法模型和推出在保留其空间和时间特征的同时在移动性和连接模式中对节点的重要性。我们专注于基于一丝小组会议的案例研究。结果表明,我们的方法提供了提取不同移动性和连接模式的丰富表示,这可能有助于移动网络中的各种应用和服务。
translated by 谷歌翻译
Network structure evolves with time in the real world, and the discovery of changing communities in dynamic networks is an important research topic that poses challenging tasks. Most existing methods assume that no significant change in the network occurs; namely, the difference between adjacent snapshots is slight. However, great change exists in the real world usually. The great change in the network will result in the community detection algorithms are difficulty obtaining valuable information from the previous snapshot, leading to negative transfer for the next time steps. This paper focuses on dynamic community detection with substantial changes by integrating higher-order knowledge from the previous snapshots to aid the subsequent snapshots. Moreover, to improve search efficiency, a higher-order knowledge transfer strategy is designed to determine first-order and higher-order knowledge by detecting the similarity of the adjacency matrix of snapshots. In this way, our proposal can better keep the advantages of previous community detection results and transfer them to the next task. We conduct the experiments on four real-world networks, including the networks with great or minor changes. Experimental results in the low-similarity datasets demonstrate that higher-order knowledge is more valuable than first-order knowledge when the network changes significantly and keeps the advantage even if handling the high-similarity datasets. Our proposal can also guide other dynamic optimization problems with great changes.
translated by 谷歌翻译
许多复杂网络的结构包括其拓扑顶部的边缘方向性和权重。可以无缝考虑这些属性组合的网络分析是可取的。在本文中,我们研究了两个重要的这样的网络分析技术,即中心和聚类。采用信息流基于集群的模型,该模型本身就是在计算中心的信息定理措施时构建。我们的主要捐款包括马尔可夫熵中心的广义模型,灵活地调整节点度,边缘权重和方向的重要性,具有闭合形式的渐近分析。它导致一种新颖的两级图形聚类算法。中心分析有助于推理我们对给定图形的方法的适用性,并确定探索当地社区结构的“查询”节点,从而导致群集聚类机制。熵中心计算由我们的聚类算法摊销,使其计算得高效:与使用马尔可夫熵中心为聚类的先前方法相比,我们的实验表明了多个速度的速度。我们的聚类算法自然地继承了适应边缘方向性的灵活性,以及​​边缘权重和节点度之间的不同解释和相互作用。总的来说,本文不仅具有显着的理论和概念贡献,还转化为实际相关性的文物,产生新的,有效和可扩展的中心计算和图形聚类算法,其有效通过广泛的基准测试进行了验证。
translated by 谷歌翻译
社交网络(SN)是一个由代表它们之间相互作用的群体组成的社会结构。 SNS最近被广泛使用,随后已成为产品推广和信息扩散的合适平台。 SN中的人们直接影响彼此的利益和行为。 SNS中最重要的问题之一是,如果选择将它们作为网络扩散场景的种子节点选择,那么他们可以以级联的方式对网络中的其他节点产生最大影响。有影响力的扩散器是人们,如果他们被选为网络中出版问题中的种子,那么该网络将拥有最多了解该扩散实体的人。这是称为影响最大化(IM)问题的文献中的一个众所周知的问题。尽管已证明这是一个NP完整的问题,并且在多项式时间内没有解决方案,但有人认为它具有子模块化功能的属性,因此可以使用贪婪的算法来解决。提出改善这种复杂性的大多数方法都是基于以下假设:整个图都是可见的。但是,此假设不适合许多真实世界图。进行了这项研究,以扩展使用链接预测技术与伪可见性图的电流最大化方法。为此,将一种称为指数随机图模型(ERGM)的图生成方法用于链接预测。使用斯坦福大学SNAP数据集的数据对所提出的方法进行了测试。根据实验测试,所提出的方法在现实世界图上有效。
translated by 谷歌翻译
社区检测是社会网络分析中最重要而有趣的问题之一。近年来,同时考虑社区检测过程中社交网络的节点的属性和拓扑结构,吸引了许多学者的关注,最近在一些社区检测方法中使用了这一考虑,以增加他们的效率并增强他们的效率寻找有意义和相关社区的表演。但问题是,大多数这些方法都倾向于找到非重叠的社区,而许多现实网络包括在某种程度上经常重叠的社区。为了解决这个问题,在本文中提出了一种称为Mobbo-OCD的进化算法,该算法基于基于多目标生物地理学的优化(BBO),以在同步地考虑中自动查找与节点属性的社交网络中的重叠社区网络中的连接密度和节点属性的相似性。在Mobbo-OCD中,引入称为OLAR的扩展基于轨迹的邻接邻接,以编码和解码重叠的社区。基于OLAR,基于秩的迁移操作员以及新的两相突变策略和新的双点交叉在Mobbo-OCD的演化过程中使用,以有效地将人群引导到进化路径中。为了评估mobbo-ocd的性能,本文提出了一种名为Alpha_Saem的新度量,这是考虑节点属性和链接结构的两个方面,可以评估重叠和非重叠分区的良好。量化评估表明,Mobbo-ocd实现了有利的结果,这些结果非常优于文献中的15个相关群落检测算法的结果。
translated by 谷歌翻译
Most real-world networks suffer from incompleteness or incorrectness, which is an inherent attribute to real-world datasets. As a consequence, those downstream machine learning tasks in complex network like community detection methods may yield less satisfactory results, i.e., a proper preprocessing measure is required here. To address this issue, in this paper, we design a new community attribute based link prediction strategy HAP and propose a two-step community enhancement algorithm with automatic evolution process based on HAP. This paper aims at providing a community enhancement measure through adding links to clarify ambiguous community structures. The HAP method takes the neighbourhood uncertainty and Shannon entropy to identify boundary nodes, and establishes links by considering the nodes' community attributes and community size at the same time. The experimental results on twelve real-world datasets with ground truth community indicate that the proposed link prediction method outperforms other baseline methods and the enhancement of community follows the expected evolution process.
translated by 谷歌翻译
旨在识别不同网络中的相应节点的网络对齐任务对许多随后的应用程序具有重要意义。不需要标记的锚点链接,无监督的对准方法吸引了越来越多的关注。但是,由现有方法定义的拓扑一致性假设通常是低阶且准确的,因为仅考虑边缘式拓扑模式,这在无监督的环境中尤其有风险。为了重新定位对齐过程从低阶到高阶拓扑一致性的重点,在本文中,我们提出了一个名为HTC的完全无监督的网络对齐框架。提出的高阶拓扑一致性是基于边缘轨道制定的,将其合并到图形卷积网络的信息聚合过程中,以便将对齐一致性转换为节点嵌入的相似性。此外,编码器经过培训为多轨了解,然后进行完善以识别更受信任的锚点链接。通过整合所有不同的一致性顺序,可以全面评估节点对应关系。 {除了合理的理论分析外,所提出方法的优越性还通过广泛的实验评估得到了经验证明。在三对现实世界数据集和两对合成数据集上,我们的HTC始终以最少或可比的时间消耗优于各种各样的无监督和监督方法。由于我们的多轨道感知训练机制,它还表现出对结构噪声的鲁棒性。
translated by 谷歌翻译
A large number of studies on Graph Outlier Detection (GOD) have emerged in recent years due to its wide applications, in which Unsupervised Node Outlier Detection (UNOD) on attributed networks is an important area. UNOD focuses on detecting two kinds of typical outliers in graphs: the structural outlier and the contextual outlier. Most existing works conduct experiments based on datasets with injected outliers. However, we find that the most widely-used outlier injection approach has a serious data leakage issue. By only utilizing such data leakage, a simple approach can achieve state-of-the-art performance in detecting outliers. In addition, we observe that most existing algorithms have a performance drop with varied injection settings. The other major issue is on balanced detection performance between the two types of outliers, which has not been considered by existing studies. In this paper, we analyze the cause of the data leakage issue in depth since the injection approach is a building block to advance UNOD. Moreover, we devise a novel variance-based model to detect structural outliers, which outperforms existing algorithms significantly at different injection settings. On top of this, we propose a new framework, Variance-based Graph Outlier Detection (VGOD), which combines our variance-based model and attribute reconstruction model to detect outliers in a balanced way. Finally, we conduct extensive experiments to demonstrate the effectiveness and efficiency of VGOD. The results on 5 real-world datasets validate that VGOD achieves not only the best performance in detecting outliers but also a balanced detection performance between structural and contextual outliers. Our code is available at https://github.com/goldenNormal/vgod-github.
translated by 谷歌翻译
Community detection is the task of discovering groups of nodes sharing similar patterns within a network. With recent advancements in deep learning, methods utilizing graph representation learning and deep clustering have shown great results in community detection. However, these methods often rely on the topology of networks (i) ignoring important features such as network heterogeneity, temporality, multimodality, and other possibly relevant features. Besides, (ii) the number of communities is not known a priori and is often left to model selection. In addition, (iii) in multimodal networks all nodes are assumed to be symmetrical in their features; while true for homogeneous networks, most of the real-world networks are heterogeneous where feature availability often varies. In this paper, we propose a novel framework (named MGTCOM) that overcomes the above challenges (i)--(iii). MGTCOM identifies communities through multimodal feature learning by leveraging a new sampling technique for unsupervised learning of temporal embeddings. Importantly, MGTCOM is an end-to-end framework optimizing network embeddings, communities, and the number of communities in tandem. In order to assess its performance, we carried out an extensive evaluation on a number of multimodal networks. We found out that our method is competitive against state-of-the-art and performs well in inductive inference.
translated by 谷歌翻译