Graphical Models of False Information and Fact Checking Ecosystems

Haiyue Yuan¹ Enes Altuncu² Shujun Li Can Baskent

¹footnotemark: 1

²footnotemark: 2

Abstract

The wide spread of false information online including misinformation and disinformation has become a major problem for our highly digitised and globalised society. A lot of research has been done to better understand different aspects of false information online such as behaviours of different actors and patterns of spreading, and also on better detection and prevention of such information using technical and socio-technical means. One major approach to detect and debunk false information online is to use human fact-checkers, who can be helped by automated tools. Despite a lot of research done, we noticed a significant gap on the lack of conceptual models describing the complicated ecosystems of false information and fact checking. In this paper, we report the first graphical models of such ecosystems, focusing on false information online in multiple contexts, including traditional media outlets and user-generated content. The proposed models cover a wide range of entity types and relationships, and can be a new useful tool for researchers and practitioners to study false information online and the effects of fact checking.

\affiliation

[iCSS]organization=Institute of Cyber Security for Society (iCSS) & School of Computing, University of Kent, country=UK \affiliation[MiddlesexCS]organization=Department of Computer Science, Middlesex University, country=UK ^{equal_authorship}^{equal_authorship}footnotetext: Equal contribution

1 Introduction

In today’s highly digitised and globalised society, wide propagation of false information has become a more serious problem than ever. Different types of false information, including misinformation and disinformation (Howard-P2021; Bin-G2020; Wardle-C2017), are the sources of different forms of online harms on various online platforms such as social media, instant messaging applications, web forums, online chat rooms, and online news portals. Many researchers have studied different aspects of false information online, which include how different actors behave when facing false information, how false information is propagated, and how it can be better detected and mitigated via technical and socio-technical means (guo2020; kumar2018).

The use of human fact-checkers is a common way of detecting and debunking false information online. Many fact-checking organisations and networks have been established, such as Full Fact in the UK³³3https://fullfact.org/ and PolitiFact in the US⁴⁴4https://www.politifact.com/. Although many researchers have investigated fact checking (nakov2021; juneja2022), we noticed a lack of conceptual models describing the complicated ecosystems of false information online and the fact-checking community. Therefore, in this work, we propose the first graphical models of the above-mentioned ecosystems, focusing on false information online in multiple contexts, including traditional media outlets and user-generated content. The proposed models contain a wide range of entity types and relationships, and we give some example real-world scenarios to show the usefulness of the proposed models.

The rest of the paper is organised as follows. Section 2 presents related work on false information and fact-checking. Section 3 describes the proposed graphical models for the false information and fact-checking ecosystems. Then, Section 4 demonstrates how the graphical models can be used to describe different entities and relationships in a couple of real-world scenarios. Further discussions and future work are given in Section 5 and the last section concludes this paper.

2 Related Work

2.1 False Information Definition

Research has been conducted to study and classify false information to better understand its definition and scope. One popular approach is to categorise false information into misinformation and disinformation (Howard-P2021; Bin-G2020; Wardle-C2017), where the discrimination between the two are subtle (Stahl-BC2006). They can be distinguished by its ‘intention’ (Bin-G2020). Thereby, misinformation can be interpreted as the false information that is without the intention to mislead (Scheufele-DA2019; Kumar-K2014), e.g., the information is incorrect, possibly by accident. Contrarily, disinformation is to intentionally mislead the audience for some purposes via different approaches such as deceptive advertising, government propaganda, doctored photographs, and etc (Fallis-D2015). More specifically, European Commission proposed its definition of disinformation in one of their reports, as ‘all forms of false, inaccurate, or misleading information designed, presented and promoted to intentionally cause public harm or for profit’ (HLEG_Report2018).

In addition, Bailey-TC2019 suggested an additional category of false information, malinformation, that is intentionally misleading and used to inflict harm. Nevertheless, Wardle-C2017 argued that malinformation is used to describe the scenario that when genuine information is shared to harm rather than serve the public interest, such as ‘true information that violates a person’s privacy without public interest justification’ (Wardle-C2018).

Moreover, researchers have attempted to categorise false information from real life scenarios, aiming to have a better understanding of the actors, the motivations, and how false information is spread. Zannettou-S2019 categorised false information on the web into eight types including fabricated, propaganda, conspiracy theories, hoaxes, biased or one-sided, rumors, clickbait, and satire news. In addition, they identified a number of false information actors (e.g., bots, criminal/terrorist organisations, governments, activist or political organisations, journalists, etc.) with different motives (e.g., malicious intent, influence, profit, passion, etc.).

Similarly, Meel-P2020 suggested that the false information can be considered as information pollution with a number of identified motivations (e.g., political intent, financial profit, passion, increase customer base, etc.), and it has many formats, including rumor, fake news, misinformation, disinformation, clickbait, hoax, satire/parody, opinion spam, propaganda, conspiracy theories.

To better explain the complexity of false information categorisation, Meel-P2020 argued that all the categories of false information are sharing some heterogeneity. One example is fake news, which is overlapping with satire, misinformation, rumor, disinformation, and opinion spam. Considering the dynamic context of fake news and its broad scope, Molina-D2021 further suggested that fake news is beyond the spectrum of false information and concluded that fake news should be considered as an umbrella term with seven categories including false information(false news), polarized content, satire, misreporting, commentary, persuasive information, and citizen journalism.

2.2 Fact-checking

Due to the negative impact of false information/fake news on different aspects of the society at a global scale, academia, industry, international and national organisations have made significant effort to act against the generation and the spread of false information. On the international level, the European Commission launched the major counter-disinformation initiatives in 2018 (EU_Report2018a; Renda-A2018), set up a high-level group of experts (HLEG) to advise on relevant policy initiatives, and introduced a Eurobarometer survey and self-regulatory Code of Practices for the big social platforms (EU_Report2018b).

There are many different fact-checking outlets in Europe, and all share the same goal of promoting truth to the public. Graves-L2016 studied fact-checking sites in Europe and suggested that fact-checkers in Europe can be categorised into three types: 1) reporter normally considers the fact-checking mission in journalistic terms to inform the public; 2) reformer primarily uses fact-checking as part of an agenda of political reform; 3) expert is more or less considered as a think tank.

In addition, Graves-L2016 argued that there are two main fact-checking models: 1) the newsroom model is associated with established media companies, where the legacy news media remain the main source of political fact-checking, such as ‘FactCheck’ program from Channel 4 News, ‘Reality Check blog’ from Guardian; 2) the non government organisation (NGO) model represents fact-checking outlets operate independently or backed by NGO or charity such as ‘FullFact’⁵⁵5https://fullfact.org/ from UK and ‘Teyit.org’⁶⁶6https://teyit.org/ from Turkey.

World widely, the international fact-checking network (IFCN)⁷⁷7https://www.poynter.org/ifcn/ at Poynter⁸⁸8The Poynter Institute for Media Studies is a non-profit journalism school and research organisation in US https://www.poynter.org/ has a growing community of fact-checking/fact-checker organisations to advocate of factual information in the fight against misinformation/disinformation/fake news, where verified signatories of the IFCN includes ‘PolitiFact’⁹⁹9https://www.politifact.com/ and ‘FactCheck’¹⁰¹⁰10https://www.factcheck.org/ from United States (US), ‘FullFact’¹¹¹¹11https://fullfact.org/ from United Kingdom (UK), ‘20 Minutes Fake Off’¹²¹²12https://www.20minutes.fr/ from France, ‘Maldita’¹³¹³13https://maldita.es/ from Spain, ‘Africa Check’¹⁴¹⁴14https://africacheck.org/fact-checks from South Africa, ‘Annie Lab’¹⁵¹⁵15https://annielab.org/ from Hong Kong, ‘Australian Associated Press’¹⁶¹⁶16https://www.aap.com.au/ from Australia, ‘Teyit’¹⁷¹⁷17https://teyit.org/ from Turkey, ‘BOOM’¹⁸¹⁸18https://www.boomlive.in/ from India, and etc¹⁹¹⁹192022 full list can be found from https://ifcncodeofprinciples.poynter.org/signatories.

By reviewing the fact-check and false information/fake news detection methodologies, Collins-B2021 classified several common approaches: experts/professionals fact-checker approach, crowd-sourced approach, machine learning approach, natural language processing technique, hybrid technique, expert-crowdsource approach, human-machine approach, graph-based method, deep learning approach, and recommendation system approach. Based on these different approaches, researchers have also developed a number of fact-checking tools such as TweetCred (Gupta-A2014), Hoaxy (Shao-C2016), ClaimBuster (Hassan-N2017), Fake News Tracker (Shu-K2019), NudgeCred (Bhuiyan-M2021), and etc. In addition to the effort from scholars, players from tech industry has also joined the battle of false information/fake news and developed a number of tools, such as the ‘Fact Check Explorer’²⁰²⁰20https://toolbox.google.com/factcheck/explorer developed by Google, ‘Third-party fact-checking program’²¹²¹21https://www.facebook.com/journalismproject/programs/third-party-fact-checking by Meta, Twitter’s Birdwatch platform²²²²22https://twitter.github.io/birdwatch/, AI powered fact-checking platforms and services provided by Logically²³²³23https://www.logically.ai and CheckStep²⁴²⁴24https://www.checkstep.com, etc.

2.3 False Information and Fact-checking Ecosystems

Despite a plethora of research on false information and fact-checking/false information/fake news detection, research on the relationship between fact-checking (fact-checker) and false information, and the associated ecosystems are overlooked. Jiang-S2018 investigated the linguistic signals expressed in social media comments in the presence of misinformation and fact-checking, and found that the fact-checking can have positive effects on linguistic signals but can also cause potential ‘backfire’ effects.

Burel-G2020 studied the relation between misinformation and fact-checking information/report spread during the Covid-19 pandemic. They observed the similarity in how both information spread in social media and discovered that fact-checking information has a positive effect on combat misinformation, but with a short span due to the overwhelming amount of misinformation spread compared with the amount of fact-checking information.

Nicole-K2020 considered fact-checking for Covid-19 misinformation as risk communication, and identified that fact-check Covid-19 misinformation will be difficult tasks given the complicated relationship and varying levels of trust. Furthermore, Yang-A2021 looked at the relationship between fact-checkers and spreaders of Covid-19 vaccine misinformation on Facebook. They revealed that fact-checkers’ posts can received more comments, however, misinformation spreaders can more easily use the URL co-sharing network to coordinate communication strategies.

2.4 Research Gaps

To best of our knowledge, we are not aware of any existing works focus on understanding and exploring the false information and fact-checking ecosystem using graphical models or similar techniques. The main aim of this paper is to fill this gap by proposing a conceptual ecosystem for false information and fact-checking using a graphical model to discover and represent the complex relationship among different actors (e.g., media outlets, fact-checkers, false information spreaders, etc.) of the ecosystem.

3 Modelling the False Information and Fact Checking Ecosystems

The ubiquitous smart devices and fast developing social media platforms enable everyone to publish, consume, and comment digital information presented in different formats (i.e., text, image, audio, video). In addition, the legacy/traditional media outlets expand their services to the internet and social media platforms on top of their existing channels such as TV, radio, and newspaper, aiming to reach a wider audience. However, such development and revolution make easier to spread false information/fake news, but harder to detect and fact-check.

To study the scale and complexity, we propose two graphical models to describe the false information and fact checking ecosystem, aiming to exploring the complex relationships between different actors, as well as empowering users with knowledge and higher level of conceptualisation to better understand and model real-world scenarios. As shown in Figure 1, the graphical models focus on two different contexts where false information can be sourced: 1) traditional media outlets; and 2) user-generated content.

3.1 The Graphical Model for Traditional Media Outlets

The graphical model we introduce for traditional media outlets fact-checking are the foundational abstractions for several other potential formal models. First, it is the graphical depictions of multi-agent systems that are commonly used in AI. Second, it visualises a methodology to approach the issue of fact-checking from a “knowledge representation and modelling” perspective. Fact-checking is an interesting “laboratory” for knowledge representation and epistemic logic (that is the logic of knowledge) and the current model is a stepping stone to form a full epistemic model of the phenomenon.

The graphical model can be formalised as a directed graph as shown in Figure 0(a), representing different types of entities and their relationships. The graph can be denoted as $G = (V, E)$ , where $V = {V_{i}}_{i = 1}^{N}$ and $E = {E_{j}}_{j = 1}^{M}$ represent a set of N nodes and a set of M edges, respectively. Each node $V_{i}$ represents a type of entities that is depicted by a rounded corner rectangle in the proposed graph model, and each edge $E_{j}$ and the associated textual label describe a semantic relationship between the connected two entity types.

Round corner rectangles are used to represent entities, where entity types coloured in yellow representing different types of information (e.g., news, report), and entity types coloured in green representing document based entities and other tools/datasets/resources based entities.

All entity attributes are depicted by light blue coloured ellipses. It is worth noting that the reason to single out these specific attributes is they are relevant and essential to better describe the context of different semantic relationships between entity types, and thereby to facilitate explaining and modelling the ecosystem. However, the inclusion of these attributes does not mean to discard or ignore other attributes that are not explicitly shown in the Figure 0(a). More different attributes can be added to different entity types to enhance the compatibility and scalability of the graphical model for modelling more specific scenarios under specific contexts.

Moreover, solid lines with textual labels refer to semantic relationship between two entity types, where ‘fact-checked’ and ‘regulates’ are coloured in blue and red, respectively, to emphasise the important ‘fact-checking’ side of the ecosystem as well as offering better readability. It is worth noting that we use verbs in different tenses for different textual labels to capture the importance of time in a semantic relationship.

3.1.1 Entity type

In this model, $N = 17$ different entity types are proposed with the following descriptions:

Law (L)

refers to an official rule made by a government or some other authority.

Regulation (RL)

corresponds to rules that implements principles of an L, bringing an L into effect (Kosti-N20199).

Comments (C)

represents a person’s opinion, feedback, or other information in response to published news.

Fact-check report (FCR)

refers to fact-checking results, which can be published by a fact checker or a fact-checking outlet in different formats such as a report, a blog, a news item, etc.

Person (P)

refers to a natural people in the physical world. For this entity type, we highlighted two specific attributes as we believe they are more relevant and also essential to better explain and model the ecosystem. Attribute is journalist is added to the P entity, as a journalist is an integral part of traditional media outlets. In addition, normally it is the professional conduct for a journalist to fact-check the content/news, and even a person is not a professional journalist, we believe there are still some moral standards to fact-check the content/news to some extent. Regardless of whether a person is a journalist or not, he/she can be a fact-checker, hence the attribute fact-checking is attached to a P entity.

Journalist association (JA)

refers to journalism organisation or journalist union that dedicates to encourage responsible reporting and ethical behaviour. e.g., British Association of Journalists²⁵²⁵25https://bajunion.org.uk/ and National Union of Journalists²⁶²⁶26https://www.nuj.org.uk in the UK, and Society of Professional Journalists²⁷²⁷27https://www.spj.org/ in the US.

News draft (ND)

stands for news content produced by a person/journalist that has not been published.

News (N)

represents news published by a media outlet. Both N and ND can be physical or digital assets in different formats, such as text, image, video, and audio.

Media outlet (MO)

refers to traditional broadcasting/press channel that can provide information in different formats to the public by different paths including newspaper, magazines, television, radio, etc. (Examples include The Telegraph²⁸²⁸28https://www.telegraph.co.uk, British Broadcasting Corporation (BBC)²⁹²⁹29https://www.telegraph.co.uk from the UK and The Cable News Network (CNN)³⁰³⁰30https://edition.cnn.com/ from the US).

Media outlet association (MOA)

refers to an organisation that serves the shared interests of media outlets to protect general interests of its members in different aspects such as political, regulatory, and legal matters. (Examples include News Media Association³¹³¹31http://www.newsmediauk.org in the UK and News Media Alliance³²³²32https://www.newsmediaalliance.org/ in the US).

Fact-checking outlet (FO)

is an organisation dedicated to perform fact-check or offer fact-checking services. We have listed few examples such as ‘FullFact’ in the UK and ‘PoliFact’ from the USA in Section 2.

Fact-checking association (FA)

refers to an association that have members of fact-checking outlets which share the same commitment and purposes to promote fact-checking services/duties.

Organisation (O)

refers to other organisations that are related to this graphical model, but have not been explicitly represented as a entity type in the graphical model. Here, attribute news reporting is added to the O entity to state the fact that although many organisations are not media outlets, they regularly publish stories, report on their websites or via social media platforms.

Region/Country/Local (RCL)

represents different level of governmental entities. ‘Region’ refers international or multi-national entities that are beyond a singular country, such as the European Union, whereas ‘local’ refers to local administrative areas in a country, such as ‘county’ in the UK and ‘state’ in the US.

Regulator (R)

refers to the regulator that regulates other services (e.g., Ofcom³³³³33https://www.ofcom.org.uk/ in the UK). As some regulators have the capability and/or responsibility to fact-check information, fact-checking attribute is added to the entity type.

Standards/Guidelines (SG)

corresponds to rules that can be made by an organisation, or an association, or even an expert in a specific domain, and these rules are set/recommended to be followed.

Services & Resources (SR)

corresponds to the services and resources, such as platforms, tools, and datasets provided by an FA or an FO to the public for fact-checking. This includes periodic newsletters about recent fact checks, databases of previously fact-checked articles, and fact-checking tools (e.g., Logically Browser Extension³⁴³⁴34https://www.logically.ai/products/browser-extension).

3.1.2 Edges

As stated before, an edge with a textual label denotes a semantic relationship between two entity types. As illustrated in Figure 0(a), it is a quite complex graph with many different relationships between entity types. To better explain the graph, we first explore the graph (i.e., mainly on the right hand side) concerning about the entity types with their associated edges that are mainly related to information generation, such as publishing a news article. Then, the focus moves to the fact-checking side of the ecosystem (i.e., mainly on the left hand side of the graph). Next, entity types and their edges (i.e., mainly on top of the graph) that are related to regulate both information generation and fact-checking are further explained. As shown in top right of the Figure 0(a), an SG entity is designed to have many arrows point outwards to simplify the visualisation. Such representation is to reflect that the SG entity has relationships with many other entity types in this model.

Information Generation

As illustrated in Figure 0(a), starting from the P entity, the attribute is journalist indicates that a P can be a professional journalist, who is also likely belong(s/ed) to a JA, or a P can also be an amateur/grass root journalist or anyone. A JA should mandatorily follow (e.g., all registered journalists of the NUJ should act according to the Code of Conduct set by the NUJ), however, this does not prevent other unassociated independent journalists to follow it voluntarily.

In this model, there are two ways (i.e., two published edges that connect the N entity and the MO entity, the N entity and the P entity respectively) to publish an N: 1) a P can create an ND to be reviewed by an MO, and the same MO can release an N using different approaches, such as via a physical newspaper, a TV program, its online website, its social media account, and etc.; 2) a P can also directly publish an N independently via various channels, such as web blogs, personal websites, social media accounts, etc.

Nowadays, many MOs publish Ns on their websites and allow readers to leave comments, this is also applied to the N published by a P independently. In addition, some of the MOs might only publish news in physical format such as newspaper. It is still possible for other readers to leave comments by writing/publishing on a physical newspaper or via online platforms in response to a specific news item. Here, consumed and published relationships between the P entity and the C entity are designed to model such scenarios.

Fact-checking

A P entity with a true value of the attribute fact-checking indicates that a P has the ability to perform fact-checking, which suggests he/she can be an independent fact-checker. In addition, P entity belong(s/ed) to an FO entity, which itself is specialised at fact-checking. Both independent fact-checker (as a P) and an FO can publish an FCR as a result of fact-checking. The published/revised relationship between an MO entity and an N entity, and between a P entity and an N entity mean that both P and MO can revise an N, which could be caused by a number of reasons (e.g., one reason could be that the N was fact-checked and need updates). In addition, since both P entity and FO entity have belong(s/ed) to relationships to an FOA entity, an SG set by an FOA should guide its associated P and FO to perform fact-checking. For example, IFCN as an FOA sets the Code of Principles, which should be followed by all verified signatories of the IFCN.

Not only a fact-checker or an FO can fact-check information, an MO can also perform fact-checking such as the ‘FactCheck’ program from the media outlet ‘Channel 4 News’, and the ‘Reality Check’ program from ‘BBC’. Such fact-checking programs normally have advantages to reach wider audiences which exceed the reach of many independent fact-checkers/fact-checking outlets (Graves-L2016).

It is worth noting that an SR entity connects an FOA entity, an FO entity, a P entity, and an MO entity via created and/or consumed edges, representing that the FOA and the FO can create a number of different SR in different formats, such as platforms, tools, and datasets to share with the fact-checking community. Hence, other FOAs, FOs, P, and MOs can use the SR to cross reference, fact-check, and conduct more analysis.

Moreover, a JA entity, an FO entity, and an MO entity can all have the belong(s/ed) to relationship with both O entity and RCL entity, meaning that a JA, an FO, and an MO can be either fully/partial funded/supported by a RCL level-government or fully/partial funded/supported by an O, which can be either owned privately or by a RCL. This could raise an interesting phenomenon, where the information published and fact-checked could be still biased to some extent, when the JA, the FO, and the MO are all funded/controlled by the same party. Differently, if the JA, the FO, and the MO are funded/controlled by different parties, it could result different scenarios with more complex relationships such as competition, collaboration, coalition, which deserves further analysis.

Regulating Information Generation and Fact-checking

Different regulations and/or laws need regulators to implement in different contexts. This has been modelled in the graph as the implements relationships between an RL entity and an R entity, and between an L entity and an R entity. As shown as red edges in Figure 0(a), an R entity regulates a number of entity types including an MOA entity, an MO entity, an O entity, an FO entity, and a JA entity. For example, the Charity Commission³⁵³⁵35https://www.gov.uk/government/organisations/charity-commission (i.e., the R entity in the model) regulates charity organisations such as fact-checking outlet ‘FullFact’ (i.e., the FO entity in the model), and its powers and procedures are set out in the Charities Act 2011 (i.e., the L entity in the model). In addition, the Charity Commission also publish a number of guidance (i.e., the SR entity in the model), e.g., ‘Trustee role and board’, ‘Money, tax and accounts’, to help set up and run a charity organisation.

3.2 The Graphical Model for User-generated Content

Similar to the model reported in Section 3.1, a directed graph as shown in Figure 0(b) can be denoted as $G_{b} = (V_{b}, E_{b})$ , where $V_{b} = {V_{b}_{i}}_{i = 1}^{N}$ and $E_{b} = {E_{b}_{j}}_{j = 1}^{M}$ represent a set of N nodes and a set of M edges respectively. We will refer this model as graphical model B. There are $N = 12$ entity types, 6 of them including RCL, R, O, P, F, and FA have been explained in Section 3.1, and the rest of the entity types for this model are described below.

User generated content (UGC) is the content user produced and published to public or privately within in a social group/online group via his/her online account.
Account (AC) refers to online accounts, which are ‘virtual identities’ for online services (e.g., Google³⁶³⁶36www.google.com account, Facebook³⁷³⁷37www.facebook.com account, etc.). Two attributes are added to this entity type to better describe the context and model the behaviours. As some people have the capability to fact check the content published by his/her own account or other accounts, fact checking is added to the AC entity. In addition, people can also create fake account for whatsoever reasons, attribute ‘is false’ is added here to reflect this matter.
Service (S) refers to different online services (e.g., Google, Facebook, Instagram³⁸³⁸38www.instagram.com, etc.).
Service provider (SP) represents organisations that relate to one or more services.
Social group (SG) refers to a physical group of people that are socially connected.
Online group (OG) represents virtual groups of online accounts that are hosted by specific online services.

Similar to the approach to describe graphical model A, we will start explaining graphical model B from P entity and further expand to explore other edges between different entity types. As shown in Figure 0(b), P entity owns and uses AC entity to creates UGC entity, which is the only way for a person to publish UCG, where UCG can be in various formats depending on the S entity that the AC entity belongs to, e.g., UCG can be: Tweets published/retweeted via Twitter³⁹³⁹39https://twitter.com/, product review by a consumer via Amazon⁴⁰⁴⁰40https://www.amazon.com, video published at YouTube⁴¹⁴¹41https://www.youtube.com, messages via WhatsApp⁴²⁴²42https://www.whatsapp.com and WeChat⁴³⁴³43https://www.wechat.com, etc.

The consumes and comments relationships between AC entity and UCG entity denote that online accounts of the same service can interact with each other, meaning that an online account can read/view and leave comments/feedback to UCG created by him/herself or other online accounts. In addition, the attribute fact checking of AC entity suggests that an online account can fact check his/her own UCG or UCG created by other online accounts. The attribute fact checking is also attached to P entity, denoting that a person without an account could also do fact check on the UGC that is created for public access.

The inclusion of the attribute is false of AC entity is to model scenarios that a person creates a fake account to spread misinformation/disinformation/malformation to achieve different purposes (e.g., fraud, online grooming, terrorism, etc.). However, this does not suggest that all UCG created by fake accounts have malicious intention. Some of them might just want to protect their privacy or for the convenience (e.g., create an fake account to play games).

Moreover, the creates, manages, and belongs to attributes linking AC entity and OG entity describe the scenarios of an online account can create a online virtual group and other online accounts belong to this group can also manage the group. The similar mapping relationship in the physical world is the belongs to relationship between P entity and SG entity. Here, such edges could have impact on the type of UCG published. For instance, if a person belongs to either an OG or a SG that its members share the same religion or political view, the UGC created might need to fit into the SG/OG’s agenda or at least have the common interests. Similarly, O entity would have the same effect on P entity to influence the UGC created.

The development of the internet technologies have resulted a plethora of services (S)/platforms and service providers (SP) allowing everyone to easily create and share UGC such as their stories, videos, pictures, etc. This facilitate the generation of new types of media outlets that are different from the tradition media outlets. We Media is the most representative case. This term was firstly introduced by (Bowman-S2003), and refers to news generated by grass-root media outlets.

We Media can grow into large and influential media outlets (such as celebrity’s social media, YouTube influencer, etc.) in terms of subscribers, but their operation remains largely around a single individual. If we ought to use the proposed graphical model B to model its ecosystem, a We Media can be represented as a P entity and an AC entity. Then the relationship between other entity types (i.e., O, SG, OG, S, SP, and RCL) with P and AC entity can be further explored to model different incentives (such as financial, political, religion, etc.) of We Medias. Since We Media operators are not trained journalists, their ethical standards and practical knowledge are likely much lower and some can behave unethically to pursue monetary gains only. Hence, certain level of regulating and fact checking are needed, similar to the graphic model A, R entity and FO entity and FOA entity with a number of regulates and fact check edges as shown in Figure 0(b) can have their impact on the We media ecosystem.

3.3 Additional remarks

As shown in Figure 1, one entity type can have same semantic relationships with multiple entity types for both models. For instance, in Figure 0(a), MO, a FO, a P can all fact check N. In such situation, the competitive or collaborative fact-checking relationship from multiple entities must be present. To get more insights about such phenomenons, it would require to further investigate the other entities types with their edges connecting to them.

Such snowballing effect can lead to interesting observations and findings that would require to use other theories such as game theory to offer better elaboration. One example would be the game between regulators and news outlets and fact checkers, it would consider not just financial benefits and reputation gains, but also legal costs. Note that regulators may also act illegally and can pay legal costs if their wrong decisions are challenged by large media outlets or fact checkers.

In addition, as shown in Figure 1, entity types colored in light green are used in both parts of the graphical model. They are also the anchor points to link and merge the two models in to one. The advantage of producing one overall graphic model is to offer a more straight forward way to compare traditional media outlets with new forms of media outlets such as We Media in terms of the incentives, business models, fact-checking practices, and the competitions/collaborations between the two. As the model grows, the relationships between entity type can become more complex, which allows modelling a wider range of scenarios such as comparing the traditional media outlets with UCG based media outlets.

4 Modelling Real-World Scenarios

In this section, we show how the proposed graphical model can be used to study relevant real-world scenarios. All diagrams of the five presented example real-world scenarios are produced following the same colour scheme used for the graphical model as depicted in Figure 1. The example scenarios are selected to cover a representative subset of the overall graphical model, representing a sub-ecosystem.

4.1 BBC Breakfast Incident

On 25th February, 2022, BBC Breakfast played a clip of military planes and claimed that they are Russian military flying into Ukraine following the invasion of Ukraine. As shown in Figure 2, Sarah Turnidge from FullFact, who is also a journalist, conducted fact-checking on this video clip and prepared a fact-check report, which was later published as a news release by FullFact. The report stated that the footage from BBC Breakfast was preparations for a military parade near Moscow from May 2020, but not about the invasion as it claimed. As depicted in Figure 2, from the audience perspective, Person B is more likely to spread the misinformation than Person A who first watched the BBC Breakfast show, and then, later read about the fact-check report released by FullFact. One interesting finding as illustrated in the figure is that the video clip broadcast on BBC Breakfast show was originally from a wide spread video clip on Twitter, and which has been fact-checked by different fact-checking outlets: 1) On 24th Feb 2022, Maldita fact-checked the video clip on Twitter and published a news release on its website; 2) On 24th Feb 2022, another journalist, also a fact-checker Abbas Panjwani from Full Fact conducted fact-checking on this video clip, and a corresponding news releases was published by FullFact; 3) On 25th Feb 2022, AFP Fact Check debunked this and released a statement on its Twitter account.

Figure 2: Modelling a real-world scenario that involves BBC and a number of fact checkers

Despite all the fact-checking reports were priorly available, such misinformation was still in the wild. Based on Sarah Tunidge’s fact-checking report (Turnidge-S2022), one spokesperson from BBC confirmed that “The footage was used once briefly in error and the team have been reminded about verifying images”. This suggests that BBC have an internal review team equipped with a certain level of internal review or fact-checking mechanism as demonstrated in Figure 2, yet the video clip with the false claim was broadcast. There might be some reasons and intentions to support this, but it is not our position to speculate. However, if we further extend this sub-graph to bring in other entity types and semantic relationships, more analysis can be done to investigate this incident further.

4.2 Fact-checking Services and Resources

This scenario includes modelling fact-checking services and resources provided by fact-checking organisations, companies, associations, and networks by using the proposed graphical model. Figure 3 exemplifies the scenario with multiple services and resources that a person can benefit for performing fact-checking. Given a news article including a number of suspicious claims, Person A examines if the claims have been previously fact-checked by any fact-checking organisation, e.g., Full Fact, and Snopes, through their fact-checks archives where Person A can search for previous fact-checks. Besides, Person A can review the recent fact-checks via periodical watch newsletters published by IFCN, i.e., Factually Newsletters, which contains the recent fact-checks collected from the verified signatories of IFCN. Nevertheless, Person A can also subscribe to the newsletters published by the IFCN-verified fact-checking organisations directly to receive more country-specific fact-checks. In addition, Person A can also publish comments in correspondence to the news article, as well as to read comments published by others.

Figure 3: Modelling fact-checking services and resources provided by fact-checking outlets

Apart from fact-checking organisations and networks, many companies provide solutions that others can make use of while fact-checking. If the suspicious claims have never fact-checked by a fact-checking organisation, Person A requests verification from Logically’s IFCN-verified fact-checking team through the Logically App⁴⁴⁴⁴44https://www.logically.ai/products/app. Other than the content of the news article, Person A can assess the credibility of the news source by checking the source’s NewsGuard Rating⁴⁵⁴⁵45https://www.newsguardtech.com/solutions/newsguard/, which is the trust rating published by NewsGuard.

4.3 Different Regulators in the UK

In this section, we intend to demonstrate that the sub system for regulation part of the graphical model proposed in Section 3 can be explored and enhanced to reveal the complexity of the regulation part of the ecosystem.

There are many active independent regulators (i.e., this can be considered as a sub-type of Regulator entity) in the UK. Before September 2014, The Press Complaints Commission (PCC) was the voluntary regulator for British printed newspapers and magazines (N.B., we used dotted red lines and past tense ‘regulated’ to indicate such relationship). In this scenario, we list representative media outlets including Telegraph, Daily Mail, The Mail, The Financial Times, The Independent, and The Guardian. However, after September 2014, The Independent Press Standard Organisation (IPSO) replaced PCC as an independent regulator, implementing the Editor Code created by Editor Code Committee, continuing regulating Telegraph, Daily Mail, and The Mail. As show in Figure 4, IPSO is financed by Regulatory Funding Company (RFC), who is funded by Telegraph Media Limited and Associated Newspaper Limited. It is worth noting that, Telegraph Media Limited owns Telegraph and Associated Newspaper Limited owns Daily Mail and The Mail. Such relationships raised some concerns, hence The Financial Times, The Independent, and The Guardian refused to join IPSO. Instead, each of them chose to publish their own standards/guidance to self-regulate as illustrated in Figure 4.

Figure 4: Modelling independent regulators that regulates the media outlets in the UK

4.4 Different Journalist Types

In this scenario, we aim to demonstrate the relationships between different types of journalists and their associations with the proposed graphical model. As shown in Figure 5, for the journalist associations on a national scale, we preferred to cover a number of US-based associations as an example.

For journalists working for a media outlet, i.e., reporters in Figure 5, The Society of Professional Journalists (SPJ)⁴⁶⁴⁶46https://www.spj.org/ and The Radio Television Digital News Association (RTDNA)⁴⁷⁴⁷47https://www.rtdna.org/ are two associations representing journalists. Both associations provide their Code of Ethics as a guideline to their members to follow while performing their work. Besides, they provide several tools and resources to support journalism activities of their members.

Another group of journalists are independent journalists, i.e., freelance journalists, which are professional journalists that does not work for any media outlet, but publish news on their own. For independent journalists, there also exists some communities and organisations that provide support for their journalism activities. SPJ has a community dedicated to independent journalists, called SPJ Freelance Community⁴⁸⁴⁸48https://www.spj.org/freelance.asp, that offers tools and resources to help their members, as well as organises events and programs. Another example is Investigative Reporters and Editors Inc. (IRE)⁴⁹⁴⁹49https://www.ire.org/, which is a grassroots non-profit organisation dedicated to supporting investigative journalism. IRE supports independent journalists through the Freelance Investigative Reporters and Editors (FIRE) project⁵⁰⁵⁰50https://www.firenewsroom.org/, which is a service bureau providing grants and editorial services for freelance journalists. IRE is a member of Global Investigative Journalism Network (GIJN)⁵¹⁵¹51https://gijn.org/, and all IRE members must follow the IRE Principles.

Finally, citizen journalists are another type of journalists, where individuals publish news collaboratively. Wikinews⁵²⁵²52https://www.wikinews.org/ is a project of Wikimedia Foundation, that serves as a platform for citizen journalism, where users can collaboratively publish news stories provided that they follow the Wikinews Policies and Guidelines.

Figure 5: Modelling different journalist types and journalist associations

4.5 Trump’s Twitter Account Suspension Incident

The second scenario is based on a real incident, in which Twitter permanently suspended the account of the ex-President of the US, Donald Trump. Figure 6 shows the graphical model of the relationships between different actors involved in the incident by using our proposed model. On 6 January 2021, Trump tweeted a video message containing some false claims around that the presidential election had been stolen, and calling his supporters to action (time2021capitol). Consequently, Trump’s supporters gathered, and attacked to the US Capitol, which caused the death of five people. As a result, on 8 January 2021, Twitter permanently suspended Trump’s Twitter account due to “the risk of further incitement of violence”, based on its Civic Integrity Policy. And, the US’ communications regulator, The Federal Communications Commission (FCC), did not object to the Twitter’s decision (reuters2021fcc).

Figure 6: A graphic model of Donald Trump’s twitter account and its associated entities

5 Further Discussions and Future Work

The proposed graphical model can be used for studying false information online and fact checking in many different ways. In this section, we discuss some of the possible applications and possible future research work.

One obvious extension of the proposed graphical model is to define a computational ontology and to automatically build such an ontology from different (closed and public) data sources. Doing so is not trivial since automatically discovering entities and relationships between different entities can be challenging. It is likely advanced techniques including NLP (natural language processing) techniques such as named entity recognition (NER) algorithms and supervised machine learning methods will be necessary.

As demonstrated in the previous section, the graphical model can be used to guide more systematic categorisation of real-world scenarios of false information and fact checking, and how the different scenarios are related to each other. Such categorisation can help researchers, practitioners and policy makers to better understand the complexity of the false information and fact checking ecosystems, to identify better interventions, and to evaluate such interventions. We envisage the categorisation can also help guide future literature reviews on related topics.

The graphical model proposed and a computational ontology derived from it can be used to facilitate large-scale Internet measurement studies to better understand how false information spreads and how different actors behave. Via automatic reasoning, the computational ontology can be used to automatically discover new phenomena related to false information and fact checking.

A computational ontology derived from the proposed graphical model can be incorporated into existing fact checking tools to make them more aware of the graphical nature of the ecosystem. For instance, a fact checking tool could leverage the computational ontology to identify the most critical nodes and paths in order to identify the best communication strategy.

The graphical model can also help guide the design of empirical studies, e.g., on identifying relevant stakeholders so that the recruitment of human participants can be more representative.

Another important area of applications of the graphical model is modelling and computer-based simulation of the false information and fact checking ecosystems, especially agent-based models, which can directly borrow the entities and relationships in the graphical model.

Finally but not the least, the many different types of conflicting and cooperative relationships between different entities in different false information and fact checking scenarios can also help researchers to use various theoretical tools such as game theory and epistemic logic to rigorously study spread of false information, changes of behaviours of different entities, and also the overall evolution of such ecosystems.

6 Conclusion

In this paper, we report two comprehensive graphical models of ecosystems around false information and fact checking, in order to conceptually capture the complicated relationships between many different entity types. The usefulness of such graphical models is demonstrated using some example real-world scenarios, and its wider applications for studying false information online and fact checking are discussed. It is our hope that the work can stimulate more work on conceptual modelling of false information online and fact checking.