Emails: h.zuo19@imperial.ac.uk, jingqz@zju.edu.cn, holly1027@zju.edu.cn, 3190102536@zju.edu.cn, sunly@zju.edu.cn, p.childs@imperial.ac.uk, chenlq@zju.edu.cn

Affiliations: (1) Department of Computer Science and Technology, Zhejiang University, Hangzhou, 310030, China; (2) Dyson School of Design Engineering, Imperial College London, Exhibition Rd, South Kensington, SW7 2AZ, United Kingdom

Abstract

Data-driven design and innovation is a process of reusing and providing valuable and useful information. However, existing semantic networks for design innovation are built on data sources restricted to technological and scientific information. Besides, existing studies build the edges of a semantic network on either statistical or semantic relationships alone, which makes them less likely to exploit the benefits of both types of relationship and to discover implicit knowledge for design innovation. Therefore, we constructed WikiLink, a semantic network based on Wikipedia. A combined weight that fuses the statistical and semantic weights between concepts is introduced in WikiLink, and four algorithms are developed for inspiring new ideas. Evaluation experiments show that the network is characterised by high coverage of terms, relationships and disciplines, which demonstrates the network’s effectiveness and usefulness. A demonstration and a case study then indicate that WikiLink can serve as an idea generation tool for innovation in conceptual design. The source code of WikiLink and the backend data are provided open-source for more users to explore and build on.


WikiLink: an encyclopedia-based semantic network for design innovation

Haoyu Zuo (2; ORCID 0000-0003-3811-4479), Qianzhi Jing (1), Tianqi Song (1), Huiting Liu (1), Lingyun Sun (1), Peter Childs (2), Liuqing Chen (1; corresponding author)

Keywords: Design innovation; Concept generation; Data-driven design; Knowledge discovery; Semantic network

1 Introduction

Design is a ubiquitous process that occurs throughout a variety of fields. Conceptual design is the early stage of design where an initial idea is formulated (Childs, 2013). The progression of conceptual design development requires a designer to fully utilize their innovation capability and existing knowledge. In other words, the creative attributes of conceptual design depend highly on a designer’s ability to master, apply, and utilize human-centred, scientific and technological knowledge according to the design problem to provoke design innovation. Researchers have utilized a large amount of imagery data or textual data available on the internet to provide design intuition for novel ideas. This imposes a heavy challenge (Hao et al., 2014) for designers on how to effectively discover and acquire pertinent knowledge and information to promote design innovation.

With the advent of big data, semantic networks can represent associations well between ontology-based knowledge, making it easier and more intuitive to discover implicit knowledge for design innovation. The highly diverse nature of design suggests that design innovation can benefit from a multiplicity of distinct data. However, existing semantic networks for design innovation are built on data sources restricted to technological and scientific knowledge. Besides, existing studies build the edges of a semantic network only on either statistical or semantic relationships, which is less likely to make full use of the benefits from both types of relationships and discover implicit knowledge for design innovation.

To address the challenges highlighted, this study proposes an encyclopedia-based semantic network called WikiLink for design innovation. The source code of WikiLink is published at https://github.com/zju-d3/WikiLink. The main contributions of this paper can be summarized as follows:

(1) A semantic network for design innovation is constructed. Wikipedia is used as the data source for the semantic network; it contains information from a wide range of fields and expands the data to a new boundary.

(2) A combined weight is introduced for the relationships in the semantic network. The combined weight mixes the statistical relationship and the semantic relationship, which better captures the implicit connections between concepts for design innovation. Four algorithms are further developed to enable retrieval at different levels and in different manners.

(3) The constructed semantic network for design innovation is further developed as a tool. An evaluation and a demonstration of the tool are then conducted, and the results show that WikiLink can effectively provide design stimuli for idea generation.

The paper is organised as follows: section 2 describes the state of knowledge and background for the research, and section 3 introduces the process of constructing WikiLink. Section 4 presents the experimentation including the results on coverage of concepts, coverage of relationships, coverage of disciplines and term to term relationships. Section 5 demonstrates the use of four functions in WikiLink and presents a design case with WikiLink. Finally, section 6 concludes with limitations and suggestions for further research directions.

2 Related Work

2.1 Design innovation and idea generation

Design can be regarded as the process of conceiving, developing and realising products, artefacts, processes, systems, services, platforms and experiences with the aim of fulfilling identified or perceived needs or desires, typically working within defined or negotiated constraints (Childs, 2013). Design innovation is the process of creating innovative designs, which requires designers to fully utilize their ability to generate design ideas. Normally, the whole design innovation process can benefit from considering as many ideas as possible (Liu et al., 2003). Ideas, especially creative ideas, are an essential part of the design innovation process (Han et al., 2018a, b).

Much research has endeavored to propose novel approaches for idea generation. The diverse idea generation techniques include brainstorming (Osborn, 1953), brainwriting (Geschka, 1983), checklists (Ivanov and Cyr, 2014), and synectics (VanGundy, 1988). Recently, data-driven approaches have attracted researchers’ attention. In the process of design innovation, data-driven approaches attempt to uncover useful design knowledge from huge, unstructured, heterogeneous, and highly contextualized data resources (Shi et al., 2017; Cheong et al., 2017; Luo et al., 2021). Researchers emphasize the importance of generating creative ideas from big data in the design innovation process (Howard et al., 2008; Kwon et al., 2018a) and further indicate that creative ideas can originate from diverse existing knowledge and defined associations.

2.2 Semantic network

A semantic network is a graph with nodes representing concepts or individual objects and edges representing relationships or associations among concepts (Sowa, 1987). The use of a semantic network can help integrate and migrate valuable, unstructured data into systematic robust knowledge for design innovation (Gorti et al., 1998; Rezgui et al., 2011; Georgiev and Georgiev, 2018).

When design work is completed, a great deal of data and information is usually accumulated and reported afterwards (Ackoff, 1989), in the form of proceedings, literature, patents or public reports. This recorded information is expected to be transformed into design knowledge that can be reused in future design tasks to speed up design work. When considering knowledge reuse, common knowledge sources generally include research papers, patent documents and encyclopedias.

Academic papers and patents report original research outcomes or new inventions, and contain rich scientific and technological knowledge. Several attempts (Munoz and Tucker, 2016; Fu et al., 2013; He et al., 2019a; McCaffrey and Spector, 2018; Shi et al., 2017; Sarica et al., 2020) have been made to apply academic papers and patents to design innovation tasks. However, one major limitation is that patents and scientific literature are restricted to technological and scientific knowledge (Shibata et al., 2008; Furukawa et al., 2015; Li et al., 2019; Ernst, 2003), while design tasks are highly diverse and complex, with broad coverage of disciplines. To address this issue, an encyclopedia can be applied for design innovation, since its most notable advantage is that it contains information from a wide range of fields and can expand the design knowledge coverage to a wider boundary compared with papers and patents (Kwon et al., 2018b).

2.3 Semantic network for design innovation

Over the past decade, several general semantic networks have been developed such as the lexical database WordNet (Fellbaum, 2010) and ConceptNet (Speer et al., 2017). These general semantic networks were first developed for artificial intelligence tasks such as machine translation and natural language understanding (Sowa, 1987). These lexical semantic networks are increasingly utilized in the engineering design domain. They are often employed as the backend knowledge base for computational tools for design idea generation and analysis (Han et al., 2018b, 2020; Bae et al., 2020; Georgiev and Georgiev, 2018). However, these lexical databases built on common-sense knowledge are not specifically aimed at use in design innovation.

Thus, there is an impetus for developing a design innovation-focused semantic network to meet the growing demands for engineering knowledge discovery, technology information retrieval, engineering design aids and innovation management. An innovation-focused semantic network normally builds its nodes from a reliable data source and establishes associations based on statistical or semantic relationships. A statistical relationship assigns each association a value obtained from a statistical calculation. For example, Shi et al. (2017) created a large semantic network with statistical relationships in the engineering and design domain. Its statistical relationships are built on the co-occurrence between each pair of words in nearly one million engineering papers and one thousand design posts. He et al. (2019b) created a semantic network with a core-periphery structure based on word clouds that embed co-occurrence information. In this way, the semantic network builds its edges on a statistical level and can support engineering and technology innovation from a statistical perspective.

Semantic relationships are the associations that exist between the meanings of words and are applied in many design activities, such as analogy and metaphor methods (Johnson, 1992; Goel, 1997). As for semantic networks for design innovation, Sarica et al. (2019b) built a large-scale comprehensive semantic network of technology-related data for engineering knowledge discovery (TechNet). The semantic relationships between words are established by using natural language processing techniques to derive the vectors of the terms. Kim and Kim (2012) suggest using cause-and-effect relationships to build a cause-and-effect function network to support technology innovation. With semantic relationships, the network can support data integration, knowledge discovery and in-depth analysis from a semantic perspective (Sarica and Luo, 2021; Sarica et al., 2019a, 2021).

This study plans to build a large encyclopedia-based semantic network with fused statistical-semantic relationships. Inspired by the use of statistical relationships in semantic networks and of semantic relationships in the design engineering domain, we aim to build a semantic network that combines the benefits of both types of relationship to better capture the implicit connections between cross-domain concepts and thereby better stimulate design innovation.

3 Construction of WikiLink

In this section, we construct WikiLink, a semantic network based on Wikipedia data. Wikipedia items are regarded as nodes, and the interlinks between items on the same page are regarded as direct relationships (edges) between nodes. The edges in the network are assigned a fused weight consisting of two types of weight, and four algorithms are proposed to retrieve relevant knowledge concepts and relationships for design innovation.

3.1 Data source

While patents and scientific literature focus on technological and scientific knowledge, an encyclopedia is an integrated source of general and specific knowledge, with broad coverage of disciplines. Wikipedia, as an online encyclopedia, is unrestricted by physical weight and volume, and has the potential to be truly comprehensive. Wikipedia is written and maintained by a community of volunteers and offers copies of its content for anyone to download. WikiLink processes the English Wikipedia pages available before 3rd January 2021, comprising 6,408,679 articles. For each Wikipedia article, WikiLink extracts the title, main text, “see also” section and categories for further analysis. Figure 1 is an example page of a Wikipedia article, containing a title, main text, “see also” section and categories. It should be noted that articles with a colon in the title are excluded. These articles account for 10% of all articles; they are Wikipedia’s administrative pages and are not relevant as a core source of design information.

3.2 Extraction process

Wikipedia uses 13 main categories to group pages on similar subjects, with each main category having up to 6 layers of subcategories. The deeper the subcategory, the more specific the article title. The articles are first filtered based on the categories indicated on their pages to avoid overly specific articles: only articles within the first 3 layers of subcategories are kept. The network is constructed from the title, main text and “see also” section of these selected articles.

A semantic network has two parts: the nodes and the relationships between them. The nodes are drawn from three parts of each Wikipedia article: the title, the hyperlinks in the main text, and the hyperlinks in the “see also” section. The hyperlinks in the main text are chosen as nodes since they are verified concepts in Wikipedia and indicate explicit associations between concepts, as they occur with other concepts in the same article.

A relationship is assumed to exist between two concepts if they co-occur in the same article. Two different criteria are applied when accumulating the raw weight of each relationship. Since the main text contains a large number of concepts, a co-occurrence in the main text is assigned a lower value to avoid dominant concepts. The concepts in “see also” are intrinsically strong associations but are fewer in number than those in the main text, so a co-occurrence there is assigned a higher value. The values were determined based on experimental results: if two concepts co-occur in the main text, one is added to the weight; if two concepts co-occur in “see also”, nine is added. The raw weight is accumulated and stored for later filtering. In this way, the nodes appearing in one article are interlinked. Taking the content in Figure 1 as an example, the nodes are “fastText”, “word embeddings”, “Facebook”, “unsupervised learning”, “supervised learning”, “Word2vec”, “Glove”, “Neural Network” and “Natural Language Processing”. Relationships are established between each pair of nodes because they co-occur in the same article. A network can thus be constructed by processing all articles in Wikipedia’s database.
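The accumulation rule described above can be illustrated with a minimal Python sketch. It is not the WikiLink implementation: the article dictionary layout (`title`, `main_links`, `see_also_links`) and the function name are hypothetical, and grouping the title with the main-text concepts is an assumption. Only the increments of one and nine come from the text.

```python
from collections import defaultdict
from itertools import combinations

MAIN_TEXT_INCREMENT = 1  # increment for a main-text co-occurrence (from the text)
SEE_ALSO_INCREMENT = 9   # increment for a "see also" co-occurrence (from the text)


def accumulate_raw_weights(articles):
    """Accumulate undirected raw edge weights over a list of articles.

    Each article is a dict with the hypothetical keys 'title',
    'main_links' and 'see_also_links'.
    """
    weights = defaultdict(int)
    for article in articles:
        # Assumption: the title is treated as one of the main-text concepts.
        main_concepts = sorted({article["title"], *article["main_links"]})
        for a, b in combinations(main_concepts, 2):
            weights[(a, b)] += MAIN_TEXT_INCREMENT

        see_also_concepts = sorted(set(article["see_also_links"]))
        for a, b in combinations(see_also_concepts, 2):
            weights[(a, b)] += SEE_ALSO_INCREMENT
    return dict(weights)


if __name__ == "__main__":
    toy_articles = [
        {"title": "fastText",
         "main_links": ["Word2vec", "Facebook"],
         "see_also_links": ["Word2vec", "GloVe"]},
    ]
    for edge, weight in sorted(accumulate_raw_weights(toy_articles).items()):
        print(edge, weight)
```

Sorting the concepts before pairing gives every undirected edge a single key, so repeated co-occurrences across articles add to the same entry.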

3.3 Construction of edge weights

After the extraction process, an initial network with nodes and edges can be constructed. In the semantic network, explicit knowledge associations are direct edges linking pairs of nodes, and implicit knowledge associations are paths consisting of multiple edges, which means an implicit knowledge association is essentially a concatenation of a series of interconnected explicit knowledge associations (Shi et al., 2017). To evaluate the correlation degree of implicit knowledge associations, the weight of explicit knowledge associations should be quantified.

3.3.1 Semantic cosine similarity weight

In the construction process, the explicit associations are built from the interlinked concepts within pages, and the corresponding raw weights are calculated statistically. These statistical relationships form the basic edges of the semantic network, providing the foundation for WikiLink and a statistical intuition for information retrieval. In design activities, however, semantic relationships also contribute substantially to design innovation, for example through analogy and metaphor methods (Hey et al., 2008; Linsey et al., 2012). Inspired by this role of semantic relationships in design innovation activities, the statistical association between two concepts can be combined and balanced with their semantic similarity to boost design innovation. The semantic similarity can be obtained by transforming all concepts into vectors and calculating the cosine similarity between these vectors. Conventional word embedding methods such as Word2Vec train a unique embedding for every individual word. However, Wikipedia contains a large number of terms, some of which are new terms outside the vocabulary. FastText (Bojanowski et al., 2017; Joulin et al., 2016) addresses this issue by treating each word as the aggregation of its subwords: the vector for a word is simply taken to be the sum of the vectors of its component character n-grams. In this way, fastText can obtain vectors even for out-of-vocabulary (OOV) words, such as the new terms in Wikipedia, provided that at least one of the character n-grams was present in the training data. Once all concepts have been represented as vectors, every edge connecting two nodes is assigned a value by calculating the cosine similarity between the corresponding vectors.
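The subword idea and the cosine similarity can be sketched in plain Python as follows. This is a simplified illustration rather than the fastText library itself: the n-gram table is a toy dictionary with made-up values, the n-gram length range is an illustrative choice, and all function names are hypothetical.

```python
import math


def char_ngrams(word, min_n=3, max_n=6):
    """Character n-grams of a word wrapped in the boundary markers '<' and '>'."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


def subword_vector(word, ngram_vectors, dim):
    """Sum the vectors of the n-grams of `word` that exist in `ngram_vectors`.

    Unknown n-grams are skipped, so an out-of-vocabulary word still gets a
    non-zero vector as long as at least one of its n-grams is known.
    """
    vector = [0.0] * dim
    for gram in char_ngrams(word):
        known = ngram_vectors.get(gram)
        if known is not None:
            vector = [v + k for v, k in zip(vector, known)]
    return vector


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy n-gram table with made-up values, for illustration only.
    toy_table = {"<wh": [1.0, 0.0], "her": [0.5, 0.5], "re>": [0.0, 1.0]}
    where_vec = subword_vector("where", toy_table, dim=2)
    there_vec = subword_vector("there", toy_table, dim=2)
    print(cosine_similarity(where_vec, there_vec))
```

In the toy run, "there" shares the n-grams "her" and "re>" with "where", so the two words receive related vectors even though neither was stored as a whole word.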

3.3.2 Global normalization and local normalization

In many design models, the design innovation process involves two important phases: divergence and convergence. For example, there are rounds of divergent and convergent phases in the “double diamond” design process model (The Design Council, 2017). Divergence is a phase that encourages exploring as many different solutions as possible, while convergence follows a particular set of logical steps to arrive at one solution, which in some cases is a “correct” solution. Inspired by the principles of divergence and convergence, retrieval can be facilitated in two distinct ways: a “general” way and a “specific” way. “General” means the nodes are common and basic concepts with a relatively general meaning, which tends to lead to divergent thinking in a design innovation process, while “specific” means the nodes are detailed and domain-specific concepts, which have a higher potential to guide convergent thinking. The “general” and “specific” retrievals are realized by normalizing the raw weight with a global normalization method, as shown in equation (1), and a local normalization method, as shown in equation (2):

w_{ij}^{g} = \frac{w_{ij} - w_{min}}{w_{max} - w_{min}}

(1)

w_{ij}^{l} = \frac{w_{ij}}{S_{i}}

(2)

where $w_{max}$ and $w_{min}$ are the maximum and minimum values of the raw weight in the whole network, $w_{ij}$ is the raw weight between node $i$ and node $j$, and $S_{i}$ is the sum of the raw weights of all edges around node $i$.

The global normalization performs feature scaling from a global perspective, in which $w_{ij}^{g}$ expresses the strength of an edge relative to the whole network. Global normalization tends to retrieve more “general” concepts (Shi et al., 2017). The local normalization performs feature scaling from a local perspective, in which $w_{ij}^{l}$ expresses the strength of an edge relative to the other edges adjacent to node $i$. Local normalization tends to extract more domain-specific concepts.
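A minimal Python sketch of equations (1) and (2) is given here for illustration. The edge dictionary layout and the function names are hypothetical, and returning zero when all raw weights are equal is an assumption made only to avoid a division by zero.

```python
from collections import defaultdict


def global_normalize(raw_weights):
    """Equation (1): min-max scaling of raw weights over the whole network."""
    w_min = min(raw_weights.values())
    w_max = max(raw_weights.values())
    span = w_max - w_min
    if span == 0:
        # Assumption: if every edge has the same raw weight, map all to 0.0.
        return {edge: 0.0 for edge in raw_weights}
    return {edge: (w - w_min) / span for edge, w in raw_weights.items()}


def local_normalize(raw_weights):
    """Equation (2): raw weight divided by S_i, the summed weight around node i.

    The result is directional: the value stored under (i, j) is relative to
    node i, and the value stored under (j, i) is relative to node j.
    """
    strength = defaultdict(float)
    for (i, j), w in raw_weights.items():
        strength[i] += w
        strength[j] += w

    local = {}
    for (i, j), w in raw_weights.items():
        local[(i, j)] = w / strength[i]
        local[(j, i)] = w / strength[j]
    return local


if __name__ == "__main__":
    raw = {("a", "b"): 1, ("a", "c"): 9, ("b", "c"): 5}
    print(global_normalize(raw))
    print(local_normalize(raw))
```

Because $S_{i}$ depends on the starting node, the locally normalized weight of an edge generally differs between its two directions, whereas the globally normalized weight is the same in both.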

3.3.3 Geometric mean and harmonic mean

Since an implicit knowledge association is essentially a concatenation of a series of explicit associations, the accumulation of the strength of contained explicit associations (edges) can potentially indicate the correlation degree of the implicit association (path). Therefore, in order to reflect the overall strength of all the explicit associations in an arbitrary implicit association, the retrieval behaviors can be facilitated in two distinct ways: one type of retrieval, referred to as ”basic”, is a short implicit association across fewer edges focusing on relevant concepts which tend to be in the same domain while another type, referred to as ”professional”, is a long implicit association with more edges across multiple distant domains. Therefore, the geometric mean (GM) and the harmonic mean (HM) are applied on the normalized weights for different design innovation behaviors.

The geometric mean (GM) and harmonic mean (HM) are given in equations (3) and (4) respectively:

GM: w_{(k_{1} - k_{2} - \dots - k_{n+1})} = \sqrt[n]{\prod_{k=1}^{n} w_{k,k+1}}

(3)

HM: w_{(k_{1} - k_{2} - \dots - k_{n+1})} = \frac{n}{\sum_{k=1}^{n} \frac{1}{w_{k,k+1}}}

(4)

where $w_{(k_{1} - k_{2} - \dots - k_{n+1})}$ is the overall weight of the path and $w_{k,k+1}$ is the weight of each edge along the path.
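As a small worked illustration of equations (3) and (4), the following Python sketch computes both means for a list of edge weights along a path. The function names are hypothetical, and returning zero when any edge weight is zero is an assumption (it is the limiting value of both means and avoids a division by zero in the harmonic mean).

```python
import math


def geometric_mean(edge_weights):
    """Equation (3): n-th root of the product of the n edge weights on a path."""
    n = len(edge_weights)
    if n == 0:
        raise ValueError("a path needs at least one edge")
    return math.prod(edge_weights) ** (1.0 / n)


def harmonic_mean(edge_weights):
    """Equation (4): n divided by the sum of reciprocal edge weights."""
    n = len(edge_weights)
    if n == 0:
        raise ValueError("a path needs at least one edge")
    if any(w == 0 for w in edge_weights):
        # Assumption: treat a zero-weight edge as giving a zero path weight.
        return 0.0
    return n / sum(1.0 / w for w in edge_weights)


if __name__ == "__main__":
    path = [0.25, 1.0]  # weights of the two edges on a three-node path
    print(geometric_mean(path))
    print(harmonic_mean(path))
```

For positive weights the harmonic mean never exceeds the geometric mean, so the harmonic mean is pulled down more strongly by a single weak edge on the path.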

3.4 Four algorithms for design innovation

The primary use of the design semantic network is to retrieve relevant knowledge concepts and relationships for design innovation. In addition to retrieval around a single concept, retrieval of the implicit associations between two distant knowledge concepts is also introduced. Four algorithms are developed by applying the normalization and mean methods to the proposed retrieval approach. The four algorithms, “Explore-General”, “Explore-Specific”, “Search Path-Basic” and “Search Path-Professional”, are applied as four functions in WikiLink.

Figure 2: Four functions in the panel of WikiLink

The “Explore” algorithm is used to explore and retrieve around a single knowledge concept. The retrieved results can be classified as either “general” or “specific”. The “Explore” function panel in WikiLink is shown in Figure 2. Specifically, since it is preferable to retrieve both “general” and “specific” knowledge concepts related to a query, we apply two different normalization algorithms with distinct retrieval behaviours in the “Explore” function. One is global normalization, to retrieve “general” concepts for divergence, and the other is local normalization, to retrieve “specific” concepts for convergence. The overall weight is calculated as a combination of the statistical weight and the semantic weight. The algorithms for “Explore-General” and “Explore-Specific” are given in equations (5) and (6) respectively:

w_{explore}^{general} = 0.3 \times (1 - w_{semantic}) + 0.7 \times w^{g}

(5)

w_{explore}^{specific} = 0.2 \times (1 - w_{semantic}) + 0.8 \times w^{l}

(6)

where $w_{semantic}$ is the semantic cosine similarity weight, $w^{g}$ is the statistical weight after global normalization and $w^{l}$ is the statistical weight after local normalization. The coefficients in the algorithms are determined based on experimental results.
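Equations (5) and (6) can be written directly as two small Python functions; the sketch below is for illustration only and the function names are hypothetical. The coefficients are the ones stated in the equations.

```python
def explore_general_weight(w_semantic, w_global):
    """Equation (5): combined weight used by "Explore-General"."""
    return 0.3 * (1.0 - w_semantic) + 0.7 * w_global


def explore_specific_weight(w_semantic, w_local):
    """Equation (6): combined weight used by "Explore-Specific"."""
    return 0.2 * (1.0 - w_semantic) + 0.8 * w_local


if __name__ == "__main__":
    # Toy values: cosine similarity 0.8 and a normalized statistical weight of 0.5.
    print(explore_general_weight(0.8, 0.5))   # 0.3 * 0.2 + 0.7 * 0.5
    print(explore_specific_weight(0.8, 0.5))  # 0.2 * 0.2 + 0.8 * 0.5
```

In both equations the statistical term carries most of the weight (0.7 and 0.8), with the semantic term acting as a smaller correction.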

The “Explore” algorithms are further combined with Dijkstra’s single-source shortest path algorithm, which starts from the source query and retrieves all reachable nodes in order of increasing distance. In addition, a “Minimum Step” functionality is provided on the “Explore” panel: knowledge associations with fewer edges than the defined minimum step are filtered out. The knowledge associations are therefore retrieved and ranked by the combined weight, subject to the minimum step.
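The retrieval procedure can be sketched with a standard heap-based Dijkstra search, shown here in Python. This is a sketch under stated assumptions, not the WikiLink code: the combined weight is treated as a non-negative edge cost, the graph is a hypothetical nested dictionary, and the step count is taken along the shortest path found to each node.

```python
import heapq


def explore(graph, source, min_step=1):
    """Return (node, distance, steps) tuples in order of increasing distance.

    `graph` maps each node to a dict {neighbour: edge_cost}. Nodes whose
    shortest path from `source` has fewer than `min_step` edges are omitted.
    """
    distance = {source: 0.0}
    steps = {source: 0}
    heap = [(0.0, source)]
    settled = set()
    results = []

    while heap:
        dist, node = heapq.heappop(heap)
        if node in settled:
            continue  # stale heap entry for an already settled node
        settled.add(node)
        if node != source and steps[node] >= min_step:
            results.append((node, dist, steps[node]))

        for neighbour, cost in graph.get(node, {}).items():
            candidate = dist + cost
            if neighbour not in distance or candidate < distance[neighbour]:
                distance[neighbour] = candidate
                steps[neighbour] = steps[node] + 1
                heapq.heappush(heap, (candidate, neighbour))
    return results


if __name__ == "__main__":
    toy_graph = {
        "a": {"b": 0.2, "c": 0.9},
        "b": {"a": 0.2, "c": 0.3},
        "c": {"a": 0.9, "b": 0.3},
    }
    print(explore(toy_graph, "a", min_step=1))
    print(explore(toy_graph, "a", min_step=2))
```

In the toy graph, node "c" is reached more cheaply through "b" than directly, so it is reported with two steps and survives a minimum step of two, while "b" is filtered out.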

The “Search Path” algorithm is used to find implicit associations, in the form of paths, between two given knowledge concepts. The retrieval results can be classified as either “basic” or “professional”, where “basic” means the paths are short and the nodes are general concepts, while “professional” means the paths are long and the nodes are domain-specific concepts. The “Search Path” function panel in WikiLink is shown on the right side of Figure 2. Specifically, besides the two normalization algorithms, the geometric mean (GM) is applied to retrieve short implicit associations across fewer edges focusing on relevant knowledge, while the harmonic mean (HM) is applied to retrieve long implicit associations with more edges across multiple domains.

The algorithms for “Search Path-Basic” and “Search Path-Professional” are given in equations (7) and (8), and equations (9) and (10), respectively:

w_{(k_{1} - k_{2} - \dots - k_{n+1})}^{basic} = \sqrt[n]{\prod_{k=1}^{n} w_{k,k+1}^{basic}}

(7)

w_{k,k+1}^{basic} = 0.3 \times (1 - w_{semantic}) + 0.7 \times w^{g}

(8)

w_{(k_{1} - k_{2} - \dots - k_{n+1})}^{professional} = \frac{n}{\sum_{k=1}^{n} \frac{1}{w_{k,k+1}^{professional}}}

(9)

w_{k,k+1}^{professional} = 0.2 \times (1 - w_{semantic}) + 0.8 \times w^{l}

(10)

where the w_semantic is the semantic cosine similarity weight, w^g is the statistical weight after global normalization and w^l is the statistical weight after local normalization.

4 Evaluation

In this section, we conduct four studies on WikiLink to demonstrate its effectiveness and usefulness. Some other semantic networks, which are publicly accepted or aiming for design innovation, are selected as benchmarks during the comparison, including B-link, WordNet and ConceptNet. The evaluation is conducted from four perspectives, i.e., coverage of concepts, coverage of relationships, coverage of disciplines, term-to-term evaluation and effectiveness of combined relationships to provide an overview of the strengths and weaknesses of WikiLink.

4.1 Coverage of golden concepts

In order to demonstrate the feasibility of WikiLink, golden concepts, which are composed of words and terms, are defined as the benchmark to evaluate WikiLink’s term coverage. The golden concepts are collected manually from an online source, Encyclopedia Britannica, in several steps. First, featured concepts are obtained from its website. There are several categories of topics available concerning different domains, including culture, science, and technology. Gathering these classified words and terms ensures that the collected data contains interdisciplinary knowledge. The original data is then refined by removing uncommon expressions and standardizing the formats. The aim of this step is to ensure the precision and impartiality of the following evaluations. Eventually, we obtain a list of 468 words and terms, covering knowledge in 8 domains; part of the concepts are shown in Table 1.

Categories	Related concepts
Animal	bird, chordate, coral, insect, sea otter, …
Art	acting, ballade, chinese literature, emmy award, film, …
Event	american civil war, bronze age, cold war, french revolution, hurricane katrina, …
Place	africa, anatolia, berlin, cape town, indonesia, …
Plant	carnivorous plant, venus flytrap, …
Science	atmosphere, brain, carbohydrate, chemistry, disease, …
Sports	athletics, boxing, gymnastics, rugby, …
Technology	airplane, bicycle, industry, radar, smartphone, supercomputer, …
Topic	accident, architecture, buddhism, cbs corporation, democracy, …

Table 1: The overview of golden concepts

With these golden concepts, we then evaluate how many concepts are contained in WikiLink. The retrieval rate $C_{R}$ , as shown in equation (11), is applied as the metric of concept retrieval:

C_{R} = \frac{n_{C}}{N_{C}}

(11)

where $n_{C}$ is the number of golden concepts contained in the network, while $N_{C}$ is the total number of golden concepts, which is 468 in this case.
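The retrieval rate in equation (11) amounts to a set-membership count, as the following Python sketch illustrates. The function name and the toy lists are hypothetical, and matching concepts case-insensitively is an assumption about how terms are compared.

```python
def concept_retrieval_rate(golden_concepts, network_concepts):
    """Equation (11): fraction of golden concepts found among the network's nodes."""
    # Assumption: terms are matched case-insensitively.
    golden = {concept.lower() for concept in golden_concepts}
    network = {concept.lower() for concept in network_concepts}
    if not golden:
        raise ValueError("the golden concept list must not be empty")
    return len(golden & network) / len(golden)


if __name__ == "__main__":
    toy_golden = ["bird", "Cold War", "radar", "smartphone"]
    toy_network = ["Bird", "radar", "cold war", "fish"]
    print(concept_retrieval_rate(toy_golden, toy_network))
```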

WordNet and ConceptNet are used as two benchmarks for evaluation. It is observed that WordNet only contains 209 concepts, resulting in a low $C_{R}$ rate of 0.449. The specific $C_{R}$ values of different categories are shown separately in Table 2, from which we notice that WikiLink gives the highest retrieval rate, indicating that our network has a wider coverage of concepts compared with the other tools considered.

Categories	WordNet	ConceptNet	WikiLink
Total rate $C_{R}$	0.449	0.810	0.938
art	0.386	0.818	0.841
animal	1.000	1.000	1.000
event	0.037	0.630	0.963
place	0.602	1.000	1.000
plant	0.333	1.000	1.000
science	0.631	0.954	0.954
sports	0.652	0.957	0.913
technology	0.636	0.818	0.909
topic	0.287	0.638	0.920

Table 2: Retrieving results of golden concepts

To be specific, our approach involves more concepts in most categories and achieves the highest retrieval rate. In comparison, WordNet shows overall weaknesses, due to its inadequacy in processing two-word terms. ConceptNet has decent performance in the fields of art, science, sports, and technology, but it lacks strengths in certain categories such as topics and events.

This result can be explained by the way ConceptNet is constructed. Even though the data source of ConceptNet includes two-word terms, such as stained glass, chemical element, and mental disorder, these terms are mostly composed of one adjective and one noun. Apart from the names of countries and regions, two-noun terms are seldom included in ConceptNet. Based on our observation, many concepts in the two categories of topics and events are composed of more than one noun, e.g., teacher education, Paris agreement, and pacific crest trail, which are exactly the cases that ConceptNet does not handle. This explains ConceptNet’s low $C_{R}$ rate for those two categories. In contrast, our approach can deal with various kinds of terms, which explains its overall high coverage. This high coverage of concepts can support design innovation with a large concept space.

4.2 Coverage of golden relationships

A list of golden relationships is selected from the data source as the benchmark to quantitatively evaluate relationship coverage. Similar to the construction process of WikiLink, we extracted concept relationships from Encyclopedia Britannica’s spotlight articles. Only those composed of golden concepts are retained. We randomly picked 1000 concept pairs from the retained ones and defined them as the golden relationships.

Denoting golden relationships as set $H$ , we compare the performance of WikiLink with other tools in terms of the coverage of golden relationships. In this process, we retrieve all relationships between golden concepts from each tool, and denote these retrieved relationships as set $V$ . The evaluation metric is defined as follows:

R = \frac{|V \cap H|}{|H|}

(12)

where $R$ indicates the retrieving rate of relationships. WordNet and ConceptNet are chosen as benchmarks, and the results are shown in Table 3.
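Equation (12) can be illustrated with a short Python sketch that treats each relationship as an unordered pair of concepts. The function name and the toy pairs are hypothetical, and the assumption that relationships are undirected is made only for this illustration.

```python
def relationship_retrieval_rate(golden_pairs, retrieved_pairs):
    """Equation (12): |V intersect H| / |H| over unordered concept pairs."""
    # Assumption: a relationship is undirected, so (a, b) equals (b, a).
    golden = {frozenset(pair) for pair in golden_pairs}        # set H
    retrieved = {frozenset(pair) for pair in retrieved_pairs}  # set V
    if not golden:
        raise ValueError("the golden relationship list must not be empty")
    return len(retrieved & golden) / len(golden)


if __name__ == "__main__":
    toy_golden = [("bird", "insect"), ("radar", "airplane"),
                  ("cold war", "berlin"), ("brain", "disease")]
    toy_retrieved = [("insect", "bird"), ("cold war", "berlin"),
                     ("bird", "radar")]
    print(relationship_retrieval_rate(toy_golden, toy_retrieved))
```

With 1000 golden relationships, a count of 721 retrieved pairs corresponds to $R$ = 0.721, which matches the value reported for WikiLink.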

Tool	Count	R
WordNet	15	0.015
ConceptNet	170	0.170
WikiLink	721	0.721

Table 3: Evaluation results of golden relationships

Specifically, 15 of the relationships retrieved from WordNet belong to the golden relationships, leading to a very low $R$ value of only 0.015. This retrieving rate can be explained by WordNet’s data structure. To our knowledge, WordNet only retrieves specific relationships between two concepts, including “synonyms”, “sister terms”, “hypernyms”, and “hyponyms”, which leads to a substantial deficiency in context association and results in a low retrieving rate.

The web API of ConceptNet is used to retrieve concepts and relationships. It turns out that there are 170 relationships which are found in the golden relationships, resulting in a $R$ value of 0.170. The retrieving rate can be understood from two perspectives. ConceptNet’s network contains more concepts than WordNet, which can be observed from its $C_{R}$ value. In addition, it provides richer explanations for ”relationships”. In other words, as well as ”synonyms” and ”hypernyms”, ConceptNet is also able to retrieve ”related terms” and ”terms with this context” for an arbitrary single concept. These two reasons both contribute to its retrieving rate.

In the end, 721 relationships can be retrieved from the golden relationships within WikiLink. This can be explained by its largest number of concepts, and the relationships in our approach are defined differently, i.e., they are established between concepts that are shown on the same pages. To summarize, WikiLink achieves a retrieving rate of 0.721 and shows the best performance. This high retrieving rate of relationships builds enough associations which can potentially contribute to design innovation.
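The retrieving rate in Eq. (12) can be illustrated with a short sketch. It assumes that a relationship is an unordered pair of concept names, which is our simplification for illustration; the function name `retrieval_rate` and the toy pairs are hypothetical and are not taken from the WikiLink code base.

```python
def retrieval_rate(retrieved_pairs, golden_pairs):
    """Return R = |V ∩ H| / |H| for retrieved set V and golden set H.

    Each relationship is treated as an unordered pair of concepts
    (an assumption made for this sketch).
    """
    V = {frozenset(pair) for pair in retrieved_pairs}
    H = {frozenset(pair) for pair in golden_pairs}
    if not H:
        raise ValueError("the golden relationship set must not be empty")
    return len(V & H) / len(H)


# Toy data (hypothetical): four golden pairs, two of which are retrieved.
golden = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
retrieved = [("b", "a"), ("c", "d"), ("x", "y")]
rate = retrieval_rate(retrieved, golden)
```

With 1000 golden relationships, the counts reported in Table 3 map directly to $R$, e.g., 721 retrieved golden pairs give $R = 721/1000 = 0.721$.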

4.3 Coverage of categories

To prove that WikiLink covers a wide range of categories, we categorize and count all the nodes in WikiLink according to Wikipedia’s category rules. Wikipedia defines 13 main categories: cultural, geography, health, history, human, mathematics, natural, people, philosophy, religion, society, technology, and reference. By traversing all the items’ categories in WikiLink, the distribution of the 13 categories is presented in Figure 3.

Figure 3: The distribution of concepts in WikiLink

It can be seen from the graph that WikiLink’s data are widely distributed among the 13 main categories, and the count of a single main category can reach the order of 100,000 or higher. In particular, the natural, people and reference categories have the largest counts, which are 1,241,491, 1,161,583 and 1,222,966 respectively. Rather than focusing only on technological and scientific knowledge, WikiLink is a more generic semantic network with knowledge from a wide range of disciplines, which can be used in daily design innovation activities to obtain inspiration. By comparison, the data source of B-link mainly comes from scientific papers, which leads to an uneven distribution across disciplines, while WikiLink has a wide range of information in different fields and disciplines. Compared with TechNet, WikiLink shows higher diversity: although TechNet contains a large number of domains within technology fields, its distribution is highly correlated with the distribution of patents, and this limited coverage may restrict design inspiration.

4.4 Term to term evaluation

To evaluate whether the computed edge weights are consistent with human judgement, thirty term pairs (three groups of ten pairs each) representing various degrees of relevance were prepared by language experts, and ten students were employed to rate the relevance of each pair. The students scored semantic relevance and statistical relevance on a five-point scale from one (not related) to five (highly related), and the average score is computed for each pair. The semantic relevance and statistical relevance are then combined in the same way as the weight in the “Explore-General” algorithm. In this evaluation, only the “Explore-General” edge weights are evaluated, since the weight calculation is similar in all four algorithms.

With the evaluation results, Cronbach’s alpha is used to measure the inter-rater reliability; the value is 0.78, which is acceptable. Spearman’s rank correlation coefficient is then used to assess the relationship between computed edge weights and human judgments. Table 4 shows the Spearman rank correlation coefficients between the pairwise association values of the same term pairs.

\begin{tabular}{cc}
\toprule
Group number & Spearman correlation \\
\midrule
1 & 0.69 \\
2 & 0.89 \\
3 & 0.64 \\
\bottomrule
\end{tabular}

Table 4: Term to term evaluation results
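As an illustration of the reliability measure mentioned above, the following sketch computes Cronbach’s alpha with the raters treated as items, using the standard formula $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_j s_j^2}{s_T^2}\right)$, where $k$ is the number of raters, $s_j^2$ is the variance of rater $j$’s scores and $s_T^2$ is the variance of the summed scores. The rating matrix is hypothetical toy data, not the ratings collected in this study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a matrix where ratings[i][j] is the score
    that rater j gave to term pair i (raters are treated as items)."""
    k = len(ratings[0])
    # Sample variance of each rater's scores across all term pairs.
    rater_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Sample variance of the per-pair total score.
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(rater_vars) / total_var)


# Hypothetical toy ratings: four term pairs scored by three raters.
toy_ratings = [
    [1, 2, 1],
    [3, 3, 4],
    [5, 4, 5],
    [2, 2, 3],
]
alpha = cronbach_alpha(toy_ratings)
```

Values closer to 1 indicate that the raters order the term pairs more consistently.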

A hypothesis test on the Spearman correlation coefficient is then conducted to determine whether the results are statistically significant. According to the table of critical values, Spearman’s rho for all three groups is greater than the critical value of 0.57 (one-tailed, $α$ = 0.05), so the null hypothesis of no correlation is rejected. This indicates a significant positive correlation between the computed edge weights and human judgments at the 0.05 significance level.
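The rank correlation and the comparison with the critical value can be sketched as follows. This is a minimal pure-Python illustration that computes Spearman’s rho as the Pearson correlation of average ranks; the helper names and the toy scores are hypothetical, and only the threshold of 0.57 is taken from the text above.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


CRITICAL_RHO = 0.57  # critical value quoted in the text (one-tailed, alpha = 0.05)

# Hypothetical toy data: computed edge weights vs. averaged human scores.
computed = [1, 2, 3, 4, 5]
human = [2, 1, 4, 3, 5]
rho = spearman_rho(computed, human)
significant = rho > CRITICAL_RHO
```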

4.5 Effectiveness of combined relationships

As introduced in Section 3, a statistical relationship between two concepts is established if they co-occur in the same article. Constructing the basic connections from a statistical perspective only could lead to retrieval being dominated by some highly common concepts. These dominating common concepts decrease the retrieval probability of other concepts that are useful for design innovation. Conversely, using only semantic relationships as the edge weights is beneficial for design but might require a longer chain of associations for implicit knowledge discovery. The semantic relationships are thus incorporated to balance the statistical relationships. To demonstrate the effectiveness of the proposed weight fusion, three types of retrieval results based on different relationships (networks with combined relationships, with statistical relationships, and with semantic relationships) are compared. The concept “health” is chosen for the “Explore” function and the concept pair “health & 3d printing” is chosen for the “Search Path” function.

\begin{tabular}{l p{0.28\textwidth} p{0.28\textwidth} p{0.28\textwidth}}
\toprule
 & Combined relationship & Statistical relationship & Semantic relationship \\
\midrule
Basic & health $\to$ economics $\to$ Massachusetts Institute of Technology $\to$ 3D printing & health $\to$ education $\to$ United States $\to$ The New York Times $\to$ artificial intelligence $\to$ 3D printing & health $\to$ health care $\to$ palliative care $\to$ intensive care unit $\to$ 3D printing \\
Professional & health $\to$ construction $\to$ ladder $\to$ 3D printing & health $\to$ physical fitness $\to$ physical strength $\to$ eccentric contraction $\to$ weight plate $\to$ knurling $\to$ deep drawing $\to$ hydroforming $\to$ direct metal laser sintering $\to$ rapid prototyping $\to$ 3D printing & health $\to$ health care $\to$ palliative care $\to$ intensive care unit $\to$ 3D printing \\
\bottomrule
\end{tabular}

Table 5: The highly correlated knowledge associations between “health” and “3d printing” with three different relationships

Figure 4: Retrieval results for ”health” with three different relationships

\begin{tabular}{lrr}
\toprule
Category & “Explore-General” & “Explore-Specific” \\
\midrule
Statistical relationship & 536 & 63 \\
Combined relationship & 308 & 32 \\
\bottomrule
\end{tabular}

Table 6: The average node degree of retrieval results for “health” with two different relationships

\begin{tabular}{lrr}
\toprule
Category & “Search Path-Basic” & “Search Path-Professional” \\
\midrule
Statistical relationship & 565 & 139 \\
Combined relationship & 473 & 131 \\
\bottomrule
\end{tabular}

Table 7: The average node degree of knowledge associations between “health” and “3d printing” with two different relationships

Figure 4 and Table 5 show the results of “Explore” and “Search Path” respectively. It can be seen that the results of “Explore” and “Search Path” with the statistical relationship contain more concepts that have common and general meanings but are semantically irrelevant to “health”, e.g., “United States” and “United Kingdom”, which are dominant nodes in this case. Conversely, the results of the two functions with semantic relationships contain more relevant concepts but only show semantic relevance to “health” (e.g., “environmental health” and “health care”). The combined relationship strikes a balance between the statistical and semantic relationships and therefore produces a more useful result. The node degree of a concept is the sum of the weights of all edges incident to that node. The average node degree of the retrieved concepts is calculated for combined relationships and statistical relationships to examine quantitatively whether the very common results are balanced. Table 6 and Table 7 show that, in all four modes, the average node degree of concepts retrieved with the combined relationship is lower than that of concepts retrieved with the statistical relationship, which implies that the semantic relationship balances the statistical relationship to retrieve valuable information. Both the quantitative and qualitative results indicate that the combined relationship is effective in reducing the influence of dominant concepts with high node degree in retrieval results and thus could facilitate design innovation activities.
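The average node degree reported in Tables 6 and 7 follows directly from the definition above (the sum of the weights of all edges incident to a node, averaged over the retrieved concepts). A minimal sketch is given below; the edge list, weights and function names are hypothetical toy values chosen for illustration only.

```python
from collections import defaultdict


def weighted_degrees(edges):
    """Map each node to the sum of the weights of its incident edges.

    `edges` is an iterable of (node_u, node_v, weight) for an undirected graph.
    """
    degree = defaultdict(float)
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
    return dict(degree)


def average_degree(retrieved_nodes, degrees):
    """Average weighted degree over the retrieved concepts."""
    return sum(degrees.get(n, 0.0) for n in retrieved_nodes) / len(retrieved_nodes)


# Hypothetical toy network with made-up weights.
toy_edges = [
    ("health", "united states", 5.0),
    ("health", "health care", 2.0),
    ("united states", "education", 4.0),
]
degrees = weighted_degrees(toy_edges)
avg = average_degree(["united states", "health care"], degrees)
```

A retrieval result dominated by hub concepts such as “united states” yields a higher average, which is why a lower average is read here as a sign that dominant nodes have less influence.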

5 Demonstration

In this section, we showcase four functions in WikiLink for information retrieval and design innovation. Qualitative analysis of the results is performed to demonstrate the features of each function. In addition, the performance of WikiLink is compared with four state-of-the-art tools, and the corresponding results are also analyzed qualitatively.

Figure 5: Retrieval results for ”3d printing” and ”fused deposition modeling” in ”Explore-General” and ”Explore-Specific” mode

5.1 The ”Explore-General” and ”Explore-Specific” mode

To fairly compare the performance of the “Explore-General” and “Explore-Specific” modes, two terms in the field of engineering design are chosen: “3d printing” and “fused deposition modeling”. 3D printing is a multi-faceted technology that has been employed across a broad range of applications (Berman, 2012), and is a widely used term with general meanings. Fused deposition modeling (FDM) is a 3D printing method that heats a continuous thermoplastic filament and extrudes it for layer-by-layer deposition (Hamzah et al., 2018), and is a widely used term with specific meanings. Each of the two terms is explored in both the “general” and “specific” modes. Figure 5 shows the top 10 relevant terms in each retrieval. Comparing the “general” results (the first row) with the “specific” results (the second row), it can be seen that the terms in the “general” results are more common and comprehensible, such as computer-aided design and artificial intelligence, while the terms in the “specific” results, such as stl (file format) and polyetherimide, are normally very specific concepts in particular domains. Furthermore, as the figure shows, the terms in FDM’s specific result are scattered around the central term, which implies that the primary terms in a particular domain are discrete and largely unconnected with each other.

\begin{tabular}{l p{0.4\textwidth} p{0.4\textwidth}}
\toprule
 & Brain \& Computer & Avocado \& Chair \\
\midrule
\multirow{2}{*}{Basic} & brain $\to$ artificial intelligence $\to$ computer & avocado $\to$ fruit $\to$ furniture $\to$ chair \\
 & brain $\to$ biology $\to$ computer & avocado $\to$ walnut $\to$ furniture $\to$ chair \\
\midrule
\multirow{2}{*}{Professional} & brain $\to$ neuroscience $\to$ psychology $\to$ science $\to$ technology $\to$ computer & avocado $\to$ guacamole $\to$ burrito $\to$ xylitol $\to$ product call $\to$ ikea $\to$ rocking chair $\to$ chair \\
 & brain $\to$ neuroscience $\to$ psychology $\to$ science $\to$ technology $\to$ internet $\to$ computer & avocado $\to$ guacamole $\to$ taco $\to$ hockey puck $\to$ potato chips $\to$ ladder $\to$ rocking chair $\to$ chair \\
\bottomrule
\end{tabular}

Table 8: The two types of highly correlated knowledge associations

5.2 The ”Search Path-Basic” and ”Search Path-Professional” mode

The “Search Path” function allows users to explore the implicit associations between two items, even when they come from different domains. It also has two modes that return two types of associations. To test the two modes, we used two pairs of terms: “brain” and “computer”, which are weakly related, and “avocado” and “chair”, which are seemingly unrelated. Table 8 shows the retrieved highest-correlated “basic” and “professional” knowledge associations of the two pairs. The “basic” paths are clearly shorter than the “professional” paths. Besides, most of the nodes in the “basic” paths are concepts with general meanings between the two domains, such as artificial intelligence, fruit and furniture, while the nodes in the “professional” paths are mostly scientific terms or specific objects such as “neuroscience”, “xylitol” and “guacamole”. Some explicit associations are discovered in the results. For example, brain science drives the advance of computer science, especially artificial intelligence, which appears in the path “brain → artificial intelligence → computer”. In addition, more implicit associations are connected by some surprising concepts like “fruit”, “furniture” and “rocking chair”, which may prompt the idea of fruit-shaped furniture such as an avocado-shaped rocking chair. It is found that, in some cases, purely statistical weights on edges result in a longer and more surprising path, which may inspire more innovative ideas in design activities.
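To make the idea of retrieving a highly correlated path between two concepts more concrete, the sketch below runs Dijkstra’s algorithm on a small weighted graph, using the reciprocal of the edge weight as the traversal cost so that strongly associated edges are preferred. This is only one common way to search such a path and is an assumption of this illustration; it is not the “Search Path” algorithm defined in Section 3, and the graph, weights and function names are hypothetical.

```python
import heapq


def build_graph(edges):
    """Build an undirected adjacency map from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def strongest_path(graph, source, target):
    """Dijkstra search with cost 1 / weight (assumed cost function).

    Returns the list of nodes from source to target, or None if the
    target cannot be reached.
    """
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + 1.0 / weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                prev[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical toy network with made-up association weights.
toy_graph = build_graph([
    ("brain", "artificial intelligence", 4.0),
    ("artificial intelligence", "computer", 4.0),
    ("brain", "neuroscience", 2.0),
    ("neuroscience", "computer", 1.0),
])
path = strongest_path(toy_graph, "brain", "computer")
```

Changing the cost function, for instance by penalising nodes with high degree, would steer the search towards longer paths through more specific concepts.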

Figure 6: Comparisons of the results from ”Explore” and ”Search path”

5.3 The ”Explore” and ”Search Path” function

The above shows that the “Explore” function aims to discover the knowledge associations around a single term, while the “Search Path” function aims to search for the associations between two terms. To clarify the difference between them, a hot concept in engineering design, “metaverse”, was explored along with two weakly related terms separately: “shopping” and “meeting”. Retrieval experiments were conducted in “Explore-Specific” and “Search Path-Professional” mode respectively. As shown in Figure 6, the retrieval results of “metaverse” cover a wide range of fields including “virtual world”, “simulated reality” and “cyberspace”, as well as related games including “Second Life” and “Active Worlds”. These wide results can lead to comprehensive knowledge discovery and open-ended thinking about the target term. On the other hand, the paths between “metaverse” and a selected concept focus on bridging the fields that connect them, which leads to combinational ideas. For instance, the nodes linking “metaverse” and “shopping” are related to “virtual economy”, and the nodes linking “metaverse” and “meeting” are related to virtual society.

\begin{tabular}{l p{0.4\textwidth} p{0.4\textwidth}}
\toprule
 & Neural Network & Trypsin \\
\midrule
WikiLink (general) & deep learning, google, c++, linux, cross-platform, javascript, open-source software, operating system, perl & amino acid, pancreas, enzyme, transcription(genetics), translation(genetics), base pair(genetics), life, active site, translation(biology), stroke \\
WikiLink (specific) & classification rule, deep learning, cognitive model, stockfish(chess), machine learning, black box, Hebbian learning, list of memory biases, deepmind, artificial neuron & phenylisothiocyanate, myotoxin, triosephosphateisomerase, zymogen, tandem mass spectrometry, peptide mass fingerprinting, ligase, dihydrofolate reductase, pepsin, papain \\
B-link (general) & genetic algorithm, optimization, fuzzy logic, classification, pattern recognition, artificial neural network, multi-objective optimization, simulated annealing, simulation, response surface methodology & chymotrypsin, protease, pepsin, purification, thrombin, digestion, characterization, expression, synthesis, crystal structure \\
B-link (specific) & backpropagation, genetic algorithm, fuzzy logic, self-organizing map, multilayer perceptron, backpropagation algorithm, neuro-fuzzy, pattern recognition, neuro-fuzzy system, artificial intelligence & chymotrypsin, enzyme thermostability, modified enzyme, pepsin, protease-activated receptor-2, protease-activated receptor, digestive protease, pyloric caecum, carboxypeptidase a, viscera \\
ConceptNet & neural net; autoencoder, backpropagation, catastrophic interference, computational intelligence, condela, convolutional neural network, dropout, hidden layer & antitrypsin, antitryptic, apronitin, chymotrypsin, endopeptidase, enterokinase, meromyosin, mesotrypsin, ovoinhibitor, ovomucin \\
WordNet & neural net, computer architecture, network of neurons, network of nuclei & enzyme, pancreas, protein, polypeptide units \\
TechNet & artificial neural network, machine learning, training data, pattern recognition, hidden layer, layer node, upper hidden layer, neuron, residual activation, automobile overspeed, vehicular safety sensor, time many & Proteolytic enzyme, pepsin trypsin, subtilisin family, bromelain ficin, proteolytic, no amidolytic, enzymatic, amidolytic, protease, trypsin thrombin plasmin, dynorphin targeting moiety, irtx \\
\bottomrule
\end{tabular}

Table 9: The top 10 related terms to ”neural network” and ”trypsin” in WikiLink and the four benchmark tools

5.4 Comparison with benchmark tools

We undertook a retrieval comparison between WikiLink and four benchmark tools. The target terms are “neural network” in computer science and “trypsin” in medical physiology. The aim of this experiment is to test whether our network can return a broad range of related terms that can efficiently stimulate innovation in the design process. Since the number and presentation of retrieval results vary from tool to tool, we selected the top 10 related terms for each tool and present them in Table 9. Specifically, the results of WikiLink and B-link (Shi et al., 2017) were obtained through their “Explore” functions, the result of ConceptNet (Speer et al., 2017) was obtained from its “Related terms” category, and the result of WordNet (Fellbaum, 2010) was obtained from its “Synset” and “Example sentence” functions. According to Table 9, the terms retrieved by WikiLink in the “general” and “specific” modes both support the effectiveness of the “Explore” function. For example, the retrieval results of “neural network” in the “specific” mode are all domain-specific terms related to the components (e.g., “artificial neuron”), functions (e.g., “cognitive model”) and applications (e.g., “deep learning”) of “neural network”. Since the “Explore” function of WikiLink is divided into the “Explore-General” and “Explore-Specific” modes, its results, containing common terms (from the “Explore-General” mode) and technical terms (from the “Explore-Specific” mode), cover a comprehensive range. In contrast, ConceptNet, WordNet and TechNet each have only one retrieval mode, so their retrieved results invariably focus on technical terms in a specific range. Even though B-link offers the same two modes as WikiLink, its results are limited by its data source, which consists of engineering academic papers and design websites. It can be seen that the terms retrieved by B-link tend to have specific meanings. WikiLink instead uses Wikipedia as the data source for its semantic network, which covers information from a wide range of domains. The comparison suggests that WikiLink is more capable of retrieving terms in various domains, which is essential for knowledge discovery in the knowledge-intensive design innovation process.

5.5 A design case

A designer was recruited to conduct a design case and demonstrate the process of applying WikiLink for design innovation. In general, the designer is first given a design question with a “Basic word”, and is then required to apply the “Explore” and “Search path” functions in WikiLink to freely explore the concepts around the “Basic word” that could provide inspiration. By applying the “Explore” function, the designer can discover the knowledge concepts “C1”, “C2”, “C3” around the “Basic word”, as denoted in Figure 7. The “Search path” function, in turn, provides the paths between two terms, e.g., Path_C1C2 between “C1” and “C2”, for combinational creativity (Han et al., 2019). This process can be applied iteratively to discover further knowledge associations and paths such as “C3” and Path_C1C3. The related concepts obtained from WikiLink are then used to form design inspiration links such as “Basic_word-C1, Path_C1C3, Path_C2C3”, and some of them are eventually chosen as the basis for specific design ideas.

Figure 7: The flow of concepts exploration in WikiLink

A real design case was conducted to illustrate how WikiLink can facilitate design innovation. Since the hair dryer is a well-known product whose market is highly homogenized and whose innovation has reached a bottleneck, the designer was asked to generate ideas and provide innovative hair dryer designs. The concept “hair dryer” was chosen as the design query (also known as the basic word) in WikiLink in this case. The designer started with the “Explore” function, freely choosing several different step lengths and switching between the general and specific modes for divergent and convergent thinking. Some screenshot examples are shown in Figure 8. Note that the designer was not restricted to using only “hair dryer” as the query. After the initial exploration in WikiLink, the designer obtained some interesting and inspiring concepts, such as “Entertainment weekly”, “Vacuum cleaner”, “Comb”, “Hair iron”, “Hair gel”, “Hair roller”, “Hot comb”, “Horn” and “Pyramid”. The next step was to apply the “Search Path” function by freely querying the paths between two concepts of interest to the designer. Some retrieval results are shown in Table 10.

Figure 8: The examples of concepts retrieved by ”Explore”

\begin{tabular}{l l p{0.55\textwidth}}
\toprule
Query concepts & Mode & Retrieval results \\
\midrule
hair dryer \& entertainment weekly & Basic & hair dryer $\to$ vacuum cleaner $\to$ automobile $\to$ united states $\to$ entertainment weekly \\
hair dryer \& entertainment weekly & Professional & hair dryer $\to$ hair iron $\to$ natural hair movement $\to$ afro $\to$ tie-dye $\to$ zardozi $\to$ choli $\to$ crop top $\to$ the face (magazine) $\to$ arena (magazine) $\to$ loaded (magazine) $\to$ fhm’s 100 sexiest women (uk) $\to$ fhm $\to$ maxim (magazine) $\to$ people (magazine) $\to$ entertainment weekly \\
hair dryer \& tie-dye & Basic & hair dryer $\to$ vacuum cleaner $\to$ automobile $\to$ textile $\to$ tie-dye \\
\bottomrule
\end{tabular}

Table 10: The paths between the inspiring concepts and ”hair dryer” retrieved by ”Search Path”

Afterwards, the designer continued to explore knowledge concepts as stimuli for design innovation by iteratively using the “Explore” and “Search path” functions. The “Explore” function helps discover the knowledge associations around a single term, while the “Search path” function looks for the associations between two terms. The designer recorded all the interesting and inspiring concepts and formed the “design inspiration links”, as shown in Figure 9, where the base of the link is “Hair Dryer” and the rest of the concepts were obtained from WikiLink using the “Explore” and “Search path” functions. This process was repeated until at least one design inspiration link had been produced and the designer considered the material sufficient to formulate design ideas. Eventually, with the ideas originating from the concepts in the inspiration link, the designer produced the final complete design scheme and drew the corresponding design sketches.

Figure 9: The example of “design inspiration link”

Figures 10 and 11 show the designs produced with the inspiration link “Hair dryer” – “Comb” – “Hairstyle” – “Tie-dye” – “Zardozi”.

Figure 10: The sketch of hair dryer inspired by ”Tie-dye” and ”Zardozi”

Figure 11: The sketch of hair dryer inspired by ”Comb”

In particular, two ideas were generated while the designer worked with WikiLink. The first design, as shown in Figure 10, is an appearance design inspired by “Tie-dye” and “Zardozi”. The existing hair dryers on the market are mostly in a single plain color with a smooth or frosted plastic shell. “Tie-dye”, the characteristic of the Bai nationality, has special patterns which are uneven in depth and rich in layers, and overcomes the rigidity of plain color. “Zardozi”, a traditional Chinese craft, has a delicate feel compared with plastic material. Thus “Tie-dye” and “Zardozi” inspired the designer to integrate traditional Chinese cultural elements into the design of the hair dryer to increase its cultural connotation. The second design (Figure 11) is functional and inspired by “Hairstyle” and “Comb” in the design inspiration link. The idea is to design a replaceable hair dryer nozzle with the features of a comb, so that users can conveniently comb their hair while drying it without having to search for a comb in a hurry.

6 Conclusion

In this research, firstly, a semantic network for design innovation is constructed with Wikipedia as the data source. During the construction, the Wikipedia items are regarded as the nodes, and the interlinks between items on the same page are regarded as the directly connected relationships (edges) between nodes. The evaluation results indicate that the network contains information from a wide range of fields and extends the data beyond technological and scientific sources. Secondly, instead of a single type of weight, a combined weight is introduced for the relationships in the semantic network. The combined weight fuses the statistical relationship and the semantic relationship, which better captures the implicit connections between concepts for design innovation. Four algorithms are further developed to retrieve relevant knowledge concepts and relationships at different levels and in different manners. Thirdly, the constructed semantic network is further developed into a tool called WikiLink, which is then evaluated and demonstrated. Compared with other benchmarks, WikiLink, with its fusion of semantic and statistical weights, balances breadth and depth well in exploring knowledge for design innovation. A design case is conducted to demonstrate how WikiLink can facilitate idea generation. The results indicate that WikiLink can serve as an ideation tool for design innovation.

Although the study provides a functional panel for practical use, it leaves some space for future research. The weight that fuses two types of relationships is one of the main contributions of this research, but it only gives a numerical value and lacks an explicit semantic meaning describing the relationship between two concepts. Thus a semantic description is expected to be added to the edges in WikiLink to provide richer information for design innovation. Besides, the network visualization of WikiLink is currently two-dimensional, which might cause information overload as the retrieved network keeps growing. A three-dimensional network visualization, along with other information visualization techniques, could be a solution and provide a more dynamic way for users to explore information and obtain inspiration more effectively.

References

R. L. Ackoff (1989) From data to wisdom. Journal of applied systems analysis 16 (1), pp. 3–9. Cited by: §2.2.
S. S. Bae, O. Kwon, S. Chandrasegaran, and K. Ma (2020) Spinneret: aiding creative ideation through non-obvious concept associations. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–13. Cited by: §2.3.
B. Berman (2012) 3-d printing: the new industrial revolution. Business horizons 55 (2), pp. 155–162. Cited by: §5.1.
P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov (2017) Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics 5, pp. 135–146. Cited by: §3.3.1.
H. Cheong, W. Li, A. Cheung, A. Nogueira, and F. Iorio (2017) Automated extraction of function knowledge from text. Journal of Mechanical Design 139 (11). Cited by: §2.1.
P. R. Childs (2013) Mechanical design engineering handbook. Butterworth-Heinemann. Cited by: §1, §2.1.
H. Ernst (2003) Patent information for strategic technology management. World patent information 25 (3), pp. 233–242. Cited by: §2.2.
C. Fellbaum (2010) WordNet. In Theory and applications of ontology: computer applications, pp. 231–243. Cited by: §2.3, §5.4.
K. Fu, J. Cagan, K. Kotovsky, and K. Wood (2013) Discovering structure in design databases through functional and surface based mapping. Journal of mechanical Design 135 (3), pp. 031006. Cited by: §2.2.
T. Furukawa, K. Mori, K. Arino, K. Hayashi, and N. Shirakawa (2015) Identifying the evolutionary process of emerging technologies: a chronological network analysis of world wide web conference sessions. Technological Forecasting and Social Change 91, pp. 280–294. Cited by: §2.2.
G. V. Georgiev and D. D. Georgiev (2018) Enhancing user creativity: semantic measures for idea generation. Knowledge-Based Systems 151, pp. 1–15. Cited by: §2.2, §2.3.
H. Geschka (1983) Creativity techniques in product planning and development: a view from west germany. R&D Management 13 (3), pp. 169–183. Cited by: §2.1.
A. K. Goel (1997) Design, analogy, and creativity. IEEE expert 12 (3), pp. 62–70. Cited by: §2.3.
S. R. Gorti, A. Gupta, G. J. Kim, R. D. Sriram, and A. Wong (1998) An object-oriented representation for product and design processes. Computer-aided design 30 (7), pp. 489–501. Cited by: §2.2.
H. H. Hamzah, S. A. Shafiee, A. Abdalla, and B. A. Patel (2018) 3D printable conductive materials for the fabrication of electrochemical sensors: a mini review. Electrochemistry Communications 96, pp. 27–31. Cited by: §5.1.
J. Han, H. Forbes, F. Shi, J. Hao, and D. Schaefer (2020) A data-driven approach for creative concept generation and evaluation. In Proceedings of the Design Society: DESIGN Conference, Vol. 1, pp. 167–176. Cited by: §2.3.
J. Han, M. Hua, F. Shi, and P. R. Childs (2019) A further exploration of the three driven approaches to combinational creativity. In Proceedings of the Design Society: International Conference on Engineering Design, Vol. 1, pp. 2735–2744. Cited by: §5.5.
J. Han, F. Shi, L. Chen, and P. R. Childs (2018a) A computational tool for creative idea generation based on analogical reasoning and ontology. AI EDAM 32 (4), pp. 462–477. Cited by: §2.1.
J. Han, F. Shi, L. Chen, and P. R. Childs (2018b) The combinator–a computer-based tool for creative idea generation based on a simulation approach. Design Science 4. Cited by: §2.1, §2.3.
J. Hao, Y. Yan, L. Gong, G. Wang, and J. Lin (2014) Knowledge map-based method for domain knowledge browsing. Decision Support Systems 61, pp. 106–114. Cited by: §1.
Y. He, B. Camburn, H. Liu, J. Luo, M. Yang, and K. Wood (2019a) Mining and representing the concept space of existing ideas for directed ideation. Journal of Mechanical Design 141 (12). Cited by: §2.2.
Y. He, B. Camburn, H. Liu, J. Luo, M. Yang, and K. Wood (2019b) Mining and representing the concept space of existing ideas for directed ideation. Journal of Mechanical Design 141 (12). Cited by: §2.3.
J. Hey, J. Linsey, A. M. Agogino, and K. L. Wood (2008) Analogies and metaphors in creative design. International Journal of Engineering Education 24 (2), pp. 283. Cited by: §3.3.1.
T. J. Howard, S. J. Culley, and E. Dekoninck (2008) Describing the creative design process by the integration of engineering design and cognitive psychology literature. Design studies 29 (2), pp. 160–180. Cited by: §2.1.
A. Ivanov and D. Cyr (2014) Satisfaction with outcome and process from web-based meetings for idea generation and selection: the roles of instrumentality, enjoyment, and interface design. Telematics and Informatics 31 (4), pp. 543–558. Cited by: §2.1.
N. R. Johnson (1992) Metaphor and design. Studies in Art Education 33 (3), pp. 144–153. Cited by: §2.3.
A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, and T. Mikolov (2016) FastText.zip: compressing text classification models. arXiv preprint arXiv:1612.03651. Cited by: §3.3.1.
H. Kim and K. Kim (2012) Causality-based function network for identifying technological analogy. Expert Systems with Applications 39 (12), pp. 10607–10619. Cited by: §2.3.
H. Kwon, Y. Park, and Y. Geum (2018a) Toward data-driven idea generation: application of wikipedia to morphological analysis. Technological Forecasting and Social Change 132, pp. 56–80. Cited by: §2.1.
H. Kwon, Y. Park, and Y. Geum (2018b) Toward data-driven idea generation: application of wikipedia to morphological analysis. Technological Forecasting and Social Change 132, pp. 56–80. Cited by: §2.2.
X. Li, Q. Xie, T. Daim, and L. Huang (2019) Forecasting technology trends using text mining of the gaps between science and technology: the case of perovskite solar cell technology. Technological Forecasting and Social Change 146, pp. 432–449. Cited by: §2.2.
J. Linsey, A. Markman, and K. Wood (2012) Design by analogy: a study of the WordTree method for problem re-representation. Cited by: §3.3.1.
Y. Liu, A. Chakrabarti, and T. Bligh (2003) Towards an ‘ideal’ approach for concept generation. Design Studies 24 (4), pp. 341–355. Cited by: §2.1.
J. Luo, S. Sarica, and K. L. Wood (2021) Guiding data-driven design ideation by knowledge distance. Knowledge-Based Systems 218, pp. 106873. Cited by: §2.1.
T. McCaffrey and L. Spector (2018) An approach to human–machine collaboration in innovation. AI EDAM 32 (1), pp. 1–15. Cited by: §2.2.
D. Munoz and C. S. Tucker (2016) Modeling the semantic structure of textually derived learning content and its impact on recipients’ response states. Journal of Mechanical Design 138 (4), pp. 042001. Cited by: §2.2.
A. F. Osborn (1953) Applied imagination. Cited by: §2.1.
Y. Rezgui, S. Boddy, M. Wetherill, and G. Cooper (2011) Past, present and future of information and knowledge sharing in the construction industry: towards semantic service-based e-construction? Computer-Aided Design 43 (5), pp. 502–515. Cited by: §2.2.
S. Sarica, J. Luo, and K. L. Wood (2020) TechNet: technology semantic network based on patent data. Expert Systems with Applications 142, pp. 112995. Cited by: §2.2.
S. Sarica and J. Luo (2021) Design knowledge representation with technology semantic network. Proceedings of the Design Society 1, pp. 1043–1052. Cited by: §2.3.
S. Sarica, B. Song, E. Low, and J. Luo (2019a) Engineering knowledge graph for keyword discovery in patent search. In Proceedings of the Design Society: International Conference on Engineering Design, Vol. 1, pp. 2249–2258. Cited by: §2.3.
S. Sarica, B. Song, J. Luo, and K. L. Wood (2021) Idea generation with technology semantic network. AI EDAM, pp. 1–19. Cited by: §2.3.
S. Sarica, B. Song, J. Luo, and K. Wood (2019b) Technology knowledge graph for design exploration: application to designing the future of flying cars. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Vol. 59179, pp. V001T02A028. Cited by: §2.3.
F. Shi, L. Chen, J. Han, and P. Childs (2017) A data-driven text mining and semantic network analysis for design information retrieval. Journal of Mechanical Design 139 (11). Cited by: §2.1, §2.2, §2.3, §3.3.2, §3.3, §5.4.
N. Shibata, Y. Kajikawa, Y. Takeda, and K. Matsushima (2008) Detecting emerging research fronts based on topological measures in citation networks of scientific publications. Technovation 28 (11), pp. 758–775. Cited by: §2.2.
J. F. Sowa (1987) Semantic networks. Cited by: §2.2, §2.3.
R. Speer, J. Chin, and C. Havasi (2017) ConceptNet 5.5: an open multilingual graph of general knowledge. In Thirty-First AAAI Conference on Artificial Intelligence. Cited by: §2.3, §5.4.
The Design Council (2017) What is the framework for innovation? Design Council’s evolved double diamond. Cited by: §3.3.2.
A. B. VanGundy (1988) Techniques of structured problem solving. Springer. Cited by: §2.1.

[Figure: "3d printing" and "fused deposition modeling"; labels: General, Specific]


[Figure panels: (a) The "Explore-Specific" results for "metaverse"; (b) The "Search Path-Professional" results between "metaverse" and "shopping"; (c) The "Search Path-Professional" results between "metaverse" and "meeting"]


[Figure panels: (a) "Explore-General" results with one step for "hair dryer"; (b) "Explore-General" results with two steps for "hair dryer"; (c) "Explore-Specific" results with two steps for "hair dryer"]