Science Research Management ›› 2024, Vol. 45 ›› Issue (2): 176-188. DOI: 10.19571/j.cnki.1000-2995.2024.02.018


Identification of technology opportunities based on the LDA model and co-occurrence network dynamic analysis

Wang Jinfeng1,2, Zhang Zhixin1, Feng Lijie1,3, Zhang Ke4   

    1. School of Management, Zhengzhou University, Zhengzhou 450001, Henan, China;
    2. China (Shanghai) Institute of FTZ Supply Chain, Shanghai Maritime University, Shanghai 201306, China;
    3. School of Logistics Engineering, Shanghai Maritime University, Shanghai 201306, China;
    4. Information Management School, Zhengzhou University, Zhengzhou 450001, Henan, China
  • Received:2022-05-10 Revised:2023-05-15 Online:2024-02-20 Published:2024-01-23
  • Supported by:
    Innovation Method Fund of China;Joint Funds of the National Natural Science Foundation of China;Shanghai Science and Technology Program

Abstract: The external environment for future technological development is characterized by high uncertainty and increasing technological complexity, so a scientific and efficient analytical approach is crucial for identifying technological innovation opportunities. In the era of data science, scientific knowledge is growing explosively, making it increasingly difficult to evaluate and predict technological trends. Previous studies that rely on qualitative or static analysis are no longer sufficient to identify technology opportunities accurately. To make informed decisions about scientific development policies, mitigate investment risks, and accurately grasp the direction of scientific development, it is necessary to strengthen research on the scientific knowledge network and to mine potential knowledge through the co-occurrence network. Therefore, this paper constructed a path for technology opportunity identification from a dynamic analysis perspective that integrates patent text mining and keyword co-occurrence networks. Firstly, patent data were obtained and preprocessed, and the LDA model was applied to extract technology topics and keywords from the patent data of a specific technology domain. In particular, the LDA model was run in Python using the jieba word segmentation tool and the scikit-learn library. This requires setting the parameters α and β separately and obtaining the optimal number of topics K by calculating the perplexity. Meanwhile, the TF-IDF indicator was introduced for importance analysis. Secondly, the paper constructed the overall co-occurrence matrix of technology keywords and the topic-based co-occurrence matrices with time windows. The overall co-occurrence network of technology keywords and the topic-based keyword co-occurrence sub-networks were then generated from these matrices.
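As a minimal sketch of the co-occurrence matrices described above, the following Python snippet counts keyword pairs over all patents and separately within each time window. It is an illustration under stated assumptions, not the authors' implementation: the function name `cooccurrence_by_window`, the two-year window size, and the toy records are hypothetical, and the paper's actual windowing scheme is not specified in the abstract.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_by_window(records, window_size=2):
    """Count keyword co-occurrences overall and per time window.

    records: list of (year, keywords) tuples, one per patent (assumed layout).
    window_size: number of years per window (an assumption, not from the paper).
    Returns (overall, per_window), where each Counter maps a sorted keyword
    pair to the number of patents in which both keywords appear.
    """
    overall = Counter()
    per_window = defaultdict(Counter)
    start_year = min(year for year, _ in records)
    for year, keywords in records:
        window = (year - start_year) // window_size
        # Count each unordered keyword pair once per patent.
        for pair in combinations(sorted(set(keywords)), 2):
            overall[pair] += 1
            per_window[window][pair] += 1
    return overall, dict(per_window)


# Toy records with invented values, for illustration only.
records = [
    (2018, ["path planning", "sensor"]),
    (2019, ["path planning", "sensor", "lidar"]),
    (2020, ["lidar", "sensor"]),
    (2021, ["lidar", "path planning"]),
]
overall, per_window = cooccurrence_by_window(records, window_size=2)
```

Storing the counts as sparse pair counters rather than dense matrices keeps the sketch short; a dense matrix or a network object could be built from the same counts if needed.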
In particular, this paper systematically investigated and assessed the more important and influential technologies in a specific technology domain by constructing an overall co-occurrence network of technology keywords and calculating the co-occurrence intensity of those keywords. Meanwhile, the evolution of technology topics and keywords over time was further analyzed through the co-occurrence sub-networks. This requires generating a dynamically changing co-occurrence sub-network from the established keyword lexicon and the co-occurrence matrix containing time windows. The keywords and their linkage relationships across years were visualized in the same three-dimensional coordinate system. Thirdly, high-frequency technology keywords were identified based on the overall co-occurrence network, while the evolution of technology topics and keywords was analyzed based on the co-occurrence sub-networks. Technology keywords were then classified into four types: sustained, declining, emerging, and abandoned. Sustained keywords are those with high co-occurrence intensity and a stable upward trend over time; declining keywords are those with high co-occurrence intensity but a decreasing trend over time; emerging keywords are those with low co-occurrence intensity but a stable upward trend over time; and abandoned keywords are those with low co-occurrence intensity and a decreasing trend over time. Of these four types, sustained and emerging keywords were identified as technology opportunities with growth potential. Finally, using unmanned ships as an example, the technology opportunities contained in the power supply, attitude measurement, and positioning and navigation technology topics were identified. For example, sustained technology opportunities such as path planning and sensors and emerging technology opportunities such as lidar and inertial navigation were identified.
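The four-type classification above can be sketched as a simple rule over each keyword's co-occurrence intensity per time window. This is a hedged illustration only: the abstract does not state how "high" intensity or the trend is measured, so the mean-intensity threshold, the use of a least-squares slope sign as the trend, the treatment of a flat trend as not rising, and the function name `classify_keyword` are all assumptions, and the numeric series are invented toy values.

```python
def classify_keyword(intensities, threshold):
    """Classify a keyword from its co-occurrence intensity per time window.

    intensities: intensity values ordered from oldest to newest window
                 (at least two windows are needed to define a trend).
    threshold: mean intensity at or above which a keyword counts as "high"
               (an assumed cut-off, not taken from the paper).
    """
    n = len(intensities)
    if n < 2:
        raise ValueError("need at least two time windows")
    mean_intensity = sum(intensities) / n
    # Sign of the least-squares slope against the window index; the
    # denominator of the slope is always positive, so it is omitted.
    x_mean = (n - 1) / 2
    trend = sum((i - x_mean) * (v - mean_intensity)
                for i, v in enumerate(intensities))
    high = mean_intensity >= threshold
    rising = trend > 0  # assumption: a flat trend is treated as not rising
    if high and rising:
        return "sustained"
    if high:
        return "declining"
    if rising:
        return "emerging"
    return "abandoned"


# Invented toy series with hypothetical keyword names, for illustration only.
toy_series = {
    "keyword_a": [8, 9, 11],
    "keyword_b": [9, 7, 6],
    "keyword_c": [1, 2, 4],
    "keyword_d": [3, 2, 1],
}
labels = {name: classify_keyword(series, threshold=5)
          for name, series in toy_series.items()}
# Keywords labeled "sustained" or "emerging" are the candidate opportunities.
opportunities = [name for name, label in labels.items()
                 if label in ("sustained", "emerging")]
```

A stricter notion of a "stable" upward trend, for example requiring every window-to-window change to be non-negative, could replace the slope sign without changing the overall structure.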
This paper not only addressed the limitations of static co-occurrence networks in revealing the dynamic evolution of technology domains, but also avoided overlooking or misjudging some technology opportunities in the innovation process. The results provide a useful decision-making reference for companies seeking to identify technological innovation opportunities efficiently.

Key words: technology opportunity identification, LDA model, co-occurrence network, dynamic analysis, unmanned ship