Employing word-embedding for schema matching in standard lifecycle management
Date
2024-03-01
Citation of Original Publication
Oh, Hakju, Boonserm (Serm) Kulvatunyou, Albert Jones, and Tim Finin. “Employing Word-Embedding for Schema Matching in Standard Lifecycle Management.” Journal of Industrial Information Integration 38 (March 1, 2024): 100547. https://doi.org/10.1016/j.jii.2023.100547.
Rights
This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
Public Domain
Abstract
Today, businesses rely on numerous information systems to achieve their production goals and improve their global competitiveness. Semantically integrating those systems is essential for businesses to achieve both. To do so, businesses must rely on standards, the most important of which are data exchange standards (DES). DES focus on the technical and business semantics that are needed to deliver quality and timely products and services. Consequently, the ability of businesses to quickly use and adapt DES to their innovations and processes is crucial. Traditionally, information standards are managed and used 1) in a platform-specific form and 2) usually with standalone and file-based applications. These traditional approaches no longer meet today's business and information agility needs. For example, businesses now must deal with companies and suppliers that use heterogeneous syntaxes for their information, each optimized for an individual but different objective. Moreover, file-based standards and the usage specifications derived from them cause inconsistencies, since there is neither a single standard format for each usage specification nor a single source of truth for all of them. As the number and types of information systems grow, developing, maintaining, reviewing, and approving standards and their derived usage specifications are becoming more difficult and time consuming. Each file-based usage specification is typically based on a syntax different from that of the standard. As a result, each usage specification must be manually updated as the standard evolves; this can cause significant delays and costs in adopting new and better versions of the standard. The National Institute of Standards and Technology (NIST), in collaboration with the Open Applications Group, Inc. (OAGi), has developed a web-based standard lifecycle management tool called SCORE to address these problems.
The objective of this paper is to introduce the SCORE tool and discuss one particular piece of its functionality, in which a word-embedding technique is employed along with other schema-matching approaches. Together, these techniques can assist standard users in updating a usage specification after the release of a new version of a standard, leading to faster adaptation of DES to new processes.
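As a rough illustration of how word embeddings can support schema matching, the sketch below scores candidate component names from a new standard version against a name used in an existing usage specification. It is not the SCORE implementation: the abstract does not describe the embedding model or matching pipeline, so the toy vectors, the example component names, and the helper functions `tokenize`, `embed`, `cosine`, and `best_match` are all assumptions made for this example.

```python
import math
import re

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# A real system would load vectors from a trained embedding model.
TOY_VECTORS = {
    "purchase": [0.9, 0.1, 0.0],
    "order": [0.8, 0.2, 0.1],
    "date": [0.0, 0.9, 0.1],
    "time": [0.1, 0.8, 0.2],
    "party": [0.1, 0.0, 0.9],
    "supplier": [0.2, 0.1, 0.8],
}


def tokenize(name):
    """Split a CamelCase component name into lowercase words."""
    return [w.lower() for w in re.findall(r"[A-Z][a-z]*|[a-z]+", name)]


def embed(name):
    """Average the vectors of the known words in a name; None if none are known."""
    vecs = [TOY_VECTORS[w] for w in tokenize(name) if w in TOY_VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(old_name, new_names):
    """Return (score, name) for the most similar candidate, or None."""
    source = embed(old_name)
    if source is None:
        return None
    scored = []
    for candidate in new_names:
        target = embed(candidate)
        if target is not None:
            scored.append((cosine(source, target), candidate))
    return max(scored) if scored else None


if __name__ == "__main__":
    # Hypothetical names: one from an old usage specification,
    # two candidates from a newer release of the standard.
    print(best_match("OrderDate", ["SupplierParty", "PurchaseOrderDateTime"]))
```

In this toy setup, `OrderDate` is matched to `PurchaseOrderDateTime` rather than `SupplierParty` because the averaged vectors point in nearly the same direction. A name-based score like this would typically be only one signal, combined with other schema-matching evidence such as structure or data types.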