ES2Vec: Earth Science Metadata Keyword Assignment using Domain-Specific Word Embeddings

Title: ES2Vec: Earth Science Metadata Keyword Assignment using Domain-Specific Word Embeddings
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Ramasubramanian, Muthukumaran; Muhammad, Hassan; Gurung, Iksha; Maskey, Manil; Ramachandran, Rahul
Conference Name: 2020 SoutheastCon
Keywords: classifier, compositionality, Geoscience, Keyword Classification, machine learning, metadata, Metadata Discovery Problem, natural language processing, Neural Network, pubcrawl, resilience, Resiliency, Scalability, Semantics, Task Analysis, Tools, user interfaces, Word2Vec
Abstract: Earth science metadata keyword assignment is a challenging problem. Dataset curators select appropriate keywords from the Global Change Master Directory (GCMD) set of keywords. These keywords are an integral part of the search and discovery of these datasets; hence, the selection of keywords is crucial to increasing dataset discoverability. Using machine learning techniques, we provide users with automated keyword suggestions to complement manual selection. We trained a machine learning model that leverages the semantic embedding ability of Word2Vec models to process abstracts and suggest relevant keywords. We also describe a user interface tool that we built to assist data curators in assigning such keywords.
DOI: 10.1109/SoutheastCon44009.2020.9249743
Citation Key: ramasubramanian_es2vec_2020
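
Illustrative sketch (not from the paper): the abstract describes using Word2Vec embeddings to process dataset abstracts and suggest keywords. The record does not give the model's details, so the example below is only a rough stand-in under stated assumptions. It swaps the trained model for a simple cosine-similarity ranking over averaged word vectors. The tiny `TOY_VECTORS` table, the candidate labels, and the function names `embed`, `cosine`, and `suggest_keywords` are all hypothetical. A real system would load embeddings trained on an Earth science corpus instead.

```python
import math

# Toy 3-dimensional vectors standing in for trained Word2Vec embeddings.
# ASSUMPTION: these values are invented purely for illustration.
TOY_VECTORS = {
    "ocean": [0.9, 0.1, 0.0],
    "sea": [0.8, 0.2, 0.0],
    "temperature": [0.5, 0.5, 0.1],
    "rainfall": [0.1, 0.9, 0.1],
    "precipitation": [0.1, 0.9, 0.0],
    "forest": [0.0, 0.1, 0.9],
    "vegetation": [0.1, 0.1, 0.9],
}


def embed(text):
    """Average the vectors of all known words; return None if none are known."""
    words = [w.strip(".,;:").lower() for w in text.split()]
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_keywords(abstract, keywords, top_k=2):
    """Rank candidate keywords by similarity to the abstract's mean vector."""
    doc_vector = embed(abstract)
    if doc_vector is None:
        return []
    scored = []
    for keyword in keywords:
        keyword_vector = embed(keyword)
        if keyword_vector is not None:
            scored.append((cosine(doc_vector, keyword_vector), keyword))
    scored.sort(reverse=True)
    return [keyword for _, keyword in scored[:top_k]]


# Illustrative candidate labels, not an actual extract of the GCMD keyword set.
CANDIDATES = ["OCEAN TEMPERATURE", "PRECIPITATION", "VEGETATION"]
suggestions = suggest_keywords(
    "Monthly sea temperature measurements over the ocean.", CANDIDATES, top_k=1
)
print(suggestions)
```

In this sketch an abstract with no known words yields an empty list, which leaves the choice entirely to the curator and matches the abstract's framing of suggestions as a complement to manual selection.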