Biomedical Word Sense Disambiguation with Contextualized Representation Learning


Representation learning is an important component in solving most Natural Language Processing (NLP) problems, including Word Sense Disambiguation (WSD). The WSD task is to find the best meaning in a knowledge base for a word with multiple meanings (an ambiguous word). WSD methods choose this meaning based on the context, i.e., the words around the ambiguous word in the input text document. Thus, word representations may improve the effectiveness of disambiguation models if they carry useful information from the context and the knowledge base. A limitation of most current representation learning approaches is that they are trained on general English text and are not domain-specific. In this paper, we present a novel sense representation method for the biomedical domain that is aware of both the context and the knowledge base. The novelty of our representation is the integration of the knowledge base and the context. This representation lies in a space comparable to that of contextualized word vectors, thus allowing a word occurrence to be easily linked to its meaning by applying a simple nearest neighbor approach. Comparing our approach with state-of-the-art methods shows the effectiveness of our method in terms of text coherence.
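As an illustration of the nearest neighbor linking step mentioned above, the following is a minimal sketch and not the method evaluated in this paper. It assumes that sense vectors and the contextualized vector of the ambiguous word already live in the same space, and it assumes cosine similarity as the comparison metric, which the abstract does not specify. The sense labels, the toy three-dimensional vectors, and the function names `cosine` and `nearest_sense` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_sense(context_vector, sense_vectors):
    """Return the sense label whose vector is most similar to the context vector."""
    return max(sense_vectors, key=lambda s: cosine(context_vector, sense_vectors[s]))


# Hypothetical sense inventory for the ambiguous word "cold".
# The vectors are made up for illustration; real ones would come from
# the sense representation model and a contextualized encoder.
sense_vectors = {
    "cold_temperature": [0.9, 0.1, 0.0],
    "common_cold": [0.1, 0.9, 0.2],
}

# Hypothetical contextualized vector of "cold" in a sentence about a viral infection.
context_vector = [0.2, 0.8, 0.1]

predicted = nearest_sense(context_vector, sense_vectors)
print(predicted)
```

Because both kinds of vectors are assumed to be directly comparable, disambiguation in this sketch reduces to a single similarity lookup over the candidate senses of the word, with no additional classifier to train.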