Research Article

MoCoUTRL: a momentum contrastive framework for unsupervised text representation learning

Article: 2221406 | Received 20 Feb 2023, Accepted 31 May 2023, Published online: 16 Jun 2023

Abstract

This paper presents MoCoUTRL: a Momentum Contrastive Framework for Unsupervised Text Representation Learning. The model improves on recently popular contrastive learning algorithms in natural language processing (NLP) in two respects. First, MoCoUTRL employs multi-granularity semantic contrastive learning objectives, enabling a more comprehensive understanding of the semantic features of samples. Second, MoCoUTRL uses a dynamic dictionary that serves as an approximate ground-truth representation for each token, providing pseudo labels for token-level contrastive learning. MoCoUTRL can turn pre-trained language models (PLMs) and even large language models (LLMs) into plug-and-play semantic feature extractors that can fuel multiple downstream tasks. Experimental results on several publicly available datasets, together with further theoretical analysis, validate the effectiveness and interpretability of the proposed method.
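The abstract gives no implementation details, but the framework's name suggests it builds on MoCo-style momentum contrast. As a rough illustrative sketch only (not the authors' code; all names, dimensions, and hyperparameters below are assumptions), the two core mechanics of such a framework are an exponential-moving-average (EMA) update of a key encoder and a queue-based InfoNCE loss, where the queue plays the role of a dynamic dictionary of negative keys:

    import torch
    import torch.nn.functional as F

    def momentum_update(query_encoder, key_encoder, m=0.999):
        """EMA update: the key encoder's weights slowly trail the
        query encoder's, keeping dictionary keys consistent."""
        for q_param, k_param in zip(query_encoder.parameters(),
                                    key_encoder.parameters()):
            k_param.data = m * k_param.data + (1.0 - m) * q_param.data

    def info_nce_loss(q, k, queue, temperature=0.07):
        """q, k: (batch, dim) embeddings of two views of the same texts;
        queue: (dim, K) dynamic dictionary of negative keys."""
        q = F.normalize(q, dim=1)
        k = F.normalize(k, dim=1)
        # Positive logits: similarity between matching query/key pairs.
        l_pos = torch.einsum("nd,nd->n", q, k).unsqueeze(-1)   # (batch, 1)
        # Negative logits: similarity against the dictionary entries.
        l_neg = torch.einsum("nd,dk->nk", q, queue)            # (batch, K)
        logits = torch.cat([l_pos, l_neg], dim=1) / temperature
        # The positive key sits at index 0 for every query.
        labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
        return F.cross_entropy(logits, labels)

In this reading, the paper's token-level objective would apply a loss of this shape per token, with the dynamic dictionary supplying the pseudo labels; the sentence-level objective would apply it per sequence. The multi-granularity design described in the abstract presumably combines the two.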

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by Defense Industrial Technology Development Program: [Grant Number JCKY2020601B018].