WiC: The Word-in-Context Dataset (English)

A reliable benchmark for the evaluation of context-sensitive word embeddings

(New!) [XL-WiC] - WiC in 12 other languages!

Depending on its context, an ambiguous word can refer to multiple, potentially unrelated, meanings. Mainstream static word embeddings, such as Word2vec and GloVe, are unable to reflect this dynamic semantic nature. Contextualised word embeddings are an attempt at addressing this limitation by computing dynamic representations for words which can adapt based on context.

A system's task on the WiC dataset is to identify the intended meaning of words. WiC is framed as a binary classification task. Each instance in WiC has a target word w, either a verb or a noun, for which two contexts are provided. Each of these contexts triggers a specific meaning of w. The task is to decide whether the occurrences of w in the two contexts correspond to the same meaning or not. The dataset can therefore also be viewed as a practical application of Word Sense Disambiguation.
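The task framing above can be sketched in a few lines of Python. This is only an illustration, not the official data format or scorer: the class name `WiCInstance`, its field names, and the "N"/"V" part-of-speech encoding are assumptions made for this sketch. The two instances are taken from the examples table further down this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WiCInstance:
    """One WiC item (hypothetical layout, not the official file format)."""
    target: str         # the target word w
    pos: str            # "N" for noun, "V" for verb (assumed encoding)
    context1: str       # first context containing w
    context2: str       # second context containing w
    same_meaning: bool  # gold label: T -> True, F -> False


def accuracy(predictions, instances):
    """Fraction of instances whose predicted label matches the gold label."""
    if len(predictions) != len(instances):
        raise ValueError("need exactly one prediction per instance")
    correct = sum(p == inst.same_meaning for p, inst in zip(predictions, instances))
    return correct / len(instances)


examples = [
    WiCInstance("bed", "N",
                "There's a lot of trash on the bed of the river",
                "I keep a glass of water next to my bed when I sleep",
                False),
    WiCInstance("beat", "V",
                "We beat the competition",
                "Agassi beat Becker in the tennis championship",
                True),
]

# A trivial system that always answers "same meaning".
always_same = [True for _ in examples]
score = accuracy(always_same, examples)  # 0.5 on these two instances
```

Any real system would replace `always_same` with one boolean prediction per instance; the scoring step stays the same.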

WiC features multiple interesting characteristics:

  • It is suitable for evaluating a wide range of applications, including contextualised word and sense representation and Word Sense Disambiguation;
  • It is framed as a binary classification dataset in which, unlike Stanford Contextual Word Similarity (SCWS), identical words are paired with each other (in different contexts); hence, a context-insensitive word embedding model would perform similarly to a random baseline;
  • It is constructed using high quality annotations curated by experts.
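The second point in the list above can be illustrated with a toy example. The sketch below assumes a static lookup table with made-up vector values and a hypothetical cosine-threshold classifier; it is not how any system on the leaderboard works. Because a static model returns the same vector for the target word in both contexts, the similarity is always 1 and the prediction is the same for every instance, whatever the gold label.

```python
import math

# Toy static lookup table: one fixed vector per word type (values are made up).
STATIC_VECTORS = {
    "bed": [0.2, 0.7, 0.1],
    "beat": [0.9, 0.1, 0.4],
}


def static_embedding(target, context):
    """Return the target's vector; the context is ignored by construction."""
    return STATIC_VECTORS[target]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def same_meaning(target, context1, context2, threshold=0.5):
    """Hypothetical classifier: threshold the cosine of the two target vectors."""
    sim = cosine(static_embedding(target, context1),
                 static_embedding(target, context2))
    return sim >= threshold


# Gold label F (different meanings of "bed").
pred_bed = same_meaning(
    "bed",
    "There's a lot of trash on the bed of the river",
    "I keep a glass of water next to my bed when I sleep",
)
# Gold label T (same meaning of "beat").
pred_beat = same_meaning(
    "beat",
    "We beat the competition",
    "Agassi beat Becker in the tennis championship",
)
# Both predictions are identical: the model cannot separate T from F instances.
```

Changing the threshold only flips every prediction at once, so no threshold lets this model distinguish the two classes.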


Participate in WiC's CodaLab competition: submit your results on the test set and see where you stand on the leaderboard!
Link: WiC CodaLab Competition

WiC is featured as a part of the SuperGLUE benchmark.

WiC was also used for a shared task at the SemDeep-5 IJCAI workshop.

Dataset details

Please see the following paper:

Examples from the dataset

Label | Target | Context-1 | Context-2
F | bed | There's a lot of trash on the bed of the river | I keep a glass of water next to my bed when I sleep
F | land | The pilot managed to land the airplane safely | The enemy landed several of our aircrafts
F | justify | Justify the margins | The end justifies the means
T | beat | We beat the competition | Agassi beat Becker in the tennis championship
T | air | Air pollution | Open a window and let in some air
T | window | The expanded window will give us time to catch the thieves | You have a two-hour window of clear weather to finish working on the lawn


Sentence-level contextualised embeddings | Implementation | Accuracy %
SenseBERT-large | Levine et al (2019) | 72.1
KnowBERT-W+W | Peters et al (2019) | 70.9
RoBERTa | Liu et al (2019) | 69.9
BERT-large | Wang et al (2019) | 69.6
Ensemble | Gari Soler et al (2019) | 66.7
ELMo-weighted | Ansell et al (2019) | 61.2

Word-level contextualised embeddings | Implementation | Accuracy %
WSD | Loureiro and Jorge (2019) | 67.7
BERT-large | WiC's paper | 65.5
Context2vec | WiC's paper | 59.3
ELMo | WiC's paper | 57.7

Sense representations | Implementation | Accuracy %
LessLex | Colla et al (2020) | 59.2
DeConf | WiC's paper | 58.7
SW2V | WiC's paper | 58.1
JBT | WiC's paper | 53.6

Sentence-level baselines | Implementation | Accuracy %
Sentence Bag-of-words | WiC's paper | 58.7
Sentence LSTM | WiC's paper | 53.1

Random baseline | | 50.0

† Use external knowledge resources
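The 50.0 figure for the random baseline follows from the binary framing alone: a uniform coin-flip guess is expected to be correct half of the time, whatever the proportion of T and F labels. A minimal sketch of that arithmetic, with a seeded simulation on deliberately imbalanced toy labels (the function names and the toy label set are assumptions made for illustration, not part of the dataset):

```python
import random


def expected_random_accuracy(p_true):
    """Expected accuracy of a uniform random guesser when a fraction p_true of labels is T.

    The guess matches a T instance with probability 0.5 and an F instance
    with probability 0.5, so the result is 0.5 for any p_true.
    """
    return p_true * 0.5 + (1.0 - p_true) * 0.5


def simulate_random_accuracy(gold, seed=0):
    """Score one run of seeded coin-flip guesses against the gold labels."""
    rng = random.Random(seed)
    guesses = [rng.random() < 0.5 for _ in gold]
    return sum(g == y for g, y in zip(guesses, gold)) / len(gold)


# Deliberately imbalanced toy labels (30% T, 70% F), not the WiC label distribution.
gold_labels = [True] * 3000 + [False] * 7000
simulated = simulate_random_accuracy(gold_labels)  # close to 0.5, up to sampling noise
```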

Performance upper bound

Accuracy %
Human-level performance 80.0


This dataset is licensed under a Creative Commons Attribution-NonCommercial 4.0 License.