sentence_similarity.py is a basic script which loads a BERT model and the STS-B dataset.
### Masking
To recreate our masking experiments, import `probing` in SBERT_Model.py.
When loading the dataset, call one of the methods in probing.py (e.g. `mask_token("NOUN")`) and assign the result to the dataset variable.
POS-sensitive methods require one string argument; POS-non-sensitive methods should be called without any arguments.
The string should be a valid POS like "NOUN" or "ADJ".
The methods will return the changed dataset.
Example:
`dataset = probing.mask_token("VERB")`
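To make the masking idea concrete, here is a minimal self-contained sketch of what a POS-sensitive masking method does. This is an illustrative toy, not the implementation in probing.py: it assumes sentences come as lists of `(token, POS-tag)` pairs, and it takes the dataset explicitly as a parameter (the real methods load it internally).

```python
# Toy illustration of POS-sensitive vs. POS-non-sensitive masking.
# Assumed input format: each sentence is a list of (token, POS-tag) pairs,
# as a POS tagger would produce. Not the actual probing.py implementation.

def mask_token(dataset, pos=None):
    """Replace each token whose POS tag matches `pos` with [MASK].
    With pos=None, every token is masked (the POS-non-sensitive variant)."""
    masked = []
    for sentence in dataset:
        masked.append([
            ("[MASK]", tag) if pos is None or tag == pos else (token, tag)
            for token, tag in sentence
        ])
    return masked

dataset = [[("The", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]]
print(mask_token(dataset, "VERB"))
# [[('The', 'DET'), ('cat', 'NOUN'), ('[MASK]', 'VERB')]]
```

The returned dataset has the same shape as the input, so downstream code that iterates over sentences is unaffected by the masking step.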
### Sentence length
sentence_length.py tests how sentence length influences how well similarity can be embedded. You can run the script as-is, without any changes or parameters. It prints statistics on sentence lengths along with the performance evaluation on data sets with different sentence lengths.
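The length analysis can be sketched in a few lines: bucket sentences by token count, then report per-bucket statistics. The bucket boundaries below are illustrative assumptions, not the ones sentence_length.py actually uses:

```python
# Minimal sketch of a sentence-length analysis: group sentences into
# short/medium/long buckets by token count and summarise each bucket.
# Boundaries are illustrative, not those used by sentence_length.py.
from statistics import mean

def length_stats(sentences, boundaries=(10, 20)):
    """Return, per bucket, the sentence count and mean length in tokens."""
    buckets = {"short": [], "medium": [], "long": []}
    for s in sentences:
        n = len(s.split())
        if n <= boundaries[0]:
            buckets["short"].append(n)
        elif n <= boundaries[1]:
            buckets["medium"].append(n)
        else:
            buckets["long"].append(n)
    return {name: (len(v), mean(v) if v else 0.0) for name, v in buckets.items()}

print(length_stats(["one two three", "a " * 15, "word " * 25]))
# {'short': (1, 3), 'medium': (1, 15), 'long': (1, 25)}
```

Evaluating the similarity metric separately on each bucket then shows whether embedding quality degrades as sentences grow longer.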