For fine and coarse grained relations this split is generated:
<center>
set | sentences (fine) | percentage (fine) | sentences (coarse) | percentage (coarse)
---- | ---- | ---- | ---- | ----
train | 56884 | 51% | 39222 | 51%
validation | 16685 | 15% | 11500 | 15%
test | 37817 | 34% | 25806 | 34%
</center>
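A split like the one in the table can be reproduced with a simple shuffled-index partition. The project's actual splitting code is not shown in this section, so the function below is a minimal stand-in; the fractions mirror the table (51% / 15% / 34%) and the seed is an arbitrary assumption.

```python
import random

def split_dataset(n_sentences, train_frac=0.51, val_frac=0.15, seed=42):
    """Partition sentence indices into train/validation/test sets.

    Shuffles all indices once, then cuts the list at the requested
    fractions; whatever remains after train and validation becomes test.
    """
    indices = list(range(n_sentences))
    random.Random(seed).shuffle(indices)
    n_train = int(n_sentences * train_frac)
    n_val = int(n_sentences * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

# Total sentence count taken from the fine grained table above.
train, val, test = split_dataset(56884 + 16685 + 37817)
print(len(train), len(val), len(test))
```

The three parts are disjoint by construction, and rounding differences only affect the test split, which absorbs the remainder.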
Variations:
- **epochs**: 1 to 4
<br>
Visualized results for fine grained relations:
<center>
</center>
For each batch size a similar behavior can be seen: the training loss is highest after the first epoch, drops below 0.1 in every scenario after the second epoch, and keeps decreasing with further epochs. The validation loss, in contrast, is already quite high after one epoch, stays roughly on the same level between the first and second epoch, and increases afterwards. This is a sign of overfitting, meaning that the trained model corresponds too closely to the training data. In our case the increase in validation loss is not drastic, but it shows that two epochs are sufficient for training our model. <br>
The accuracy the model achieves stays more or less the same over all epochs and for every batch size. <br>
Comparing the three plots, we get the best results, judging by the combination of validation loss (1.89) and accuracy (0.66), with a **batch size of 32** and a **learning rate of 3e-05**. <br>
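The stopping rule implied above — keep the epoch count at which the validation loss stops improving, even though the training loss keeps falling — can be sketched as a small selection function. The loss values in the example are hypothetical and only shaped like the curves described (roughly flat between epochs 1 and 2, rising afterwards):

```python
def pick_epochs(val_losses):
    """Return the 1-based epoch with the lowest validation loss.

    Training past this point lowers the training loss further but, as
    described above, the validation loss starts rising: overfitting.
    """
    best = 1
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < val_losses[best - 1]:
            best = epoch
    return best

# Hypothetical per-epoch validation losses for epochs 1..4:
print(pick_epochs([1.92, 1.89, 2.10, 2.35]))  # → 2
```

With a curve of this shape the rule selects two epochs, matching the conclusion drawn from the plots.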
The same goes for the coarse grained relations. A comparable behavior can be seen in the plot below.
<center>

</center>
## Testing
For the testing task we used models trained with the parameters mentioned above (`learning_rate=3e-5, epochs=1, batch_size=32`) for the fine grained and the coarse grained relations respectively; both reach an accuracy of around 0.66 according to the validation. <br>
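The accuracy figures reported here are plain label-match ratios: the fraction of test sentences whose predicted relation equals the gold relation. A minimal sketch — the relation labels in the example are made up for illustration:

```python
def accuracy(predictions, gold_labels):
    """Fraction of sentences whose predicted relation matches the gold label."""
    assert len(predictions) == len(gold_labels)
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)

# Hypothetical predictions vs. gold labels:
print(accuracy(["cause", "part-of", "cause"], ["cause", "cause", "cause"]))
```

The same function can be applied separately to each test-set variant to compare how robust the model is to the changes described below.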
Our fine-tuned BERT model was tested on several test data sets, each with a slight variation.<br>
Apart from the regular sentences containing the nominal compound we changed the sentences to ones