Commit 8c030f7e authored by pirapakaran

Update project_report.md

parent ef276e5d
+3 −2
@@ -43,13 +43,14 @@ After the human needs have been successfully assigned for all essays, we evaluat
  <img src="pictures/Recall_test.png" />
</p>

<p align="center">
  <img src="pictures/Precision_test.png" />
</p>

> The system reached a macro-averaged precision of roughly 19% for Maslow categories but only 5.4% for Reiss categories. Once again, it struggles far more to pick the correct Reiss motive: the motives are fine-grained, and having to choose from a much wider range of classes hurts performance. With the far smaller set of Maslow categories, precision rises considerably. The per-class scores are also worth a quick look. As expected, classes that are barely present in the gold data produce poor results, because the system sees too few examples to generalize; "physiological needs", for instance, scored 0% precision. Classes that occur more frequently in the gold data, such as "spiritual growth" and "stability", score much better. This confirms our conjecture that frequent classes score notably higher than classes that barely occur in the gold data. Classes in the middle of the frequency range behave as expected, scoring neither very high nor dramatically low.
>> Although these results may again seem unsatisfying and fall short of what we had hoped for, they are still somewhat acceptable given the difficult nature of our texts and the approach we had to take. The problems caused by the lack of gold data for certain classes (e.g. "physiological needs") will have to be mitigated either way. A more balanced distribution of texts could possibly solve this, but the content of the corpus in use makes such balancing difficult, as was the case here.
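
To illustrate why a single rare class can pull the macro average down, here is a minimal sketch of how macro-averaged precision can be computed. It is not the project's evaluation code: the function names and the toy labels are invented for illustration, each item is assumed to carry exactly one label, and a class that is never predicted is scored as 0.0 by assumption.

```python
from collections import Counter


def per_class_precision(gold, predicted):
    """Return {label: precision} for every label seen in gold or predicted."""
    true_pos = Counter()    # correct predictions per predicted label
    pred_count = Counter()  # total predictions per label
    for g, p in zip(gold, predicted):
        pred_count[p] += 1
        if g == p:
            true_pos[p] += 1
    labels = set(gold) | set(predicted)
    # Assumption: a class that is never predicted gets precision 0.0.
    return {
        label: (true_pos[label] / pred_count[label] if pred_count[label] else 0.0)
        for label in labels
    }


def macro_precision(gold, predicted):
    """Unweighted mean of the per-class precisions."""
    scores = per_class_precision(gold, predicted)
    return sum(scores.values()) / len(scores)


# Invented toy data, not taken from the report's corpus.
gold = ["stability", "stability", "stability",
        "spiritual growth", "spiritual growth", "physiological needs"]
predicted = ["stability", "stability", "spiritual growth",
             "spiritual growth", "stability", "stability"]

print(per_class_precision(gold, predicted))
print(macro_precision(gold, predicted))
```

In this toy example, half of all predictions are correct, yet the macro average is only one third, because the never-predicted class "physiological needs" contributes a precision of 0 with the same weight as the frequent classes.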

<p align="center">
  <img src="pictures/Matrix_test_1.png" />
</p>