Explain plots in README (22ccde5f) · Commits · burkhardt / Lsem-RC in nominal compounds

probing/README.md

+3 −3

		@@ -215,11 +215,11 @@ Clustering the verbs for a specific relation. In `all_guesses_cluster` all resul

		### Baselines

		The baselines used were a majority baseline, a uniform random prediction, and a random prediction weighted by category frequency. The majority baseline was implemented as the verb that occurred most frequently in the gold standard: affect (relation: OBJECTIVE). The random prediction was drawn from the set of gold verbs, implemented both as a draw from a uniform distribution and as a draw weighted by occurrence in the gold standard. It was drawn only from the set of gold verbs, not from the set of all English verbs or of the most frequent English verbs, let alone from all English words. Thus, these random baselines are still far above what a completely naive, semantics-illiterate prediction by BERT would achieve.
		The baselines used were a majority baseline, a uniform random prediction, and a random prediction weighted by category frequency. The majority baseline was implemented as the verb that occurred most frequently in the gold standard: affect for the fine-grained relations (OBJECTIVE) and done for the coarse-grained relations (CAUSAL). The random prediction was drawn from the set of gold verbs, implemented both as a draw from a uniform distribution and as a draw weighted by occurrence in the gold standard. It was drawn only from the set of gold verbs, not from the set of all English verbs or of the most frequent English verbs, let alone from all English words. Thus, these random baselines are still far above what a completely naive, semantics-illiterate prediction by BERT would achieve.
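
The three baselines described above can be sketched as follows. This is only a minimal illustration and not the repository's implementation: the function names, the toy list `gold_verbs`, and the accuracy metric are assumptions, and only the verbs affect and done are taken from the text.

```python
import random
from collections import Counter


def majority_baseline(gold):
    """Predict the single most frequent gold verb for every item."""
    most_common_verb, _ = Counter(gold).most_common(1)[0]
    return [most_common_verb] * len(gold)


def uniform_random_baseline(gold, rng):
    """Draw each prediction uniformly from the set of distinct gold verbs."""
    verbs = sorted(set(gold))
    return [rng.choice(verbs) for _ in gold]


def stratified_random_baseline(gold, rng):
    """Draw each prediction with probability proportional to gold frequency."""
    counts = Counter(gold)
    verbs = sorted(counts)
    weights = [counts[v] for v in verbs]
    return rng.choices(verbs, weights=weights, k=len(gold))


def accuracy(predicted, gold):
    """Fraction of items whose predicted verb equals the gold verb."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy gold standard; the real gold verbs live in the repository.
gold_verbs = ["affect", "affect", "affect", "done", "done", "contain", "use", "make"]

rng = random.Random(0)  # fixed seed so the random baselines are reproducible
majority_acc = accuracy(majority_baseline(gold_verbs), gold_verbs)
uniform_preds = uniform_random_baseline(gold_verbs, rng)
stratified_preds = stratified_random_baseline(gold_verbs, rng)

print("majority:", majority_acc)
print("uniform:", accuracy(uniform_preds, gold_verbs))
print("stratified:", accuracy(stratified_preds, gold_verbs))
```

Because both random baselines draw only from the gold verbs, every prediction is at least a verb that is correct for some item, which is why they are stronger than a draw from all English verbs would be.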

		#### Fine Grained

		The results below indicate that BERT's prediction for the fine-grained relations is significantly better than both a uniform random prediction and a stratified random prediction. However, it is below the majority baseline, even though the most frequent category accounts for only 6.6% of the data.
		The results below indicate that BERT's prediction for the fine-grained relations is significantly better than both a uniform random prediction and a stratified random prediction. However, it is below the majority baseline, even though OBJECTIVE, the most frequent relation (represented by the gold verb affect), accounts for only 6.6% of the data.

		<div align="center">

		@@ -235,7 +235,7 @@ The results below indicate that BERT's prediction for the fine-grained relations


		#### Coarse Grained
		Consistent with the results of the significance tests for the fine-grained relations, the coarse-grained prediction is significantly better than the two randomized baselines but is below the majority baseline, even though the most frequent relation, CAUSAL (gold verb x), accounts for only 12.6% of the data.
		Consistent with the results of the significance tests for the fine-grained relations, the coarse-grained prediction is significantly better than the two randomized baselines but is below the majority baseline, even though the most frequent relation, CAUSAL (gold verb done), accounts for only 12.6% of the data.

		<div align="center">