Commit eef2239f authored by luetzel's avatar luetzel

updated HOWTO-ZENITH.md

parent f3a20007
# How to run the code (Zenith)

## 1. Preparing environment for Zenith

1. Our ensembling shell script requires a python3.6 virtual environment named zenithenv, placed in `/swp-metaphors/zenith/`.
	- If you want to run an ensemble model, execute these commands in `/swp-metaphors/zenith/`:
@@ -14,13 +14,35 @@
      `pip install -r requirements.txt`.
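The environment setup above can be sketched as follows (a sketch only; commands elided from the diff may differ):

```shell
# Run inside /swp-metaphors/zenith/. The project expects Python 3.6,
# so substitute python3.6 for python3 if both are installed.
python3 -m venv zenithenv

# Activate the environment and install the project dependencies.
source zenithenv/bin/activate
pip install -r requirements.txt
```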


## 2. Preparing data

First, copy all files stored in the `zenith/metaphor-detection/data/vua` directory of the external data download into the same directory in this repository.
Then run `python data_preparation.py vua` to prepare the data. It creates the following files in `/swp-metaphors/zenith/metaphor-detection/data/vua/`:

* `VUA_corpus_train.csv`, `VUA_corpus_val.csv`, `VUA_corpus_test.csv`: The different dataset splits
* `all_pos_test_tokens.pkl`, `verb_test_tokens.pkl`: Contain the offsets of the test tokens within their sentences
* `elmo_train.pkl`, `elmo_val.pkl`, `elmo_test.pkl`: Computed ELMo embeddings for sentences in the different dataset splits
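A quick sanity check that all of the files listed above were created can be sketched as follows (`check_vua_files` is a hypothetical helper, not part of the repository):

```python
from pathlib import Path

# Files that data_preparation.py is expected to create (per the list above).
EXPECTED_FILES = [
    "VUA_corpus_train.csv", "VUA_corpus_val.csv", "VUA_corpus_test.csv",
    "all_pos_test_tokens.pkl", "verb_test_tokens.pkl",
    "elmo_train.pkl", "elmo_val.pkl", "elmo_test.pkl",
]

def check_vua_files(data_dir):
    """Return the expected output files that are missing from data_dir."""
    data_dir = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (data_dir / name).exists()]
```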


## 3. Training Zenith

1. Navigate to the desired zenith folder in `zenith/metaphor-detection/code`. 

    `code/` contains subfolders for the different architectures derived from the original baseline by [Kumar and Sharma](https://github.com/Kumar-Tarun/metaphor-detection). There are five model subfolders, named `baseline`, `cn-features`, `concat`, `nb-only` and `glove-only` (MODELNAME below), and each consists of four files:

    * `main_vua.py` trains one seed for the corresponding model
    * `model.py` contains the model classes and defines the architecture for the corresponding model
    * `programm.sh` trains seven seeds with varying parameters for the corresponding model
    * `util.py` contains the helper functions used by the corresponding model
2. Train the model:
    - For ensembling: execute the shell script with the following command:  
      `./programm.sh MODELNAME`
    - For a single-seed model: run the `main_vua.py` script.
      Example: 
      ```
      python main_vua.py --epochs 7 --dropout1 0 --dropout2 0.1 --dropout3 0.5 --losslit 1.2 --lossmet 1.8
      ```
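The flags in the example above suggest that `main_vua.py` reads its hyperparameters from the command line; a minimal sketch of such a parser (the real script's defaults and full flag set are assumptions here):

```python
import argparse

def build_parser():
    # Hyperparameters matching the example invocation above; defaults are illustrative.
    parser = argparse.ArgumentParser(description="Train one seed of a Zenith model")
    parser.add_argument("--epochs", type=int, default=7)
    parser.add_argument("--dropout1", type=float, default=0.0)
    parser.add_argument("--dropout2", type=float, default=0.1)
    parser.add_argument("--dropout3", type=float, default=0.5)
    parser.add_argument("--losslit", type=float, default=1.2,
                        help="loss weight for literal tokens")
    parser.add_argument("--lossmet", type=float, default=1.8,
                        help="loss weight for metaphorical tokens")
    return parser
```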


    Model(s) | Files needed |
    --- | --- |
@@ -32,11 +54,13 @@

3. After the training and testing is complete, the model is saved in `zenith/metaphor-detection/models/MODELNAME/` and VUA predictions in the shared task format can be found in `zenith/metaphor-detection/predictions/MODELNAME/`.

## 4. Evaluating Zenith

1. When using ensembling, run the `majority_vote.py` script in `/zenith/metaphor-detection/predictions/MODELNAME/`
2. Run [automatic_evaluation.py](analysis/scripts/automatic_evaluation.py) with the following arguments:
    - `--pred_label_file`: the file created in the previous step.
    - `--gold_label_file`: the VUA test gold labels ([all_pos_tokens.csv](data/vua/test_gold_labels/all_pos_tokens.csv) for evaluation
      on VUA All-POS, [verb_tokens.csv](data/vua/test_gold_labels/verb_tokens.csv) for evaluation on VUA Verbs).
3. Stats and scores will be printed to the console.
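The majority vote over the per-seed prediction files in step 1 can be sketched as follows (`majority_vote` here is a hypothetical re-implementation; the actual `majority_vote.py` may differ):

```python
from collections import Counter

def majority_vote(seed_predictions):
    """Combine per-seed label sequences (e.g. seven seeds) into one
    prediction by taking the most common label for each token."""
    return [Counter(labels).most_common(1)[0][0]
            for labels in zip(*seed_predictions)]
```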

The ensemble prediction files created in the first step can be found in `/zenith/prediction_ensemble/` for each of the models.
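The scores printed in step 3 can be sketched as standard token-level precision/recall/F1 for the metaphor class (a hypothetical computation; the exact metrics reported by `automatic_evaluation.py` may differ):

```python
def precision_recall_f1(pred, gold, positive=1):
    # Token-level scores for the positive (metaphor) class.
    tp = sum(1 for p, g in zip(pred, gold) if p == positive and g == positive)
    fp = sum(1 for p, g in zip(pred, gold) if p == positive and g != positive)
    fn = sum(1 for p, g in zip(pred, gold) if p != positive and g == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```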