## data pre-processing
Data pre-processing was done as explained in the fairseq speech-to-text module. Note that if you create your own datasets and want to use an NMT expert, you need to process the target transcripts and translations in the speech translation/recognition dataset the same way you processed the data for the NMT task.
Scripts to extract source transcripts and target translations from the CSV data files created by the speech-to-text pre-processing are included.
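The extraction these scripts perform can be sketched as follows. This is a minimal illustration, not the included script itself: the manifest path and the column names (`src_text`, `tgt_text`) are assumptions and may differ in your setup.

```python
import csv
from pathlib import Path

def extract_text(manifest_path, out_dir, src_col="src_text", tgt_col="tgt_text"):
    """Pull source transcripts and target translations out of a
    tab-separated speech-to-text manifest into plain-text files,
    one sentence per line, in the same order as the manifest."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    src_lines, tgt_lines = [], []
    with open(manifest_path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE: transcripts may contain quote characters
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            src_lines.append(row[src_col])
            tgt_lines.append(row[tgt_col])
    (out_dir / "train.src").write_text("\n".join(src_lines) + "\n", encoding="utf-8")
    (out_dir / "train.tgt").write_text("\n".join(tgt_lines) + "\n", encoding="utf-8")
    return len(src_lines)
```

The resulting `train.src`/`train.tgt` files can then be tokenized and BPE-encoded exactly like the NMT expert's training data.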
For COVOST2 and MUST-C:
1. adapt the file paths in [get_src_to_st.py](fairseq/get_src_to_st.py) to fit your setup and run `python get_src_to_st.py`.
2. adapt the file paths in [get_source_text.py](fairseq/examples/speech_to_text/get_source_text.py) to your setup and run `python get_source_text.py`.
3. the extracted data files are saved in `${dataset_name}/${split_name}`.
4. process the extracted text data the same way you did for your NMT expert, e.g. by adapting [prepare-rest.sh](fairseq/examples/speech_to_text/prepare-rest.sh).
5. run `python get_source_text.py` again.
6. adapt the configuration files to point to your NMT expert's vocabulary and BPE.
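For step 6, the part of the fairseq speech-to-text data config that typically needs changing looks roughly like the excerpt below. The filenames here are placeholders; point them at your NMT expert's vocabulary and SentencePiece/BPE model so both models share one target segmentation.

```yaml
# config excerpt -- filenames are placeholders for your own files
vocab_filename: spm_nmt_expert.txt          # vocabulary of the NMT expert
bpe_tokenizer:
  bpe: sentencepiece
  sentencepiece_model: spm_nmt_expert.model # subword model of the NMT expert
```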
## model training
Model training is done as specified in the fairseq framework.
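As a starting point, a training run can be launched along the lines of the public fairseq speech-to-text example. This is a hedged sketch: `${DATA_ROOT}`, `${SAVE_DIR}`, the subset names, and all hyper-parameters are placeholders to adjust for your setup, not the values used for the reported models.

```sh
# Sketch of a speech-to-text training invocation (placeholder values)
fairseq-train ${DATA_ROOT} \
  --config-yaml config_st.yaml \
  --train-subset train_st --valid-subset dev_st \
  --save-dir ${SAVE_DIR} \
  --task speech_to_text --arch s2t_transformer_s \
  --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
  --optimizer adam --lr 2e-3 --lr-scheduler inverse_sqrt \
  --warmup-updates 10000 --clip-norm 10.0 \
  --max-tokens 40000 --max-update 100000 \
  --num-workers 4 --seed 1
```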