I modified the reward function to use different sacrebleu parameters. Based on a preliminary test, this seems to improve learning.