Created by: sriniiyer
Added a new argument (--max-gen-tokens) to restrict generation to a fixed number of tokens.
Tested with:
CUDA_VISIBLE_DEVICES=0,1 FSD=/fsx-mudslide/rpasunuru/data/instruct-opt/prompt_data/allbenchmarks_io_streaming_after_dedup_v5_sorted \
python metaseq_internal/scripts/eval/schedule_jobs_few_shot_instruct_opt.py \
-t opti_valid_data_to_text__unified_skg__totto__prompt0 \
-m 1.3B_gptz \
-o /fsx-mudslide/sviyer/tmp_100/ \
--slurm-partition instruct-opt \
--batch-size 8 --override-completed --n-eval-samples 100 --max-gen-tokens 100 --local
- Confirmed that with this argument it generates 100 tokens for each example; without it, it generates > 1800 tokens per example.
- Computed ROUGE metrics: 100 tokens = 0.0472, 2048 tokens = 0.0047
Also putting out a PR for MSI
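
For reviewers, here is a minimal sketch of how a cap like this might be applied. It is not the code in this PR. Only the --max-gen-tokens flag name comes from the test command above; the helper names (build_parser, resolve_generation_budget), the max_positions parameter, and the assumption that uncapped generation fills the remainder of a 2048-token window are all hypothetical.

```python
import argparse

# Assumption: the model's context window is 2048 tokens, matching the
# "2048 tokens" setting in the ROUGE comparison above.
ASSUMED_MAX_POSITIONS = 2048


def build_parser():
    """Build a parser exposing only the new flag (illustrative)."""
    parser = argparse.ArgumentParser()
    # Flag name taken from the test command; default None means "no cap".
    parser.add_argument(
        "--max-gen-tokens",
        type=int,
        default=None,
        help="Maximum number of tokens to generate per example",
    )
    return parser


def resolve_generation_budget(max_gen_tokens, prompt_len,
                              max_positions=ASSUMED_MAX_POSITIONS):
    """Return how many tokens may be generated for one example.

    Hypothetical helper: without a cap, generation may use whatever is
    left of the context window; with a cap, it is limited to the smaller
    of the cap and the remaining window.
    """
    remaining = max(max_positions - prompt_len, 0)
    if max_gen_tokens is None:
        return remaining
    return min(max_gen_tokens, remaining)


if __name__ == "__main__":
    args = build_parser().parse_args(["--max-gen-tokens", "100"])
    # Example prompt length of 200 tokens (made-up value).
    print(resolve_generation_budget(args.max_gen_tokens, prompt_len=200))
```

Under these assumptions, a short prompt with no cap leaves most of the window available for generation, which would be consistent with the > 1800 tokens per example observed without the flag.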