Created by: awkrail
Patch Description
I received the email from Meta granting access to the OPT-175B weights and tried to set them up by following projects/OPT/download_opt175b.md. The resharding step failed with the following error:
ERROR: The function received no value for the required argument: input_glob_pattern
Usage: reshard_fsdp.py INPUT_GLOB_PATTERN OUTPUT_SHARD_NAME <flags>
optional flags: --num_output_shards | --unflatten_weights |
--skip_optimizer_state
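The cause is a missing line-continuation backslash in the script: without it, the shell runs python -m metaseq.scripts.reshard_fsdp on its own with no arguments, so the required input_glob_pattern is never passed. I added the backslash as follows: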
for j in {0..7}; do
    # The trailing backslash on the next line is the one I added.
    python -m metaseq.scripts.reshard_fsdp \
    --input-glob-pattern "/groups/gcb50205/actx/opt_175b/checkpoint_last-model_part-$j-shard*.pt" \
    --output-shard-name "/groups/gcb50205/actx/opt_175b_parallel/checkpoints/reshard-model_part-$j.pt" \
    --num-output-shards 1 --skip-optimizer-state True --unflatten-weights True
done
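For reviewers who want to see the failure mode in isolation, here is a minimal sketch of how the shell treats the two forms. The count_args function is a hypothetical stand-in I made up for illustration (it is not part of metaseq), and "shard*.pt" is just a placeholder pattern:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a CLI with required arguments (not part of metaseq).
count_args() {
    echo "received $# argument(s)"
}

# With the trailing backslash, both lines are parsed as ONE command.
with_backslash=$(count_args \
    --input-glob-pattern "shard*.pt")

# Without it, the first line runs alone with zero arguments, and the second
# line is parsed as a separate command named "--input-glob-pattern".
# That lookup fails, so its error is silenced here.
without_backslash=$(
    count_args
    --input-glob-pattern "shard*.pt" 2>/dev/null || true
)

echo "with backslash:    $with_backslash"
echo "without backslash: $without_backslash"
```

The second case matches what happened in the resharding loop: the module was invoked with no arguments at all. Note that the backslash only continues the line when it is the very last character, so nothing (not even a space or a comment) may follow it.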
Testing steps
After adding the backslash, the error went away and I could run the resharding script.