Created by: Xirider
Removed the hacks that rewrite a "fixed" lr_schedule to "inverse_sqrt". These seem to exist only to work around the default setting in the scheduler setup. With that default changed to "inverse_sqrt", they are no longer necessary. I also checked the internal codebase, which is covered by its own PR.
Tested with opt_baseline.py and sweep_baseline.py using default settings: the results are unchanged, and the "polynomial_decay" setting in sweep_baseline is correctly propagated and overrides the default.
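For reviewers, here is a rough sketch of the behaviour described above. It is not the actual metaseq code: `SchedulerConfig` and `resolve_scheduler` are hypothetical names used only to illustrate how changing the default makes the rewrite hack redundant while an explicit setting still wins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchedulerConfig:
    # Hypothetical config object (not metaseq's real class).
    # The default is assumed to be "inverse_sqrt" after this change.
    lr_scheduler: str = "inverse_sqrt"


def resolve_scheduler(override: Optional[str] = None) -> str:
    """Return the scheduler name; an explicit setting overrides the default."""
    if override is None:
        cfg = SchedulerConfig()
    else:
        cfg = SchedulerConfig(lr_scheduler=override)
    # No special-casing of "fixed" is needed here any more, because the
    # default itself is already the desired scheduler.
    return cfg.lr_scheduler


if __name__ == "__main__":
    print(resolve_scheduler())                    # default path
    print(resolve_scheduler("polynomial_decay"))  # explicit sweep-style override
```

The two calls mirror the two test cases: the first takes the default path, the second passes an explicit scheduler as a sweep would.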
Issue: https://github.com/facebookresearch/metaseq/issues/434