Created by: suchenzang
Splitting https://github.com/facebookresearch/metaseq/pull/231 up into 2 PRs.
Removed:
- unused `moe_disable_padding` arg
- unused `from_pretrained` methods, since we currently depend on `load_model_ensemble_and_task` from `checkpoint_utils` (not great but saving that for another PR)
- unused `hub_models` method