Created by: suchenzang
Continuing to break down https://github.com/facebookresearch/metaseq/pull/197
- Removed `scale_fc`, `scale_attn`, and `scale_heads` flags that were brought in as part of NormFormer - might have to bring `scale_fc` back later, but hoping to clean up configuration / flags before then.
- Removed `sync_ln_variance`, which was brought in to speed up NormFormer when we went tensor parallel.