Created by: suchenzang
Cleanup part... 10? Changes here include:
- removed `set_beam_size` and `reorder_incremental_state_scripting` methods in `incremental_decoder.py`
- moved `TransformerEncoder` and `Embedding` (moved to modules/) to separate files, renamed `transformer.py` -> `transformer_decoder.py`
- deleted unused `model_utils.py` file
- moved ffn and transformer encoder layer to separate files
Used a 125m baseline as a test run to confirm parity.