Created by: stephenroller
Patch Description
Right now, the dummy LM validation set is quite large (the same size as the training set). It takes a long time to step through, especially when there are multiple validation shards.
This patch shrinks the valid/test sets to a single batch. Additionally, it makes the dataset aware of --batch-size-valid.
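As a rough illustration of the sizing rule described above (not the actual diff), the logic can be sketched as follows. The helper name `dummy_split_size` and its parameters are hypothetical and exist only for this example; the fallback to the training batch size when --batch-size-valid is unset is also an assumption.

```python
from typing import Optional


def dummy_split_size(
    split: str,
    train_size: int,
    batch_size: int,
    batch_size_valid: Optional[int] = None,
) -> int:
    """Return the number of dummy examples to generate for a split.

    Hypothetical helper: names and defaults are assumptions, not the
    real task code.
    """
    if split == "train":
        # Training keeps its full dummy dataset size.
        return train_size
    # valid/test: exactly one batch. Use the validation batch size when
    # it is given, otherwise fall back to the training batch size.
    return batch_size_valid if batch_size_valid is not None else batch_size


if __name__ == "__main__":
    for split in ("train", "valid", "test"):
        print(split, dummy_split_size(split, 100_000, 32, batch_size_valid=8))
```

With this rule, each validation shard is stepped through in a single iteration regardless of how large the training set is.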
Testing steps
Manual testing.