Created by: stephenroller
Patch Description There are some adversarial setups where one user submits a long prompt with a short generation (2040 prompt, 8 gen) and another user submits a very short prompt with a long generation (8 prompt, 2040 gen). These two might get batched together because each one individually fits within 2048. However, once padded, the batch becomes an input of 4080 (2040 padded prompt + 2040 generation). This can cause one of the two outputs to be returned incorrectly.
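The arithmetic behind the failure case can be sketched as follows. This is a minimal illustration, not the code in this patch: `MAX_SEQ_LEN`, `padded_length`, and `fits` are hypothetical names, and the sketch assumes a batch is padded to its longest prompt and then generates for its longest requested generation.

```python
from typing import List, Tuple

# Limit taken from the description above; the constant name is hypothetical.
MAX_SEQ_LEN = 2048

# Each request is (prompt_len, gen_len).
Request = Tuple[int, int]


def padded_length(batch: List[Request]) -> int:
    """Effective sequence length of a batch after padding.

    Assumes every prompt is padded to the longest prompt in the batch and
    the batch keeps generating until the longest generation is finished.
    """
    max_prompt = max(prompt_len for prompt_len, _ in batch)
    max_gen = max(gen_len for _, gen_len in batch)
    return max_prompt + max_gen


def fits(batch: List[Request]) -> bool:
    """True if the padded batch stays within the sequence limit."""
    return padded_length(batch) <= MAX_SEQ_LEN


# The adversarial pair from the description.
batch = [(2040, 8), (8, 2040)]

# Each request passes a per-request check on its own...
individually_ok = all(p + g <= MAX_SEQ_LEN for p, g in batch)

# ...but the padded batch does not.
print(individually_ok, padded_length(batch), fits(batch))
```

Checking the limit on the padded batch rather than on each request separately is one way to keep such a pair from being batched together.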
Testing steps Manual testing