Created by: ruanslv
Fixes https://github.com/facebookresearch/metaseq/issues/529, where the problem is very well documented.
1/ best_of should control beam size, while n controls the number of generations to be returned.
2/ nbest doesn't do anything; the number of generations returned is controlled in hub_utils.
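
The intended split between the two parameters can be sketched as follows. This is a minimal illustration, not the metaseq code: select_generations is a hypothetical helper, the (text, score) pair layout is assumed, and the n <= best_of check is an assumption that follows from not being able to return more hypotheses than the beam holds.

```python
from typing import List, Tuple


def select_generations(
    scored_hypotheses: List[Tuple[str, float]],
    n: int,
    best_of: int,
) -> List[str]:
    """Hypothetical helper: return the n best texts out of a beam of size best_of."""
    if n > best_of:
        # Assumption: we cannot return more generations than the beam contains.
        raise ValueError("n must not exceed best_of")
    # best_of plays the role of the beam size: keep at most best_of hypotheses.
    beam = scored_hypotheses[:best_of]
    # Rank the beam by score (higher is better) and return only the top n texts.
    ranked = sorted(beam, key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:n]]
```

For example, with best_of=3 and n=2 the helper searches over three hypotheses and returns the two highest-scoring ones.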