Created by: sniklaus
Thank you for sharing your implementation and contributing to the area of video frame interpolation!
I am the first author of SepConv and am afraid that the quantitative results in the table do not accurately reflect the performance of SepConv. Specifically, they state the results for the version of SepConv that is trained to produce perceptually pleasing results, which performs subpar on quantitative benchmarks. As such, I have extended the table with the results for the more appropriate version.
On a side note, I am not a fan of using motion masks for the quantitative benchmark since they ignore possible artifacts in the regions outside the motion masks. Furthermore, the samples for the comparison are only crops from UCF-101 with a resolution of just 256x256 pixels, so a method that performs perfectly on this benchmark may still perform poorly at more realistic resolutions. Lastly, for some of the examples, the ground truth seems to be either the first or the second frame (for instance 1, 141, or 271).
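To illustrate the concern about motion masks, here is a minimal sketch (plain NumPy, not code from this repository) of how a masked PSNR can report a perfect score even though the interpolated frame contains an obvious artifact outside the mask; the frames and mask below are made up purely for demonstration.

```python
import numpy as np


def psnr(estimate, reference, mask=None):
    # peak signal-to-noise ratio between two uint8 frames; if a boolean mask
    # is given, the error is only accumulated inside the masked region
    estimate = estimate.astype(np.float64)
    reference = reference.astype(np.float64)

    if mask is not None:
        squared_error = ((estimate - reference) ** 2)[mask]
    else:
        squared_error = (estimate - reference) ** 2

    mse = squared_error.mean()
    if mse == 0.0:
        return float('inf')
    return 10.0 * np.log10((255.0 ** 2) / mse)


# hypothetical example: the interpolation is exact inside the moving region
# but has an artifact in the static background
reference = np.full((256, 256, 3), 128, dtype=np.uint8)
estimate = reference.copy()
estimate[:64, :64, :] = 0  # artifact outside the motion mask

motion_mask = np.zeros((256, 256, 3), dtype=bool)
motion_mask[96:160, 96:160, :] = True  # assume motion only in the center crop

print('masked PSNR:', psnr(estimate, reference, motion_mask))  # inf, artifact ignored
print('full-frame PSNR:', psnr(estimate, reference))  # finite, artifact penalized
```

The masked score is perfect because the artifact lies entirely outside the mask, whereas the full-frame score penalizes it, which is exactly why evaluating only inside motion masks can be misleading.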
Anyways, huge thanks again for contributing to the area of video frame interpolation!