Replace the numpy.meshgrid() with more efficient torch.meshgrid() (!111) · Merge requests · Avinash Paliwal / Super-SloMo

Open Cui Jinku requested to merge github/fork/CuiJinku/master into master Mar 09, 2023

What does this PR do?

This PR improves the performance of the backwarp class.

The backwarp class creates backwarping objects. Its constructor calls numpy.meshgrid() and then torch.tensor() to build a grid made up of two tensors.
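The two ways of building the grid can be sketched as follows. This is a minimal illustration, not the repository's actual code: the helper names grid_numpy and grid_torch and the 4x3 size are hypothetical, and the indexing="ij" argument assumes PyTorch 1.10 or newer.

```python
import numpy as np
import torch


def grid_numpy(W, H, device):
    # NumPy route: build the grid on the CPU, then copy each array to the device.
    gx, gy = np.meshgrid(np.arange(W, dtype=np.int64), np.arange(H, dtype=np.int64))
    return (torch.tensor(gx, requires_grad=False, device=device),
            torch.tensor(gy, requires_grad=False, device=device))


def grid_torch(W, H, device):
    # torch route: build the grid directly on the target device, no host copy.
    # With indexing="ij" the outputs have shape (H, W), matching np.meshgrid's
    # default "xy" layout once the argument order is swapped.
    gy, gx = torch.meshgrid(torch.arange(H, device=device),
                            torch.arange(W, device=device),
                            indexing="ij")
    return gx, gy


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
nx, ny = grid_numpy(4, 3, device)
tx, ty = grid_torch(4, 3, device)
print(torch.equal(nx, tx), torch.equal(ny, ty))
```

Both helpers should return identical (H, W) tensors, so the swap only changes where and how the grid is built, not its contents.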

According to my profiling script, the equivalent API in the torch module performs far better: torch.meshgrid() achieves a 25X speedup on a single NVIDIA 3090 GPU.

Analysis

I compared the traces of the two meshgrid() implementations, one from the torch module and one from the numpy module. The likely reasons for the performance difference are:

np.meshgrid() generates NumPy arrays on the CPU, which are then copied to the GPU. The copy incurs an extra call to aten::to, which takes 0.115 ms.
torch.meshgrid() itself takes 0.037 ms, whereas numpy.meshgrid() takes 0.099 ms, roughly a 3X difference.
The total time to generate the grid is 0.113 ms with torch.meshgrid() and 0.370 ms with np.meshgrid(), roughly a 3X speedup overall.
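A trace comparison of this kind could be collected with torch.profiler roughly as follows. This is only a sketch under assumptions, not the profiling script used for the numbers above: the frame size, the helper names build_numpy, build_torch and trace are hypothetical, and timings will differ by machine.

```python
import numpy as np
import torch
from torch.profiler import profile, ProfilerActivity

W, H = 640, 360  # hypothetical frame size, chosen only for illustration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def build_numpy():
    # CPU-side grid followed by a copy to the device.
    gx, gy = np.meshgrid(np.arange(W), np.arange(H))
    return torch.tensor(gx, device=device), torch.tensor(gy, device=device)


def build_torch():
    # Grid created directly on the device (indexing= needs PyTorch >= 1.10).
    return torch.meshgrid(torch.arange(H, device=device),
                          torch.arange(W, device=device),
                          indexing="ij")


def trace(fn):
    activities = [ProfilerActivity.CPU]
    if device.type == "cuda":
        activities.append(ProfilerActivity.CUDA)
    fn()  # warm-up call so one-time initialisation is not recorded
    with profile(activities=activities) as prof:
        fn()
        if device.type == "cuda":
            torch.cuda.synchronize()  # wait for queued GPU work before stopping
    # Per-operator summary (e.g. aten:: ops), sorted by total CPU time.
    return prof.key_averages().table(sort_by="cpu_time_total", row_limit=10)


numpy_table = trace(build_numpy)
torch_table = trace(build_torch)
print(numpy_table)
print(torch_table)
```

Comparing the two printed tables shows which operators each route invokes and how much time each one accounts for.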