Created by: KUNAL1612
Patch Description: Added the ability to generate memory profiling information and save it in the checkpoint directory. Chose the PyTorch profiler over the alternatives because it provides better stack traces.
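For reviewers unfamiliar with the PyTorch profiler, here is a minimal sketch of how memory profiling with stack traces can be captured and written to a directory. It is not the code in this patch: the helper `profile_memory_to_dir`, the file name `memory_trace.json`, the toy model, and the temporary directory standing in for the checkpoint directory are all assumptions made for illustration.

```python
import os
import tempfile

import torch
from torch.profiler import ProfilerActivity, profile


def profile_memory_to_dir(step_fn, out_dir, num_steps=3):
    """Run step_fn under the PyTorch profiler and save a trace in out_dir.

    Hypothetical helper; the name and the trace file name are illustrative.
    """
    os.makedirs(out_dir, exist_ok=True)
    trace_path = os.path.join(out_dir, "memory_trace.json")
    with profile(
        activities=[ProfilerActivity.CPU],  # add ProfilerActivity.CUDA on GPU
        profile_memory=True,  # record tensor allocations and frees
        with_stack=True,      # attach Python stack traces to recorded ops
        record_shapes=True,   # record input shapes of recorded ops
    ) as prof:
        for _ in range(num_steps):
            step_fn()
    # Write a Chrome-trace JSON file that trace viewers can open.
    prof.export_chrome_trace(trace_path)
    return prof, trace_path


# Toy stand-in for a training step (assumption, not the real training loop).
model = torch.nn.Linear(64, 64)
opt = torch.optim.SGD(model.parameters(), lr=0.01)


def train_step():
    x = torch.randn(32, 64)
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()


# A temporary directory stands in for the checkpoint directory here.
checkpoint_dir = tempfile.mkdtemp()
prof, trace_path = profile_memory_to_dir(train_step, checkpoint_dir)

# Summarize the ops that allocated the most CPU memory themselves.
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=5))
```

The exported JSON file can be opened in a Chrome-trace viewer such as `chrome://tracing` or Perfetto to inspect allocations alongside the recorded operators.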
Testing steps: Ran training for the 8m and 125m models to generate traces, then inspected the resulting traces.