Support DistributedDataParallel
Closes #116
With this fix, we can:
- Train on multiple GPUs (see the DDP setup sketch below)
- Resume a training run started on 1 GPU on multiple GPUs
- Resume a training run started on multiple GPUs on 1 GPU
- Load any pre-trained model, whether it was trained on 0, 1, or multiple GPUs (see the checkpoint-loading sketch below)
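For multi-GPU training, the model is wrapped in PyTorch's `DistributedDataParallel`. A minimal sketch of the wrapping step, assuming a `torchrun`-style launcher that sets `WORLD_SIZE` and `LOCAL_RANK` (the `setup_model` helper is illustrative, not this repo's actual code):

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def setup_model(model: torch.nn.Module) -> torch.nn.Module:
    # Only wrap when the launcher started more than one process;
    # single-GPU and CPU runs keep the plain model.
    if int(os.environ.get("WORLD_SIZE", "1")) > 1:
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)
        # Each process drives exactly one GPU.
        model = DDP(model.cuda(), device_ids=[local_rank])
    return model
```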
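Cross-configuration checkpoint loading mostly comes down to the `module.` prefix that DDP adds to every parameter name. A minimal sketch of normalizing that prefix on load (`load_flexible_state_dict` is a hypothetical helper, not necessarily how this MR implements it):

```python
from collections import OrderedDict

import torch


def load_flexible_state_dict(model: torch.nn.Module, checkpoint_path: str) -> None:
    """Load a checkpoint saved with or without DistributedDataParallel."""
    # Assumes the checkpoint file holds a raw state dict.
    state_dict = torch.load(checkpoint_path, map_location="cpu")

    # DDP prefixes every key with "module."; strip it so the keys
    # match an unwrapped model.
    cleaned = OrderedDict(
        (key[len("module."):] if key.startswith("module.") else key, value)
        for key, value in state_dict.items()
    )

    # If the current model is DDP-wrapped, load into the inner module,
    # so the same checkpoint works whether it was saved from 0, 1,
    # or multiple GPUs.
    target = model.module if hasattr(model, "module") else model
    target.load_state_dict(cleaned)
```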
It also closes #143 and #118.