Merge DatasetManager / GenericDataset / OCRDatasetManager / OCRDataset classes
Currently, to deal with data loading, we have 4 classes:
- OCRDataset https://gitlab.com/teklia/atr/dan/-/blob/06f97eccc8cf7714ee19a6e1a5d418ac2ba192ed/dan/manager/ocr.py#L48
- OCRDatasetManager https://gitlab.com/teklia/atr/dan/-/blob/06f97eccc8cf7714ee19a6e1a5d418ac2ba192ed/dan/manager/ocr.py#L11
- DatasetManager https://gitlab.com/teklia/atr/dan/-/blob/06f97eccc8cf7714ee19a6e1a5d418ac2ba192ed/dan/manager/dataset.py#L20
- GenericDataset https://gitlab.com/teklia/atr/dan/-/blob/06f97eccc8cf7714ee19a6e1a5d418ac2ba192ed/dan/manager/dataset.py#L204
This makes the code very difficult to read and follow, and it can be greatly simplified. We can keep only one class, named OCRDataset, that extends torch.utils.data.Dataset.
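As a rough illustration of what the merged class could look like, here is a minimal sketch. It is not the current DAN code: the constructor arguments (`samples`, `charset`, `load_image`, `transform`) and the keys of the returned dict are assumptions made for the example, and the real class would need to carry over whatever the four existing classes actually do.

```python
try:
    from torch.utils.data import Dataset
except ImportError:
    # Fallback only so this sketch can be run without PyTorch installed.
    Dataset = object


class OCRDataset(Dataset):
    """Hypothetical single dataset class replacing the four existing ones."""

    def __init__(self, samples, charset, load_image, transform=None):
        # samples: iterable of (image_path, transcription) pairs (assumed layout)
        self.samples = list(samples)
        # Map each character of the charset to an integer token
        self.char_to_idx = {char: idx for idx, char in enumerate(charset)}
        # load_image: callable turning a path into an image (injected, hypothetical)
        self.load_image = load_image
        # transform: optional callable applied to the loaded image
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, text = self.samples[index]
        image = self.load_image(path)
        if self.transform is not None:
            image = self.transform(image)
        return {
            "path": path,
            "image": image,
            "text": text,
            "tokens": [self.char_to_idx[char] for char in text],
        }
```

With a single class like this, a standard torch.utils.data.DataLoader can wrap the dataset directly, and batching logic can live in a `collate_fn` instead of a separate manager class.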