Implement split command
Create a new split.py module to hold the code related to data splitting (mostly KaldiPartitionSplitter, renamed to Partitioner).
We will update the command atr-data-generator split so that it
- creates a Partitioner instance filled with CLI args
- calls the create_partitions method on it
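A minimal sketch of that flow, assuming the Partitioner constructor accepts the parsed values as keyword arguments. Only the names Partitioner and create_partitions come from this issue; the constructor signature, the stub body and the run helper are hypothetical.

```python
import argparse
from pathlib import Path
from typing import Optional


class Partitioner:
    """Stub standing in for the renamed KaldiPartitionSplitter (hypothetical signature)."""

    def __init__(
        self,
        output: Path,
        train_size: Optional[float] = None,
        val_size: Optional[float] = None,
        test_size: Optional[float] = None,
        existing: bool = False,
    ) -> None:
        self.output = output
        self.train_size = train_size
        self.val_size = val_size
        self.test_size = test_size
        self.existing = existing

    def create_partitions(self) -> None:
        # The real splitting logic would live here; left empty in this sketch.
        pass


def run(args: argparse.Namespace) -> Partitioner:
    """Hypothetical entry point for the split command."""
    # Assumes the namespace holds exactly the constructor's keyword arguments;
    # a real CLI may carry extra keys that would need filtering first.
    partitioner = Partitioner(**vars(args))
    partitioner.create_partitions()
    return partitioner
```

Returning the instance from run is only a convenience for testing; the actual command may not need it.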
Needed CLI args are:
- --train-size, float, defaults to None
- --val-size, float, defaults to None
- --test-size, float, defaults to None
- output, pathlib.Path, path where the data will be stored (no more .../Lines like this btw, look for files in args.output directly)
- --existing, boolean flag, behaves like the current split.use_existing_split argument (maybe useless? discuss with the users first)
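The arguments above could be declared with argparse roughly as follows. This is a sketch, not the project's actual parser: it is written as a standalone parser rather than a subcommand of atr-data-generator, and the build_parser name and help strings are assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical standalone parser for the split command."""
    parser = argparse.ArgumentParser(prog="atr-data-generator split")
    # The three sizes are optional floats that default to None.
    parser.add_argument("--train-size", type=float, default=None, help="Train set size.")
    parser.add_argument("--val-size", type=float, default=None, help="Validation set size.")
    parser.add_argument("--test-size", type=float, default=None, help="Test set size.")
    # Positional path where the data will be stored.
    parser.add_argument("output", type=Path, help="Path where the data will be stored.")
    # Boolean flag, False unless given on the command line.
    parser.add_argument("--existing", action="store_true", help="Use the existing split.")
    return parser
```

argparse converts the dashes in option names to underscores, so the parsed values are available as args.train_size, args.val_size, args.test_size, args.output and args.existing.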
Notes:
- either --existing or at least two of --x-size must be specified
- if --existing is used, the --x-size are ignored.
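These two rules could be enforced with a small check run after parsing. The sketch below assumes a hypothetical check_split_args helper and a ValueError on failure; how the real command reports the error is left open.

```python
from typing import Optional


def check_split_args(
    existing: bool,
    train_size: Optional[float],
    val_size: Optional[float],
    test_size: Optional[float],
) -> None:
    """Raise ValueError unless --existing or at least two sizes are given."""
    if existing:
        # With --existing, the sizes are ignored, so nothing else is checked.
        return
    given = [size for size in (train_size, val_size, test_size) if size is not None]
    if len(given) < 2:
        raise ValueError(
            "Specify --existing or at least two of "
            "--train-size, --val-size, --test-size"
        )
```

Comparing against None rather than relying on truthiness keeps an explicit size of 0.0 counted as specified.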
Edited by Yoann Schneider