Build a PyTorch dataset from a single large JSON file
I have a large JSON file (>10 GB) that I want to use for training across two GPU nodes. The dataset is used to build a data loader:
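(The original snippet isn't shown here. Below is a minimal sketch of one common approach, assuming the file is in JSON Lines format, i.e. one JSON object per line, so it can be streamed lazily instead of loaded into memory at once. The file path and class name are illustrative, not from the question.)

```python
import json
import torch
from torch.utils.data import IterableDataset, DataLoader, get_worker_info

class JsonLinesDataset(IterableDataset):
    """Streams records from a large .jsonl file without loading it fully."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Shard lines round-robin across DataLoader workers so each
        # worker yields a disjoint subset of the file.
        worker = get_worker_info()
        num_workers = worker.num_workers if worker else 1
        worker_id = worker.id if worker else 0
        with open(self.path) as f:
            for i, line in enumerate(f):
                if i % num_workers == worker_id:
                    yield json.loads(line)

dataset = JsonLinesDataset("train.jsonl")  # hypothetical path
loader = DataLoader(dataset, batch_size=32, num_workers=2)
```

For multi-node training you would additionally shard by process rank inside `__iter__` (e.g. using `torch.distributed.get_rank()` and `get_world_size()`), since `DistributedSampler` does not apply to `IterableDataset`.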
How to use random_split with a percentage split ("Sum of input lengths does not equal the length of the input dataset")
I tried to use torch.utils.data.random_split as follows:
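(The exact snippet isn't shown. A sketch of the failure mode and two fixes follows; the 80/20 split and dataset size are assumptions for illustration. The error typically comes from rounding: `int(n * 0.8) + int(n * 0.2)` can be less than `n`.)

```python
import torch
from torch.utils.data import TensorDataset, random_split

dataset = TensorDataset(torch.arange(101).unsqueeze(1))  # 101 samples

# Fails: int(101 * 0.8) + int(101 * 0.2) == 80 + 20 == 100 != 101
# lengths = [int(len(dataset) * 0.8), int(len(dataset) * 0.2)]
# random_split(dataset, lengths)  # -> ValueError

# Fix 1: give the remainder to one split so the lengths sum exactly.
train_len = int(len(dataset) * 0.8)
val_len = len(dataset) - train_len
train_set, val_set = random_split(dataset, [train_len, val_len])

# Fix 2: PyTorch >= 1.13 accepts fractions directly and handles
# the rounding internally.
train_set, val_set = random_split(dataset, [0.8, 0.2])
```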