I’ve been experimenting with the Kaggle API in Google Colab for a while now, and I’m stuck on the following problem. I can authenticate my credentials without trouble, and I have no problem downloading whole datasets or specific files using:
!kaggle datasets download -d <user>/<dataset>
!kaggle datasets download <user>/<dataset> -f <specific_file>
However, I’m not able to get the list of all the files in a dataset (which I would like to save in a variable).
Whenever I use:
api.dataset_list_files('<user>/<dataset>').files
I get a list of blank entries, one for each file in the dataset. I couldn’t find any mention of this behavior online, so I suspect it may be a recent bug. In addition, I can use:
!kaggle datasets files <user>/<dataset>
to correctly list the first 20 files, but that isn’t very helpful, as I don’t know how to see the rest or how to save the output in a variable.
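For reference, here is a minimal sketch of two ways the file names might be captured in a variable. It rests on assumptions that are not confirmed above: that the objects in `.files` expose a `name` attribute (the blank entries may just be how the objects print), and that the installed CLI accepts a `--csv` flag for `datasets files` and prints a header row with a `name` column. The helper names are hypothetical, and this does not address paging beyond the first 20 files.

```python
import csv
import subprocess


def names_from_file_objects(files):
    """Return a name for each object in an API result's `.files` list."""
    # Assumption: each object has a `name` attribute; fall back to str() if not.
    return [getattr(f, "name", None) or str(f) for f in files]


def names_from_cli_csv(csv_text):
    """Extract the `name` column from CSV text printed by the CLI."""
    lines = csv_text.splitlines()
    # Skip any preamble until the header row that contains a `name` column.
    for i, line in enumerate(lines):
        if "name" in next(csv.reader([line]), []):
            break
    else:
        return []  # no header row found
    reader = csv.DictReader(lines[i:])
    return [row["name"] for row in reader if row.get("name")]


def list_files_via_cli(dataset):
    """Run the Kaggle CLI and return the file names it prints."""
    # Assumption: the installed CLI supports `--csv` for `datasets files`.
    result = subprocess.run(
        ["kaggle", "datasets", "files", dataset, "--csv"],
        capture_output=True, text=True, check=True,
    )
    return names_from_cli_csv(result.stdout)
```

With these in place, something like `names = names_from_file_objects(api.dataset_list_files('<user>/<dataset>').files)` or `names = list_files_via_cli('<user>/<dataset>')` would store the names in a variable, provided the assumptions hold for the installed version.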
I suppose I could come up with a more complex solution that employs Selenium or something similar, but I think that would be a bit of overkill. That’s why I’m asking here, hoping for help from more seasoned Kaggle API users or from someone who has already faced and solved this problem. Could you help me, please?