I have multiple .md files that I want to process, and I want to add them all under a single name in the catalog. But .md files aren’t supported by the framework.
Example: I have multiple files in data/01_raw/folder_name/ and I want to be able to read all the .md files in there.
I suggest looking into using a PartitionedDataSet in Kedro to manage multiple Markdown files. It lets you register every file in a directory under a single catalog entry and load them together. You can find more details in the Kedro documentation on PartitionedDataSet.
It worked. Here is the code that I used:
markdown_dataset:
  type: partitions.PartitionedDataset
  path: data/01_raw/md_files/
  dataset:
    type: text.TextDataset
    fs_args:
      open_args_load:
        encoding: "utf-8"
  filename_suffix: ".md"
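
For anyone wondering how to consume this entry in a node: a partitioned dataset is loaded as a dictionary that maps each partition ID (the file path relative to `path`, with the `.md` suffix removed) to a function that loads that file on demand. Below is a minimal sketch of a node function that reads them all. The function name `read_markdown_partitions` and the `fake_partitions` stand-in are hypothetical and only there so the snippet runs without a Kedro project.

```python
from typing import Callable, Dict


def read_markdown_partitions(
    partitions: Dict[str, Callable[[], str]]
) -> Dict[str, str]:
    """Eagerly load every partition and return {partition_id: markdown_text}."""
    # Each value is a zero-argument callable; the file is only read when it is called.
    return {
        partition_id: load()
        for partition_id, load in sorted(partitions.items())
    }


# Hypothetical stand-in for what Kedro would pass in for "markdown_dataset".
fake_partitions = {
    "notes": lambda: "# Notes\n",
    "intro": lambda: "# Intro\n",
}

texts = read_markdown_partitions(fake_partitions)
for name, text in texts.items():
    print(name, len(text))
```

In a real pipeline you would pass "markdown_dataset" as the node input instead of the stand-in dictionary. Loading everything at once is fine for small folders; for large ones it may be better to call the load functions one at a time.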