I have a custom dataset, which I loaded into a dataframe, and want to use AIF360 for fair ML.
I always one-hot-encode the categorical features with the get_dummies() function from pandas.
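To make the setup concrete, here is a minimal sketch of the encoding step. The dataframe and its column names (`age`, `sex`, `approved`) are made-up placeholders, not my real data.

```python
import pandas as pd

# Hypothetical toy data; column names are placeholders for illustration only.
df = pd.DataFrame({
    "age": [25, 47, 33, 52],
    "sex": ["male", "female", "female", "male"],
    "approved": [1, 0, 1, 0],
})

# One-hot-encode only the categorical column; the other columns pass through unchanged.
encoded = pd.get_dummies(df, columns=["sex"])
print(encoded.columns.tolist())
```

After this step the single `sex` column is replaced by one indicator column per category (`sex_female`, `sex_male`).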
Now, do I have to use the BinaryLabelDataset class for data that is already one-hot-encoded, and is the StandardDataset class meant for custom datasets that have not been one-hot-encoded yet?
Basically, I am unsure which class to use in which case: BinaryLabelDataset for a one-hot-encoded dataframe, and StandardDataset for a dataframe with no transformations?
I read through the AIF360 documentation and the API reference guide. However, it is still not clear to me which one I should use.
As far as I can tell, the StandardDataset class is built upon the BinaryLabelDataset class.
In addition, the StandardDataset class can do some data transformations itself, such as one-hot-encoding the columns passed as categorical_features.
Also, there is the sklearn-compatible API which I could use.