In a machine learning data pre-processing pipeline, the pipeline steps are normally serialized (e.g. pickled) or saved as layers in a model so they can be loaded again later for serving or prediction, thereby preserving each step's fit/transform parameters derived from the original training data.
Why are there no high-level Python wrappers that instead return the parameters or attribute values of the processing steps as plain data, so that they can be saved, for example, in a database without having to save the entire object?
The processing steps for serving/prediction could then be created fresh at predict time and configured with the saved parameters and attribute values loaded from the database, as in the sketch below.
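To make the idea concrete, here is a hand-rolled sketch of what I mean, using scikit-learn's StandardScaler and its fitted attributes (mean_, scale_, var_, n_samples_seen_); the JSON string just stands in for a database record:

```python
import json

import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Training time: fit the step and export its learned statistics as plain data.
scaler = StandardScaler().fit(X_train)
saved = json.dumps({
    "mean_": scaler.mean_.tolist(),
    "scale_": scaler.scale_.tolist(),
    "var_": scaler.var_.tolist(),
    "n_samples_seen_": int(scaler.n_samples_seen_),
})  # this JSON string is what would be stored in the database

# Predict time: build a fresh scaler and restore the statistics,
# without ever unpickling the original object.
restored = json.loads(saved)
new_scaler = StandardScaler()
new_scaler.mean_ = np.asarray(restored["mean_"])
new_scaler.scale_ = np.asarray(restored["scale_"])
new_scaler.var_ = np.asarray(restored["var_"])
new_scaler.n_samples_seen_ = restored["n_samples_seen_"]

print(new_scaler.transform(np.array([[2.0, 3.0]])))
# -> same output as scaler.transform([[2.0, 3.0]])
```

Doing this by hand works for one simple step, but I would expect a library to provide it generically across whole pipelines.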
I cannot find any Python libraries, or wrappers around existing libraries (e.g. TensorFlow, scikit-learn, PyTorch), that provide a high-level API for saving and setting pipeline step parameters.
Why do they not exist?
It seems to me that this would be useful for portability, for abstracting away from the implementing library, and for inspecting and debugging processing steps.
I have looked at scikit-learn's get_params and set_params, but they become very complex, especially where nested transforms are used and need to be tree-walked (see the example below).
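For example, on a small nested pipeline, get_params(deep=True) (as far as I can tell) only walks constructor hyper-parameters, not the values learned from the data:

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("reduce", PCA(n_components=2))])

for name in sorted(pipe.get_params(deep=True)):
    print(name)
# prints keys such as "scale__with_mean" and "reduce__n_components",
# i.e. constructor hyper-parameters under double-underscore paths,
# but nothing like "scale__mean_" or "reduce__components_" -- the
# fitted values you would actually need to rebuild the steps.
```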
Am I missing a fundamental concept or something?