Description:
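I'm seeing different results from load_dataset (from pinecone_datasets) depending on how I monkey-patch os.path.join around the call: when I restore the original os.path.join afterward, the dataset appears empty, and when I leave the patch in place, it loads. Below is a simplified version of my code and the observed console output: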
Code:
import copy
import os

from pinecone_datasets import load_dataset

datasetName2 = "langchain-python-docs-text-embedding-ada-002"


def get_dataset2():
    tmp = copy.deepcopy(os.path.join)
    os.path.join = lambda *s: "/".join(s)
    dataset = load_dataset(datasetName2)
    os.path.join = tmp
    return dataset


def get_dataset4():
    os.path.join = lambda *s: "/".join(s)
    dataset = load_dataset(datasetName2)
    return dataset


def main():
    dataset = get_dataset2()
    print("Dataset loaded (get_dataset2):", len(dataset) != 0)
    dataset = get_dataset4()
    print("Dataset loaded (get_dataset4):", len(dataset) != 0)


if __name__ == "__main__":
    main()
Console Output:
Dataset loaded (get_dataset2): False
Dataset loaded (get_dataset4): True
Request for Help:
Why does modifying os.path.join affect dataset loading differently in these two functions, and how should I handle such a modification correctly? Should I restore os.path.join in a different way, or use an alternative approach altogether?
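For reference, one alternative to the manual save-and-restore in get_dataset2 is to scope the replacement with unittest.mock.patch.object from the standard library, which restores the original attribute automatically when the block exits. The following is only a minimal sketch of that mechanism: posix_join is a hypothetical helper name, load_dataset is deliberately not called, and whether this changes the behavior shown above is untested here.

```python
import os
from unittest import mock


def posix_join(*parts):
    # Hypothetical helper (not part of any library): joins path parts
    # with forward slashes, like the lambda in the code above.
    return "/".join(parts)


original_join = os.path.join

# patch.object swaps os.path.join on entry and restores the original
# on exit, even if the body of the with block raises.
with mock.patch.object(os.path, "join", posix_join):
    inside = os.path.join("a", "b")
    patched_during = os.path.join is posix_join

restored_after = os.path.join is original_join
```

Any code that looks up os.path.join after the with block exits sees the original function again, so any work that has to run under the replacement needs to happen inside the block.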
Any help or guidance would be greatly appreciated. Thank you!