Our organisation is trialling MS Fabric, and I’m trying out Notebooks. I’ve managed to create an environment and brought in the NLTK and spaCy libraries. However, when you use spaCy you typically also load a trained model to work with, such as the small English model: en_core_web_sm
As Fabric is a ‘complete solution’ and I don’t have access to pip install, a terminal, or anything like that (our organisation locks this down), I’ve no clue how to initialise spaCy to start with:
import spacy
# Initialize spaCy
nlp = spacy.load('en_core_web_sm')
I have to think it’s possible, as otherwise you couldn’t use the library at all?
I get the predictable error:
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
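The error message suggests that spaCy itself imports fine and only the model package is missing, since spaCy models are distributed as ordinary Python packages. A minimal sketch of how that could be confirmed from a notebook cell, using only the standard library (the helper name model_is_installed is hypothetical and not part of spaCy):

```python
import importlib.util


def model_is_installed(name: str) -> bool:
    """Return True if `name` can be found as an importable top-level package."""
    # find_spec returns None when no top-level module or package with this name exists
    return importlib.util.find_spec(name) is not None


# Check for the model package without trying to load it
print(model_is_installed("en_core_web_sm"))
```

If this prints False, the environment has the spaCy library but not the model package, which would match the E050 error above.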
Any ideas?