Tag Archive for python, nlp, spacy, spacy-3

Memory usage when using spaCy Doc extensions

Issue: Before preprocessing my data with spaCy, I typically have it stored in a pandas Series. Since I'd like to preserve the index of each document before serializing my Docs, I decided to store it in a custom extension attribute. However, I noticed a dramatic increase in memory usage, which keeps growing until my system runs out of memory. […]
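For readers unfamiliar with the setup described in the excerpt, here is a minimal sketch of one way to attach a pandas index to each Doc through an extension attribute and carry it through serialization. It is not the post author's code: the attribute name `series_index`, the sample texts, and the use of a blank English pipeline are assumptions made for illustration, and the sketch does not attempt to reproduce or diagnose the memory growth.

```python
import pandas as pd
import spacy
from spacy.tokens import Doc, DocBin

# Hypothetical attribute name, chosen for this sketch only.
if not Doc.has_extension("series_index"):
    Doc.set_extension("series_index", default=None)

# A blank pipeline (tokenizer only) so no trained model needs to be downloaded.
nlp = spacy.blank("en")

# Assumed sample data: a Series whose index we want to keep.
texts = pd.Series(["First document.", "Second document."], index=[10, 42])

# With as_tuples=True, nlp.pipe takes (text, context) pairs and
# yields (doc, context) pairs, so the index travels with each text.
pairs = ((text, idx) for idx, text in texts.items())
docs = []
for doc, idx in nlp.pipe(pairs, as_tuples=True):
    doc._.series_index = int(idx)  # plain int keeps the value serializable
    docs.append(doc)

# Extension values live in doc.user_data, so the DocBin must be told
# to store user data for them to survive serialization.
doc_bin = DocBin(store_user_data=True)
for doc in docs:
    doc_bin.add(doc)
data = doc_bin.to_bytes()

restored = list(DocBin(store_user_data=True).from_bytes(data).get_docs(nlp.vocab))
indices = [d._.series_index for d in restored]
```

Because extension values are kept in each Doc's `user_data` dictionary, they are held per document in addition to the Doc itself, which is worth keeping in mind when processing large collections.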