I have a JSON file that will be used as training data for an NER model.
Each object in it holds a sentence and the relevant entities in that sentence.
I want to write a function that generates a BIO-labeled string for each sentence according to its entities.
For example, take the following object from the JSON file:
{
"request": "I want to fly to New York on the 13.3",
"entities": [
{"start": 16, "end": 23, "text": "New York", "category": "DESTINATION"},
{"start": 32, "end": 35, "text": "13.3", "category": "DATE"}
]
}
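For the sentence
“I want to fly to New York on the 13.3”
the corresponding BIO label string is
“O O O O O B-DESTINATION I-DESTINATION O O B-DATE”
where B-category marks the beginning of an entity of that category,
I-category stands for inside (a later token of the same entity), and O for outside (a token that is not part of any entity).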
I’m looking for Python code that iterates over each object in the JSON file and generates a BIO label string for it.
The JSON format can be changed if necessary.
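
To make the goal concrete, here is a minimal sketch of the kind of function I have in mind. It rests on several assumptions: tokens are split on whitespace, "end" is an inclusive character offset, and the file is a JSON list of objects shaped like the one above. The names bio_labels and label_file are hypothetical. The offsets in the example do not seem to line up exactly with 0-based indexing, so the sketch falls back to searching for the entity's "text" when the stored offsets do not slice out that text.

```python
import json
import re


def bio_labels(request, entities):
    """Return a space-separated BIO label string for one sentence."""
    # Whitespace tokenization (an assumption); keep each token's character span.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\S+", request)]
    labels = ["O"] * len(tokens)

    for ent in entities:
        # Assumption: "end" is inclusive, so add 1 for Python slicing.
        start, end = ent["start"], ent["end"] + 1
        # If the stored offsets do not slice out the entity text,
        # fall back to the first occurrence of the text itself.
        if request[start:end] != ent["text"]:
            start = request.find(ent["text"])
            if start == -1:
                continue  # entity text not found; leave its tokens as "O"
            end = start + len(ent["text"])

        first = True
        for i, (tok_start, tok_end) in enumerate(tokens):
            # A token belongs to the entity if the character spans overlap.
            if tok_start < end and tok_end > start:
                labels[i] = ("B-" if first else "I-") + ent["category"]
                first = False

    return " ".join(labels)


def label_file(path):
    """Hypothetical helper: assumes the file is a JSON list of such objects."""
    with open(path, encoding="utf-8") as f:
        return [bio_labels(obj["request"], obj["entities"]) for obj in json.load(f)]
```

The text-search fallback picks the first occurrence only, so it can mislabel a sentence in which the same entity text appears more than once; correct offsets in the JSON would avoid that. A model with its own tokenizer would also need the labels aligned to that tokenizer's tokens rather than to whitespace-separated words.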