I am running the GRiT dense captioning model, and its output is drawn directly onto the images. I want the caption text itself. How can I extract it?
Link to the project:
https://github.com/JialianW/GRiT
Here’s the demo file:
https://github.com/JialianW/GRiT/blob/master/demo.py
I created a .txt file and tried to write the predictions into it, but it didn’t work.
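To make the goal concrete, here is a minimal sketch of the kind of output I am after: one line per detected region containing the box, the score, and the caption. The helper name `write_captions` and the file name `captions.txt` are my own placeholders. The commented wiring at the bottom is an untested assumption: I am guessing that the `predictions` returned in demo.py hold a detectron2-style `instances` object and that the captions sit in a field called `pred_object_descriptions`, which I have not been able to confirm.

```python
from pathlib import Path


def write_captions(out_path, boxes, scores, captions):
    """Write one 'x1,y1,x2,y2<TAB>score<TAB>caption' line per region.

    boxes:    list of [x1, y1, x2, y2] numbers
    scores:   list of floats
    captions: list of strings
    Returns the list of lines written (without trailing newlines).
    """
    lines = []
    for box, score, caption in zip(boxes, scores, captions):
        coords = ",".join(f"{v:.1f}" for v in box)
        lines.append(f"{coords}\t{score:.3f}\t{caption}")
    # One region per line; an empty input produces an empty file.
    Path(out_path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return lines


# Assumed wiring inside demo.py (untested; the field names are guesses):
# instances = predictions["instances"].to("cpu")
# write_captions(
#     "captions.txt",
#     instances.pred_boxes.tensor.tolist(),
#     instances.scores.tolist(),
#     instances.pred_object_descriptions.data,
# )
```

If the captions are stored under a different field name, only the commented wiring would need to change; the helper itself works on plain Python lists.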
I also couldn’t find this question asked anywhere online.