I’m pretty new to AI models, but I have a specific use case in mind and would like some hints on how to approach it. I’d like to remove my hands from videos in which I’m operating a mascot (moving it by hand), so that the mascot appears to move on its own. I found ProPainter, which looks pretty amazing, but the results in my use case are not impressive. I’d like to know whether training the model further might improve the results. I posted some examples of my current results here: https://github.com/sczhou/ProPainter/issues/80.
My questions:
- The library’s author used YouTube-VOS videos to train the model. Would the results improve if I added more videos of me operating a mascot to the training dataset (say, 100-200 additional videos with that specific mascot)?
- How long does it take to train such a model? I want to do this on an Azure N-series VM and would like to estimate what the cost might be.
- Does the resolution of the training videos affect the results, for example 480p vs. HD/Full HD? I assume higher resolution would significantly increase training time.
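
For a rough sense of scale behind the resolution question, here is a minimal sketch that compares pixels per frame. The frame sizes are assumptions on my part (854x480 is just one common 16:9 size for 480p), and the ratio is only a proxy: I don’t know how ProPainter’s training time or memory use actually scales with resolution.

```python
# Assumed frame sizes (width, height) for the resolutions in question.
# 854x480 is one common 16:9 size for "480p"; other 480p sizes exist.
RESOLUTIONS = {
    "480p": (854, 480),
    "720p (HD)": (1280, 720),
    "1080p (Full HD)": (1920, 1080),
}


def pixels_per_frame(width, height):
    """Number of pixels in a single frame."""
    return width * height


def relative_pixel_load(resolutions, baseline="480p"):
    """Pixels per frame for each resolution, relative to the baseline."""
    base = pixels_per_frame(*resolutions[baseline])
    return {
        name: pixels_per_frame(*size) / base
        for name, size in resolutions.items()
    }


if __name__ == "__main__":
    for name, ratio in relative_pixel_load(RESOLUTIONS).items():
        print(f"{name}: {ratio:.2f}x the pixels of 480p")
```

This only counts raw pixels, so it says nothing about the actual cost on a given VM.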
I would really appreciate some tips 🙂