I went through hell trying to get a TTS working on my GPU, an MI50. After a week, this is the first one that has GPU acceleration, sounds good, and respects the accent I wanted. It's great. I want to use it to turn a lot of books into audio so I can listen to them while I'm doing other things, and also for things like having the author's own voice read their book.
I played with it for a day and noticed that 32 steps gives a lot of quality but runs at an RTF of about 1.32 on my GPU. So I changed the steps in src/omnivoice.cpp to 16 and got faster than real time (RTF 0.66). The quality drops, but sometimes I just need a quick and dirty TTS (for news) and I can live without professional quality. So my suggestion is to add a --steps switch, so people can choose their own quality/speed trade-off.
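To make the suggestion concrete, here is a rough sketch of how the switch could be parsed. This is only an illustration: `parse_steps` is a hypothetical helper I made up, the upper bound of 1024 is an arbitrary sanity limit, and I have not checked how the project actually handles its command-line arguments.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of the project): scan argv for "--steps N"
// and return N, or default_steps if the flag is missing or the value is bad.
int parse_steps(int argc, const char* const* argv, int default_steps) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "--steps") == 0) {
            char* end = nullptr;
            long value = std::strtol(argv[i + 1], &end, 10);
            // Accept only a fully parsed positive integer within an
            // arbitrary sanity limit (assumption: 1024 is more than enough).
            if (end != argv[i + 1] && *end == '\0' && value > 0 && value <= 1024) {
                return static_cast<int>(value);
            }
            return default_steps;  // invalid value: fall back to the default
        }
    }
    return default_steps;  // flag not given
}
```

With something like this, `parse_steps(argc, argv, 32)` would keep today's 32 steps unless the user passes, say, `--steps 16`.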
Again, this project is awesome. The Vulkan part gives a lot of life to old hardware.