SonOfLilit 3 days ago
The description is amazing, but the demo video feels underwhelming. Available music generation models sound much more musical and have much better diction on vocals.
codedokode 2 days ago
This might be due to the quality of the dataset, because Nvidia seems not to be using copyrighted commercial recordings (if I read their paper correctly). It is difficult to compete with those who have used a larger, higher-quality dataset without permission.