Today we're going over real-time AI voice changing using the RVC models from the previous videos. Let's go ahead and see it in action real quick. [demo of the changed voice]

And so if you haven't seen the previous videos that show you how you can train a voice in RVC, go watch those first. Those are going to be down below in the description. But with that out of the way, let's go ahead and jump into the installation.

So we're going to be at this GitHub repository, and it's going to be w-okada/voice-changer. We're going to scroll down till we hit the English area so that you guys can read it, and when we're in here, you can read through this if you'd like. We're going to go ahead and do the Windows installation, as that is what I have. I believe this is for AMD right here, and then this is for Mac. If you take a look at the starred note here, it says the developer does not have an AMD graphics card, so I'm assuming this is for AMD.

But I downloaded this one here, and it's going to bring you to a Google Drive folder that you can download. So proceed at your own risk, as you know this isn't like Hugging Face or SourceForge or anything like that. If you want to proceed, you can just click "Download anyway" and save it anywhere you can find it again.

Once it's done downloading, we're going to go ahead and extract it. So right click, go to "Extract All," and then just click extract, and it's going to extract it out of that folder. So wait for it to finish. It's a pretty large file, so it might take a little bit of time. Once it's done, you're going to end up with this folder here. Double click into it, and then what I usually like to do is move it out of this folder so that it's a little bit easier to access. Here's the old folder, I just delete this one and then go into the new one. So here we are. You're going to open it, and you're going to see a bunch of folders.
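If you'd rather script the extract step than right-click, a minimal sketch with Python's standard `zipfile` module could look like this. The archive name and destination folder are made-up placeholders, not the actual file names from the download.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract every file in a .zip archive into dest_dir and return the top-level names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    # Hypothetical paths: substitute whatever the downloaded file is actually called.
    archive = Path("voice-changer-download.zip")
    if archive.exists():
        print(extract_archive(archive, "voice-changer"))
    else:
        print(f"{archive} not found; point this at the file you downloaded")
```

Extracting straight into a short, easy-to-find folder also covers the "move it out of the nested folder" step described above.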
In order to run this, we simply have to go to this start_http.bat and run it. This is where it might cause some issues: if you have antivirus or Windows Defender, something might pop up here. Let's just go ahead and run it. So here we go, actually we do have it, so it says "Windows protected your PC." Once again, proceed at your own risk. I go ahead and click "Run anyway," and then it's going to go ahead and run. You're going to see the command line pop up, and then you're going to see the voice changer is booting up. While it's booting up, it's going to be downloading the other Hugging Face files that it needs for the models to work. So just wait for this to finish downloading, and we'll be back when it's done.

When it's done, a Windows Defender Firewall prompt is going to pop up saying that this executable wants to run, so we're going to go ahead and allow it access. If no window pops up after that, what you have to do is just rerun it. So we're going to exit out, go back to start_http.bat, go ahead and run that, and then we'll see this window pop up here. So just click start when yours pops up.
It'll probably look something like this. We're going to use RVC, but as you can see there are other options here. Click RVC and then start, and then here is a list of voices that we can use. Let's just go ahead and head on over to this Tsukuyomi-chan voice. Here are some settings we can adjust, but let's leave those as they are for now so that you can hear how it sounds.

For this F0 setting, we're going to go ahead and do crepe, and S threshold all the way to 100. We're going to do 384 for chunk and 4096 for extra, and then make sure you select your GPU. If you have an Nvidia GPU, it should be able to see it, and you can also use CPU if you have a strong enough CPU. For your input, choose your input microphone. I have this as my input, and for my output I have my Bluetooth headphones. Then in advanced settings, it's 4096 for crossfade, 300 for truncate, and leave all these other settings the same. The truncate I adjusted to 300, which is the max, because I was hearing some weird chopping in the audio, and that seemed to help.

With that out of the way, let's go ahead and start it and see what happens. I'm going to click start, and it's going to take a little bit of time to warm up, and then you can start hearing it. So there it is. Let's go ahead and turn up the output. And, you know, it sounds like a weird alien, so let's bring the tune up to 12, and now it sounds a little bit closer to female. For this voice specifically, something like 23 makes it work best. Well, I guess 20 would work a little bit better.

So that is how you can use the pre-provided models. One thing I noticed: if you train the RVC model in a language, it's better at speaking that language, so these four models actually speak Japanese better than they speak English.
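To put rough numbers on the 12, 20, and 23 values tried above: assuming that setting is a pitch shift in semitones (an assumption about this UI, not something stated in the video), every 12 steps doubles the pitch, so 12 is one octave up. A small sketch:

```python
def semitones_to_ratio(semitones):
    """Frequency multiplier for a pitch shift of the given number of semitones (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    # Values mentioned above; treating them as semitones is an assumption.
    for tune in (0, 12, 20, 23):
        print(f"tune {tune:>2}: pitch x{semitones_to_ratio(tune):.2f}")
```

How far you need to shift depends on the gap between your own voice and the voice the model was trained on, which is why the best value differs from model to model.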
Alrighty, so let's head on over to how you add your own models. To do that, we're going to go over to this edit button up here, click edit, and then click this select button. Navigate to where you have RVC installed, and go into the weights folder. I'm going to go ahead and click the Marin .pth file. Then for this option right here, it's currently broken. I believe what you would do is go to select, go to logs, go inside, and then choose this total_fea.npy file. However, it seems to not need that, so with version 2 of RVC it looks like you can just upload this .pth file and it'll be done. So now click this upload button, and you'll see it turn blue, and now we can use the model. If you want to add your own image, you can as well.

So close out of that, and then we're going to go ahead and click this no image area right here, and then adjust this tune. Once again, let's bring this back up to around 12, and all these other settings will have stayed the same. One important thing as well, just to make things quicker: you can click the save settings option right here, which will save these settings in case you go to a different voice. So if I go to the Tsukuyomi-chan voice and then go back to the Marin voice, it's going to be at 12.

So let's go ahead and click start, and we'll see how this voice is. This is the voice that we are changing to, and it's pretty cool, it's pretty good. The speed depends on your graphics card. What I found is that with these other chunk sizes you can actually go a little bit faster, but the quality starts dropping. So here's 320, and then 256, 192, 128, 112, 96, 80, and if I start speaking you can hear it sounds different, a little bit muffled, and not as good. I could probably get away with 192 and have the audio still sound decent.
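For a feel of what those chunk sizes mean in time, here's a rough sketch. It assumes each chunk unit is 128 samples and the audio runs at 48 kHz. Neither number is stated in the video; they're assumptions that happen to put a chunk of 384 at about one second.

```python
SAMPLES_PER_CHUNK_UNIT = 128  # assumption, not stated in the video
SAMPLE_RATE_HZ = 48_000       # assumption, not stated in the video


def chunk_to_seconds(chunk, samples_per_unit=SAMPLES_PER_CHUNK_UNIT,
                     sample_rate=SAMPLE_RATE_HZ):
    """Approximate length of one processing buffer, in seconds."""
    return chunk * samples_per_unit / sample_rate


if __name__ == "__main__":
    # Chunk sizes stepped through above.
    for chunk in (384, 320, 256, 192, 128, 112, 96, 80):
        ms = chunk_to_seconds(chunk) * 1000
        print(f"chunk {chunk:>3}: about {ms:.0f} ms per buffer")
```

Under those assumptions, a smaller chunk means less delay but also less audio for the model to work with on each pass, which would fit with the muffled sound at the low end.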
However, you know, I like to keep it at 384. A one-second delay for me isn't too bad. So we're going to go ahead and stop that.

One issue that I've run into, which I don't think is really an issue, maybe just a limitation of the models, is that if you hold a constant syllable, it kind of fluctuates back and forth. Let me give you an example. I'm going to say something like "eeee." [holds the sound] But I only really notice that if I'm just saying the same syllable. I don't really notice it when I'm doing normal speech. So if I turn it back on and I'm just talking, I don't actually hear it that much. You might hear a little bit of choppiness here and there, but it's not that bad. It's not that noticeable.

So that's how you get it set up. I'm going to have some other videos out there that just show me using it, if you're interested in that. But that's going to be it for today's video. Hope you found something interesting, and if you have any questions, let me know down below in the comments and I'll try my best to help you out. See you later.