Tortoise is a text-to-speech program built with the following priorities:

- Highly realistic prosody and intonation.

This repo contains all the code needed to run Tortoise TTS in inference mode.

New features:

- Added ability to produce totally random voices.
- Added ability to download voice conditioning latent via a script, and then use a user-provided conditioning latent.
- Added ability to use your own pretrained models.

I'm naming my speech-related repos after Mojave desert flora and fauna. Tortoise is a bit tongue in cheek: this model is slow. It leverages both an autoregressive decoder and a diffusion decoder, both known for their low sampling rates. On a K80, expect to generate a medium-sized sentence every 2 minutes.

See this page for a large list of example outputs.

Colab is the easiest way to try this out. I've put together a notebook you can use here:

If you want to use this on your own computer, you must have an NVIDIA GPU.

do_tts.py: This script allows you to speak a single phrase with one or more voices.

    python tortoise/do_tts.py --text "I'm going to speak this" --voice random --preset fast

read.py: This script provides tools for reading large amounts of text.

    python tortoise/read.py --textfile <your text file> --voice random

This will break up the text file into sentences, and then convert them to speech one at a time. Once all the clips are generated, it will combine them into a single file. You can re-generate any bad clips by re-running read.py with the --regenerate argument.

Tortoise can be used programmatically, like so:

    reference_clips = ...
    pcm_audio = tts.tts_with_preset("your text here", reference_clips, preset='fast')

Here reference_clips is a list of loaded reference recordings (see below), and tts is a Tortoise text-to-speech object.

Tortoise was specifically trained to be a multi-speaker model. It accomplishes this by consulting reference clips. These reference clips are recordings of a speaker that you provide to guide speech generation. They are used to determine many properties of the output, such as the pitch and tone of the voice, speaking speed, and even speaking defects like a lisp or stuttering. The reference clip is also used to determine non-voice related aspects of the audio output like volume, background noise, recording quality and reverb.

I've included a feature which randomly generates a voice. These voices don't actually exist and will be random every time you run it. The results are quite fascinating and I recommend you play around with it! You can use the random voice by passing in 'random' as the voice name.
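The read.py workflow described above (split the text into sentences, synthesize each one, join the clips, and redo any bad ones) can be sketched roughly as follows. This is not Tortoise's own code. The names split_into_sentences, read_text, regenerate_clips and fake_synthesize are hypothetical. The regex split is a naive stand-in for whatever splitting read.py really does, and audio is modelled as a plain list of floats so the sketch runs without a GPU.

```python
import re
from typing import Callable, List, Sequence, Tuple

Clip = List[float]  # stand-in for PCM audio samples (assumption)


def split_into_sentences(text: str) -> List[str]:
    """Naive split after '.', '!' or '?' followed by whitespace.

    An assumption for illustration, not Tortoise's actual splitting logic.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def read_text(
    text: str, synthesize: Callable[[str], Sequence[float]]
) -> Tuple[List[Clip], Clip]:
    """Synthesize one clip per sentence, then concatenate the clips in order."""
    clips = [list(synthesize(s)) for s in split_into_sentences(text)]
    combined = [sample for clip in clips for sample in clip]
    return clips, combined


def regenerate_clips(
    sentences: Sequence[str],
    clips: Sequence[Clip],
    bad_indices: Sequence[int],
    synthesize: Callable[[str], Sequence[float]],
) -> Tuple[List[Clip], Clip]:
    """Re-synthesize only the clips at bad_indices and rebuild the combined audio."""
    fixed = [list(c) for c in clips]  # copy so the original clips stay untouched
    for i in bad_indices:
        fixed[i] = list(synthesize(sentences[i]))
    combined = [sample for clip in fixed for sample in clip]
    return fixed, combined


def fake_synthesize(sentence: str) -> Clip:
    """Placeholder for a real TTS call: one silent sample per character."""
    return [0.0] * len(sentence)
```

In a real run, the synthesize argument would wrap an actual speech call, such as the tts_with_preset call shown in the programmatic example, instead of fake_synthesize.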