Creating Text-To-Speech with Python and gTTS

Creating text-to-speech files has been a dream for many of us since we were kids. Now, with Python, that dream can come true in a few lines of code. Let’s see how!

In this lesson you’ll learn how to:

  • Create an mp3 from a string of text
  • Ask the user for a text and create an mp3
  • Ask the user for a text file, extract the text and create an mp3
  • Play an mp3 with Python

Table of contents
Preparation
The code
Conclusion

Preparation

This lesson is better watched than read, since in the video you can listen to the files I create.

But if you can’t (or don’t want to) watch a video, you can follow along here without a problem.

First, create a virtual environment. I used pipenv, but you can use the virtual environment tool of your choice.

Then, we only need to install gTTS with ‘pip install gtts’.

Once you have created your virtual environment and installed gTTS, you are set. We are only going to use gTTS and the Python standard library.

Ah, and don’t forget to create a text file with the .txt extension. We will need it later, when we read the text from a file.
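
If you would rather create that text file from Python than by hand, here is a minimal sketch. The function name, the file name sample.txt and the sample sentence are placeholders I made up for illustration, so change them as you like.

```python
from pathlib import Path


def create_sample_text_file(path="sample.txt",
                            text="This text comes from a file."):
    """Write `text` to `path` and return the resulting Path object."""
    target = Path(path)
    # Writing with an explicit encoding keeps accented characters intact.
    target.write_text(text, encoding="utf-8")
    return target
```

Calling create_sample_text_file() once leaves a sample.txt file in the current folder.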


The code

To be honest, the code is pretty straightforward, as the gTTS library does all the heavy lifting, so I’m going to give you blocks of code and a brief explanation of each.

First, create a Python file, import the two libraries we need and set our options:

import os

from gtts import gTTS


# Options
text_to_read = "This is just a test using GTTS, a Python package library"
language = 'en'
slow_audio_speed = False
filename = 'my_file.mp3'

Reading from a string

Now we create the first of our functions. It reads the text from text_to_read in the chosen language and at normal speed, since slow_audio_speed is False.

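def reading_from_string():
    """Reading from a string"""
    audio_created = gTTS(text=text_to_read, lang=language,
                         slow=slow_audio_speed)
    audio_created.save(filename)

    # 'start' is a Windows shell command that opens the file with the
    # default player; macOS uses 'open' and most Linux desktops 'xdg-open'.
    os.system(f'start {filename}')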

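We create a gTTS object with the options we set at the start and save it to filename (that is, my_file.mp3). At this point the mp3 exists, but we also want to play it. So we use the os library to run the shell command ‘start’, which opens the file called filename from the current folder with the default player. Note that ‘start’ is a Windows command; on macOS the equivalent is ‘open’, and on most Linux desktops it is ‘xdg-open’.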
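
If you want the playback step to work on more than one operating system, here is a minimal sketch of one way to do it. The helper names build_play_command and play are my own, not part of gTTS, and the Linux branch assumes a desktop with xdg-open installed.

```python
import subprocess
import sys


def build_play_command(filename, platform=sys.platform):
    """Return the command that opens `filename` with the default app."""
    if platform.startswith("win"):
        # 'start' is built into cmd.exe, so it has to be run through it.
        # The empty string is the window title that 'start' expects first.
        return ["cmd", "/c", "start", "", filename]
    if platform == "darwin":
        return ["open", filename]
    # Assumption: a Linux desktop where xdg-open is available.
    return ["xdg-open", filename]


def play(filename):
    """Open `filename` with whatever player the system associates with it."""
    subprocess.run(build_play_command(filename), check=False)
```

With this in place, the last line of the function could call play(filename) instead of os.system.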

Reading from the user’s input

def reading_from_user():
    """Reading from user input"""
    user_input = input("What text should I read for you?\n")

    audio_created = gTTS(text=user_input, lang=language, slow=slow_audio_speed)
    audio_created.save(filename)

    # 'start' is Windows-only; use 'open' on macOS or 'xdg-open' on Linux.
    os.system(f'start {filename}')

Pretty much the same as before, with only one difference: now we ask the user to type some text, and we turn that text into an audio file.
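
One thing to watch for is a user who just presses Enter, because an empty string gives gTTS nothing to speak. Here is a small sketch that keeps asking until it gets some text. The function ask_for_text and its read parameter are hypothetical names of mine; read defaults to the built-in input and only exists so the function can be tried out without a keyboard.

```python
def ask_for_text(prompt, read=input):
    """Keep asking until the user types something other than whitespace."""
    while True:
        # Strip surrounding spaces so "   " counts as empty too.
        text = read(prompt).strip()
        if text:
            return text
        print("Please type at least one character.")
```

In reading_from_user, the input(...) call could then be replaced by ask_for_text("What text should I read for you?\n").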

Reading from a file

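def reading_from_file():
    """Reading from a file"""
    file_to_read = input("Please, insert the name of a file to read:\n") + '.txt'
    # The with block closes the file for us, even if reading fails.
    with open(file_to_read, 'r') as f:
        file_text = f.read()

    audio_created = gTTS(text=file_text, lang=language, slow=slow_audio_speed)
    audio_created.save(filename)

    # 'start' is Windows-only; use 'open' on macOS or 'xdg-open' on Linux.
    os.system(f'start {filename}')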

This is the most complex function, yet it is still pretty easy to understand. We ask the user for the name of a file (without the extension), add the .txt extension, open the file and read its text, and, as always, create the mp3.
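
The function above always appends .txt, so typing "notes.txt" would look for "notes.txt.txt", and a wrong name stops the script with an error. Here is a sketch of a slightly more forgiving variation. The helper names are mine, and reading the file as UTF-8 is an assumption about how your text file was saved.

```python
from pathlib import Path


def resolve_text_file(name):
    """Return a Path for `name`, adding .txt only when it is missing."""
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("The file name is empty.")
    path = Path(cleaned)
    if path.suffix.lower() != ".txt":
        path = path.with_name(path.name + ".txt")
    return path


def read_text_file(name):
    """Return the file's text, or None when the file does not exist."""
    try:
        # Assumption: the file was saved as UTF-8.
        return resolve_text_file(name).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
```

When read_text_file returns None, the script can print a friendly message and ask again instead of crashing.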

Running the script

We only need to call the function we want to use at the end of the code:

if __name__ == '__main__':
    reading_from_string()

You can easily switch the function called.

Or leave out the call and start the Python interpreter in interactive mode with “python -i NAME_FILE.py”, then call whichever functions you like from the prompt.
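
Another option is to pick the function from the command line. This is only a sketch: pick_function and the mode names used below ("string", "user", "file") are names I invented for the example, not something the script already has.

```python
def pick_function(args, available, default):
    """Return the function named by the first argument, else the default."""
    if args and args[0] in available:
        return available[args[0]]
    # No argument, or an unknown one: fall back to the default mode.
    return available[default]
```

In the script, the final block would import sys, build a dictionary such as {"string": reading_from_string, "user": reading_from_user, "file": reading_from_file}, and then run pick_function(sys.argv[1:], that_dictionary, "string")(). After that, “python NAME_FILE.py user” runs reading_from_user.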


Conclusion

I found the gTTS library this weekend, played around with it and had a lot of fun, so I wanted to share it with you. The library goes deeper, so here are the docs if you want to learn more: gTTS docs.


My YouTube tutorial videos

Final code on GitHub

Reach out to me on Twitter

Read more tutorials