How To Convert PDFs Into AudioBooks

Written by jitendraballa2015 | Published 2021/05/10

TLDR: With Python, an IDE such as PyCharm, and the pyttsx3 and PyPDF2 libraries, 12 lines of code are enough to make your computer read a PDF aloud. PyCharm is easy to use and open-source. I’ll explain every line of code so that even readers with no knowledge of Python can follow along.

Isn’t it interesting? After reading this article you won’t need to read long PDFs of 10 or more pages yourself; 12 lines of Python code will read them to you. I’ll explain every line of code so that even someone with no knowledge of Python can understand it. So let’s start!!
Here are the steps to follow to make an AudioBook:
First of all, you must have Python and an IDE installed on your PC. If you don’t have them installed, you can download them from the Python and PyCharm websites. There are many IDEs, but I’m pointing you to PyCharm because it is easy to use and open-source. If you don’t know what an IDE is, don’t worry: it’s just a code editor for Python.
After downloading both Python and PyCharm, you have to set up the environment. There are many videos on YouTube that walk through this, so take help from them.
Open the IDE, create a new file, and give it whatever name you want.
Click on the Terminal option, then install the Python text-to-speech library (pyttsx3) like this:
# make sure you are connected to the Internet
pip install pyttsx3
Also install the PyPDF2 library in the same way:
pip install PyPDF2
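Before moving on, it can help to confirm that both installs succeeded. The snippet below is a small sketch that uses only the standard library; the helper name missing_packages is made up for this illustration and is not part of either library.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]


# pyttsx3 and PyPDF2 are the two libraries installed in the steps above.
missing = missing_packages(["pyttsx3", "PyPDF2"])
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("Both libraries are installed.")
```

If a name is reported as missing, run the matching pip install command again in the same Terminal.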
Here is the full code:
    import pyttsx3
    import PyPDF2
    pdf_book = open('book.pdf', 'rb')
    pdf_book_reader = PyPDF2.PdfFileReader(pdf_book)
    pages_no = pdf_book_reader.numPages
    print(pages_no)
    speaker = pyttsx3.init()
    for number in range(8, pages_no):
        page_start = pdf_book_reader.getPage(number)
        text = page_start.extractText()
        speaker.say(text)
        speaker.runAndWait()

Explanation of code:

  1. In lines 1 & 2, import the required libraries.
  2. In line 3, create a variable called pdf_book, which opens the PDF file we want to hear; 'rb' means the file is read in binary mode.
  3. In line 4, pdf_book_reader = PyPDF2.PdfFileReader(pdf_book) reads the opened file using the PdfFileReader class of the PyPDF2 library.
  4. In line 5, pages_no = pdf_book_reader.numPages gets the total number of pages in the file from the numPages property.
  5. In line 6, print(pages_no) displays the total number of pages in the file.
  6. In line 7, speaker = pyttsx3.init() initializes the pyttsx3 engine that will speak the text.
  7. In line 8, for number in range(8, pages_no): starts a for loop that goes through the PDF from the starting page to the last page using the range function. Page numbering starts at 0, so 8 is the ninth page; use 0 to start from the first page.
  8. In line 9, page_start = pdf_book_reader.getPage(number) fetches the page the loop has reached.
  9. In line 10, text = page_start.extractText() extracts the text from that page using the extractText() function.
  10. In line 11, speaker.say(text) queues the text of the page to be spoken using the say function.
  11. In line 12, speaker.runAndWait() speaks the queued text and waits until it has finished before the loop moves on to the next page.

Note: this code targets PyPDF2 releases before 3.0; later releases renamed these calls.
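For PyPDF2 3.0 and later, where the reader class and methods used above go by different names, the same program might look like the sketch below. It assumes PyPDF2 3.x and relies on PdfReader, reader.pages and extract_text(). The function names pages_to_read and read_aloud are illustrative and not part of either library.

```python
def pages_to_read(total_pages, start=8):
    """Zero-based indexes of the pages to read: `start` up to the last page."""
    return list(range(start, total_pages))


def read_aloud(pdf_path, start=8):
    """Speak every page of the PDF at `pdf_path`, beginning at index `start`."""
    # Imported here so pages_to_read() can be used without the libraries.
    import pyttsx3
    from PyPDF2 import PdfReader  # assumes PyPDF2 3.x

    speaker = pyttsx3.init()
    with open(pdf_path, "rb") as pdf_book:
        reader = PdfReader(pdf_book)
        for number in pages_to_read(len(reader.pages), start):
            text = reader.pages[number].extract_text()
            if text:  # pages with no extractable text are skipped
                speaker.say(text)
                speaker.runAndWait()  # finish this page before the next one


# Example call, assuming book.pdf sits in the folder you run the script from:
# read_aloud("book.pdf", start=0)
```

Opening the file in a with block closes it automatically once the last page has been read.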
So now you can lie on your bed and listen to whatever PDF you want, anytime, anywhere, just by running this program.
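To keep the narration as a file you can play later, pyttsx3 also provides save_to_file. The sketch below assumes the text has already been extracted from the PDF. The helper names tidy and save_audiobook are invented here, and the audio format that is actually written depends on the speech driver of your operating system, so the .mp3 file name is an assumption.

```python
def tidy(text):
    """Collapse line breaks and repeated spaces so sentences are read smoothly."""
    return " ".join(text.split())


def save_audiobook(text, out_path="book.mp3", words_per_minute=150):
    """Write the spoken version of `text` to `out_path` instead of playing it."""
    import pyttsx3  # imported here so tidy() works without the library

    speaker = pyttsx3.init()
    speaker.setProperty("rate", words_per_minute)  # speaking speed
    speaker.save_to_file(tidy(text), out_path)
    speaker.runAndWait()  # the file is written while the queue is processed


# Example call with a hypothetical string of extracted text:
# save_audiobook("Chapter one.\nIt was a quiet   morning.")
```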
Note: I suggest putting the PDF file in the same folder as your code; it will save you trouble with file paths.
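One way to follow this advice in code is to build the path from the location of the script itself, so the program finds the PDF no matter which folder it is started from. This is a minimal sketch; pdf_next_to is a name invented for this example.

```python
from pathlib import Path


def pdf_next_to(script_file, pdf_name="book.pdf"):
    """Return the path of `pdf_name` in the same folder as `script_file`."""
    return Path(script_file).resolve().parent / pdf_name


# Inside your program, __file__ holds the location of the running script:
# pdf_book = open(pdf_next_to(__file__), "rb")
```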

Thank you so much for reading! Follow the writer of the article for more stuff on Python and Data Science.


Written by jitendraballa2015 | Learning Data Science
Published by HackerNoon on 2021/05/10