exceptedprism3 / pdftoaudio

"PDF To Audio" is a Python tool that transforms PDF documents into audio files using OCR and Text-to-Speech technology. Ideal for accessibility and auditory learning, it supports multiple languages, parallel processing, and smart rate limit handling.

License: MIT License

PDF To Audio

Convert your PDF documents into audio files with PDF To Audio. This Python script combines Optical Character Recognition (OCR) with gTTS (Google Text-to-Speech) to turn written content into spoken words. It is well suited to accessibility, auditory learning, or listening to documents on the go.

🌟 Features

  • PDF Text Extraction: Utilizes pdfplumber for precise text extraction.
  • OCR Capability: Integrates pytesseract for handling image-based PDFs.
  • Text-to-Speech: Uses the gTTS library, which calls Google Translate's text-to-speech API, for audio output.
  • Parallel Processing: Option for faster processing of multiple documents.
  • Rate Limit Management: Smart retry logic with exponential backoff.
  • Flexible CLI: Command-line interface for customizable configurations.
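
As a rough illustration of how the first two features could fit together, the sketch below tries the PDF's embedded text layer first and falls back to OCR only for pages with no extractable text. The helper names (`page_text`, `extract_pdf_text`), the 300 DPI render resolution, and the per-page fallback rule are assumptions made for illustration, not the script's actual code.

```python
from typing import Callable


def page_text(page, ocr: Callable[[object], str]) -> str:
    """Return the page's embedded text, or OCR output if there is none.

    `page` is expected to behave like a pdfplumber page; `ocr` is any
    callable that turns a rendered page image into a string.
    """
    text = (page.extract_text() or "").strip()
    if text:
        return text
    # No text layer found: render the page to an image and run OCR on it.
    image = page.to_image(resolution=300).original
    return ocr(image).strip()


def extract_pdf_text(pdf_path: str, tesseract_lang: str = "eng") -> str:
    """Extract text from every page of a PDF (hypothetical helper)."""
    # Imported lazily so page_text stays usable without these packages.
    import pdfplumber
    import pytesseract

    def ocr(image) -> str:
        return pytesseract.image_to_string(image, lang=tesseract_lang)

    with pdfplumber.open(pdf_path) as pdf:
        return "\n".join(page_text(page, ocr) for page in pdf.pages)
```

Note that Tesseract language codes (such as `eng`) differ from the two-letter codes gTTS accepts (such as `en`), so an implementation along these lines would need some mapping between the two.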

📋 Installation

Get started with these simple steps:

Prerequisites

  • Python 3.x
  • Required packages: pdfplumber, pytesseract, Pillow, gtts

Install Python Packages

pip install pdfplumber pytesseract Pillow gtts

Tesseract OCR

pytesseract requires Tesseract OCR. Install it from Tesseract's GitHub page.
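
Before running the script, it can help to confirm that the Tesseract binary is actually reachable. The following is a minimal check, assuming the executable is named `tesseract` and is on the PATH; the function name `tesseract_version` is made up for this example.

```python
import shutil
import subprocess
from typing import Optional


def tesseract_version() -> Optional[str]:
    """Return the first line of `tesseract --version`, or None if missing."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some Tesseract builds print version info to stderr rather than stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    version = tesseract_version()
    print(version or "Tesseract not found on PATH; install it before using OCR.")
```

If Tesseract is installed somewhere that is not on the PATH, pytesseract also lets you point at the binary by setting `pytesseract.pytesseract.tesseract_cmd`.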

🚀 Usage

Command Syntax

python main.py <input_folder> [--output_folder OUTPUT_FOLDER] [--audio_folder AUDIO_FOLDER] [options]

Arguments

  • input_folder: Folder containing PDF files.
  • output_folder (optional): Folder for saving text files (defaults to script directory).
  • audio_folder (optional): Folder for saving audio files (defaults to script directory).
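
One way the folder defaults described above might be resolved is sketched here. The function name `output_paths` is hypothetical, and the assumption that each output reuses the PDF's base name with `.txt` and `.mp3` extensions is a guess; the real script may name its files differently.

```python
from pathlib import Path
from typing import Optional, Tuple


def output_paths(pdf_path: Path, script_dir: Path,
                 output_folder: Optional[Path] = None,
                 audio_folder: Optional[Path] = None) -> Tuple[Path, Path]:
    """Return (text_path, audio_path) for one PDF.

    Assumption: outputs reuse the PDF's base name; only the folder and
    extension change.
    """
    # Fall back to the script directory when a folder is not given.
    text_dir = output_folder or script_dir
    audio_dir = audio_folder or script_dir
    return (text_dir / f"{pdf_path.stem}.txt",
            audio_dir / f"{pdf_path.stem}.mp3")
```

In a script, `script_dir` would typically be `Path(__file__).resolve().parent`.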

Options

  • --language: Language for conversion (default: 'en').
  • --parallel: Enable parallel processing (sequential by default).
  • --retry_delay: Delay in seconds for retrying conversion (default: 5).
  • --max_retries: Max retries for conversion (default: 10).
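
For readers who want to see how such an interface is typically wired up, here is a sketch of an `argparse` parser matching the arguments and options listed above. The flag names and defaults come from this README; the argument types (`float` for the delay, `int` for the retry count), the help strings, and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command syntax."""
    parser = argparse.ArgumentParser(description="Convert PDF files to audio.")
    parser.add_argument("input_folder", help="Folder containing PDF files.")
    # None here stands for "use the script directory".
    parser.add_argument("--output_folder", default=None,
                        help="Folder for saving text files.")
    parser.add_argument("--audio_folder", default=None,
                        help="Folder for saving audio files.")
    parser.add_argument("--language", default="en",
                        help="Language for conversion.")
    parser.add_argument("--parallel", action="store_true",
                        help="Enable parallel processing.")
    parser.add_argument("--retry_delay", type=float, default=5,
                        help="Delay in seconds for retrying conversion.")
    parser.add_argument("--max_retries", type=int, default=10,
                        help="Max retries for conversion.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Using `action="store_true"` for `--parallel` keeps processing sequential unless the flag is passed, which matches the documented default.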

Example

python main.py ./pdfs --output_folder ./texts --audio_folder ./audios --language fr --parallel --retry_delay 2 --max_retries 3

This command processes the PDFs in ./pdfs, saves the extracted text to ./texts and the audio to ./audios, and converts in French with parallel processing, a 2-second retry delay, and a maximum of 3 retries.
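
To make the retry settings more concrete, the sketch below shows one plausible reading of retrying with exponential backoff: the wait starts at `retry_delay` seconds and doubles after each failed attempt, up to `max_retries` retries. The doubling factor, the function name `retry_with_backoff`, and the broad `except Exception` are assumptions; the script's real logic may differ.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(action: Callable[[], T], retry_delay: float = 5,
                       max_retries: int = 10,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `action`, retrying on failure with a doubling delay.

    `max_retries` counts retries after the first attempt, so `action`
    runs at most `max_retries + 1` times.
    """
    attempt = 0
    delay = retry_delay
    while True:
        try:
            return action()
        except Exception:
            if attempt >= max_retries:
                raise  # out of retries: surface the last error
            sleep(delay)
            delay *= 2  # assumed backoff factor
            attempt += 1
```

With `retry_delay=2` and `max_retries=3`, a conversion that keeps failing would wait 2, 4, and 8 seconds before giving up. A real implementation would probably catch a narrower exception, such as gTTS's `gTTSError`, instead of every `Exception`.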

🤝 Contributing

Your contributions are welcome! Feel free to submit bug fixes, feature requests, or documentation improvements. Check out the issues and pull requests sections.

📄 License

This project is under the MIT License - see the LICENSE file for details.
