
scrape_indianculture.gov.in_release

Read Me

v2

Added a new category: Other Collection. It contains mostly catalogues. It has not been checked yet, so please report any issues you find. Feel free to fork and submit corrections.

Info

indianculture.gov.in, a website run by the Government of India, hosts many rare books, manuscripts, eBooks and other items.

This script was created to collect those books.

The rare books total more than 1 TB and the manuscripts more than 130 GB. The total size of the eBooks is not known.

This script was written to practise Python. Please don't abuse the website: use the script to download only the items you need.
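
One way to stay polite to the website is to pause between downloads. The following is only a minimal sketch and is not part of the released script: the `Throttle` class, its `min_interval` parameter and the injectable `clock` and `sleep` arguments are hypothetical names chosen for illustration.

```python
import time


class Throttle:
    """Enforce a minimum gap (in seconds) between successive calls to wait().

    Hypothetical helper, not part of the released script.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last = None        # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous call: sleep for the rest of the gap.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` just before each download request (for example with `throttle = Throttle(2.0)`) would keep requests at least two seconds apart. The interval is an arbitrary example value.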

How to Use

  1. Create a new virtual environment, then either activate it (source it) or put its Python interpreter in the shebang line of the main script.

  2. Install requests and bs4

     pip install requests
    
     pip install bs4
    
  3. cd to the directory where the script is located.

  4. Run the script in one of two ways:

    a. pass it to the Python interpreter

     python ./v1_DownloadAllBooksFromIndianCultureGovIn_Release_270320221.py
    

    OR,

    b. make the script executable and then run it directly. For this to work, the shebang line must point to your environment's Python interpreter.

     chmod +x ./v1_DownloadAllBooksFromIndianCultureGovIn_Release_270320221.py
    
     ./v1_DownloadAllBooksFromIndianCultureGovIn_Release_270320221.py
    

Automatic download

To download all three categories of PDF (rare books, manuscripts and eBooks) without being prompted for each category, replace

        download_this_category = input('do you want to download this category of PDF? yes(y), No(n)\n')

with

        download_this_category = 'y'

To download every PDF in each category without being prompted for each book, replace

    download_this_book = input('Do you want to download this book. Yes(y), No(n)?\n')

with

    download_this_book = 'y'
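
As an alternative to editing both prompts by hand, the yes/no question could be wrapped in a small helper that skips the prompt when an automatic mode is switched on. This is only a sketch under assumptions: `ask_yes_no`, `auto_yes` and `input_fn` are hypothetical names and do not exist in the released script.

```python
def ask_yes_no(prompt, auto_yes=False, input_fn=input):
    """Return the user's answer in lower case, or 'y' without asking.

    Hypothetical helper, not part of the released script.
    auto_yes -- when True, skip the prompt and answer 'y'
    input_fn -- function used to read the answer (input by default)
    """
    if auto_yes:
        return 'y'
    # Normalise the answer so that ' Y ' and 'y' are treated the same.
    return input_fn(prompt).strip().lower()
```

With such a helper, the two `input(...)` calls shown above could become `ask_yes_no(..., auto_yes=AUTO)`, where `AUTO` is a single flag (again an assumed name) set once at the top of the script.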
