aparatdownloader's Introduction

Aparat Downloader

Using this project, you can easily download all of the videos in an Aparat playlist. You can choose the video quality and specify the download folder.

License: MIT

Contributing

We really appreciate contributions to this repo. Please ask any questions you have, and feel free to file issues if needed. Please refer to the project's style and contribution guidelines when submitting patches and additions. In general, we follow the "fork-and-pull" Git workflow.

  1. Fork the repo on GitHub
  2. Clone the project to your own machine
  3. Commit changes to your own branch
  4. Push your work back up to your fork
  5. Submit a Pull Request (PR) so that we can review your changes

NOTE: Be sure to merge the latest from "upstream" before making a pull request!

Requirements

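The project depends on the following packages (see the Installation steps below):

  • selenium
  • webdriver_manager
  • wget
  • os (part of the Python standard library, nothing to install)
  • tkinter (bundled with most Python installations)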

Installation

To run the project from source, take the following steps (the activation command shown is for Windows; on Linux or macOS use source .venv/bin/activate):

git clone https://github.com/edrisranjbar/AparatDownloader.git
cd AparatDownloader
virtualenv -p=python3 .venv
.\.venv\Scripts\activate
pip install -r requirements.txt
python AparatDownloader.py

aparatdownloader's People

Contributors

dependabot[bot], edrisranjbar

aparatdownloader's Issues

write a good README file

We should update the README file and take some ideas from other public open source repos.

Some of the improvements we want to make:

  • add a better description
  • add license badge
  • add contributors list as a table
  • update contributing section
  • improve installation guideline
  • add language and tech used in the project

Selenium4 compatibility problem

We get the message below when trying to run the script:
DeprecationWarning: executable_path has been deprecated, please pass in a Service object

Problem

Note that the project uses a very old version of Selenium (I wrote this project 3 years ago 😎). Newer versions of Selenium changed the API and expect the Chrome driver to be passed in through a Service object, so we need to migrate our code to the new Selenium.

A proper solution

We should update the imports and change our code to use the Chrome Service object.
Here is the code for importing the new Service class:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

And then create the browser with a Service:

browser = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

use a better way of waiting for page to be loaded

Currently we use the time.sleep() method to make the script wait for X seconds before running the next line of code. We need a proper way to make it wait until the page has loaded.

Problem

We do not know exactly how long Aparat pages take to load. Using a large sleep value makes the script inefficient, while a small one risks running the next step before the page is ready.

Solution

There is a proper way to do this: we can use WebDriverWait, which lives in selenium.webdriver.support.ui. Its constructor takes two required arguments: the first is the driver and the second is the timeout value in seconds.

Use PyQt5 for GUI

For several reasons I prefer PyQt5 over tkinter, so we need to port the code to PyQt. PyQt also works with a graphical designer (Qt Designer) in which we can build the UI by drag and drop, then load the UI file in the Python script and add functionality to it.

selenium Attribute Error while trying to download

When we click the download button in the GUI, we get an error in the terminal that says AttributeError: 'WebDriver' object has no attribute 'find_elements_by_css_selector'. The whole error log is below:

DevTools listening on ws://127.0.0.1:2621/devtools/browser/4a3d03fb-d71e-4bf1-93e9-67f63797b891
[6084:5316:0803/110134.485:ERROR:device_event_log_impl.cc(214)] [11:01:34.485] Bluetooth: bluetooth_adapter_winrt.cc:1074 Getting Default Adapter failed.
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Users\edris\AppData\Local\Programs\Python\Python310\lib\tkinter\__init__.py", line 1921, in __call__
    return self.func(*args)
  File "C:\Users\edris\Desktop\Projects\AparatDownloader\AparatDownloader.py", line 80, in download
    downloader.downloadFromPlayList(txtUrl.get())
  File "C:\Users\edris\Desktop\Projects\AparatDownloader\AparatDownloader.py", line 54, in downloadFromPlayList
    links = browser.find_elements_by_css_selector(
AttributeError: 'WebDriver' object has no attribute 'find_elements_by_css_selector'
[6084:11296:0803/110232.263:ERROR:util.cc(127)] Can't create base directory: C:\Program Files\Google\GoogleUpdater
Traceback (most recent call last):
  File "C:\Users\edris\Desktop\Projects\AparatDownloader\AparatDownloader.py", line 94, in <module>
    root.mainloop()
  File "C:\Users\edris\AppData\Local\Programs\Python\Python310\lib\tkinter\__init__.py", line 1458, in mainloop
    self.tk.mainloop(n)
KeyboardInterrupt

What the heck is wrong here? 🙄

Selenium 4 locates elements differently than it did 3 years ago: the find_elements_by_* helper methods have been removed, so we need to port the code to the new version.

How to fix the issue?

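As the docs say, we should add from selenium.webdriver.common.by import By and call browser.find_elements(By.CSS_SELECTOR, selector) in place of browser.find_elements_by_css_selector(selector). For a single element located by XPath, the equivalent is browser.find_element(By.XPATH, selector).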

write unit tests

We should write tests to make sure that the scraper still works fine.

TODO:

Here are some steps to take

  • test that all requirements are met
  • test that URL validation works
  • test we can get download link of a single video from URL
  • test that the app returns a proper error when the URL does not exist
  • test that we can download and save the actual video file
  • test that we can get all video links from a playlist
  • test that the given URL is a playlist
  • test that we can get all resolutions available for a video
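As one possible starting point for the URL checks in the list above, here is a small unittest sketch. The helper is_playlist_url is hypothetical, and the assumption that playlist URLs look like https://www.aparat.com/playlist/<id> should be verified against the real site before relying on it.

```python
import unittest
from urllib.parse import urlparse


def is_playlist_url(url):
    """Hypothetical helper: True if `url` looks like an Aparat playlist URL.

    Assumes playlists live under /playlist/<id> on aparat.com.
    """
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    is_aparat = host == "aparat.com" or host.endswith(".aparat.com")
    parts = [part for part in parsed.path.split("/") if part]
    return (
        parsed.scheme in ("http", "https")
        and is_aparat
        and len(parts) == 2
        and parts[0] == "playlist"
    )


class PlaylistUrlTests(unittest.TestCase):
    def test_accepts_playlist_url(self):
        self.assertTrue(is_playlist_url("https://www.aparat.com/playlist/12345"))

    def test_rejects_single_video_url(self):
        self.assertFalse(is_playlist_url("https://www.aparat.com/v/abcde"))

    def test_rejects_other_sites(self):
        self.assertFalse(is_playlist_url("https://example.com/playlist/12345"))

    def test_rejects_text_that_is_not_a_url(self):
        self.assertFalse(is_playlist_url("not a url"))
```

Saved in a test file, these can be run with python -m unittest. The tests that need the network (download links, resolutions, saving files) would follow the same shape but are better kept separate so the fast checks stay fast.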

Migrate from selenium to BeautifulSoup

We should migrate from Selenium to the Beautiful Soup Python library to make the script faster and more efficient. The difference between the two is that Selenium uses a browser driver to open the page in a real browser on the machine and crawl it, whereas with requests and bs4 we fetch the page over HTTP and parse the HTML directly to extract the data needed for downloading.

See the Beautiful Soup (bs4) library documentation for details.
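To illustrate the parsing side of that approach, the sketch below pulls links out of an HTML string with bs4. The helper name extract_links, the sample markup, and the a.video selector are all made up for the example and do not reflect Aparat's real page structure.

```python
from bs4 import BeautifulSoup


def extract_links(html, css_selector):
    """Return the href of every tag in `html` that matches `css_selector`."""
    soup = BeautifulSoup(html, "html.parser")  # parser from the standard library
    return [tag["href"] for tag in soup.select(css_selector) if tag.has_attr("href")]


# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li><a class="video" href="/v/first">First</a></li>
  <li><a class="video" href="/v/second">Second</a></li>
  <li><a class="other" href="/about">About</a></li>
</ul>
"""

links = extract_links(SAMPLE_HTML, "a.video")
print(links)  # ['/v/first', '/v/second']
```

In the real script the HTML would come from something like requests.get(url).text. This only works if the links are present in the HTML the server returns; content that is added later by JavaScript would still need a browser or a different data source.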

INVALID: Open up browser immediately after running the script

Right now, running the script immediately opens a new Chrome browser with an empty tab, which is not good at all because the user loses focus from the GUI app. We need to open the browser only after the user clicks download.
This way the user stays focused on what they are going to do.
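One way to defer the browser start is to wrap its creation in a small lazy holder, sketched below. The class name LazyBrowser and its instance() method are hypothetical; the factory can be any zero-argument callable that builds the driver.

```python
class LazyBrowser:
    """Create the browser on first use instead of at program start."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable that builds the browser
        self._browser = None

    def instance(self):
        # The factory runs only on the first call; later calls reuse the result.
        if self._browser is None:
            self._browser = self._factory()
        return self._browser
```

The GUI would build the holder at startup, for example with a lambda that calls webdriver.Chrome(...), and call instance() inside the download callback, so no window opens until the button is clicked.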

Separate core scraping functionality from GUI

We need to refactor the code so that the GUI code is separated from the rest. In our case we can create a new class called Scrapper and put all of the code for fetching and scraping web pages inside it. Then we can import that class wherever we need it.

TODO:

Here are some of the steps we should take in order to make the code cleaner.

  • have a separate core class for scraping, called Scrapper
  • put Scrapper in another file and import it in AparatDownloader.py
  • rename methods
  • refactor the Scrapper class
  • the main file should only contain GUI code
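The split described in the steps above could look roughly like this. Only the class name Scrapper comes from the issue; the method names, the injected find_links callable, and the on_download_clicked callback are assumptions made to keep the sketch free of Selenium and tkinter imports.

```python
from typing import Callable, Iterable, List


class Scrapper:
    """Core scraping logic with no GUI imports (would live in its own file)."""

    def __init__(self, find_links: Callable[[str], Iterable[str]]):
        # `find_links` does the actual page access (Selenium, bs4, ...).
        self._find_links = find_links

    def get_playlist_links(self, playlist_url: str) -> List[str]:
        """Return the playlist's video links, without blanks or duplicates."""
        links: List[str] = []
        for link in self._find_links(playlist_url):
            if link and link not in links:
                links.append(link)
        return links


def on_download_clicked(scrapper: Scrapper, url: str, report: Callable[[str], None]):
    """What a GUI callback could reduce to: read input, delegate, report."""
    links = scrapper.get_playlist_links(url.strip())
    report(f"Found {len(links)} video(s)")
    return links
```

Because the page access is passed in, the class can be tested with a plain function in place of a real browser, which also fits the unit test issue.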

[Docs]: Dependencies not complete in ReadMe.md and bloated in requirements.txt

The README.md doesn't include a full list of the required dependencies.
For example, webdriver_manager also has to be installed.

So I looked into requirements.txt, which includes a lot of dependencies that are not even required for testing.

Suggestion:
Add every module that is required to run the project to requirements.txt, and create a secondary file for testing requirements.
