multiurl

A package to download several URLs as one, with support for multi-part URLs.

Simple example

from multiurl import download

download(url="http://example.com/test.data",
         target="data.file")

Download from two URLs into one file

from multiurl import download

download(url=["http://example.com/test1.data",
              "http://example.com/test2.data"],
         target="data.file")

URL types can be mixed:

from multiurl import download

download(url=["http://example.com/test1.data",
              "ftp://example.com/test2.data"],
         target="data.file")

Download parts of URLs

Provide parts of URLs as a list of (offset, length) tuples, expressed in bytes.

from multiurl import download

download(url="http://example.com/test.data",
         parts=[(0, 10), (40, 10), (60, 10)],
         target="data.file")
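
To make the (offset, length) convention concrete, here is a minimal sketch of how such tuples could be translated into an HTTP Range header, assuming a server that honours range requests. The helper name `parts_to_range_header` is hypothetical and not part of multiurl; this only illustrates the byte arithmetic, not how the package builds its requests.

```python
def parts_to_range_header(parts):
    """Convert (offset, length) tuples into an HTTP Range header value.

    Hypothetical helper for illustration only; not part of multiurl.
    """
    ranges = []
    for offset, length in parts:
        # HTTP byte ranges are inclusive, so the last byte is offset + length - 1.
        ranges.append(f"{offset}-{offset + length - 1}")
    return "bytes=" + ",".join(ranges)


print(parts_to_range_header([(0, 10), (40, 10), (60, 10)]))
# bytes=0-9,40-49,60-69
```

Each tuple therefore selects `length` bytes starting at `offset`, so the three parts above add up to 30 bytes in the target file.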

Download parts of URLs from several URLs

from multiurl import download

download(url=[("http://example.com/test1.data", [(0, 10), (40, 10), (60, 10)]),
              ("http://example.com/test2.data", [(0, 10), (40, 10), (60, 10)])],
         target="data.file")

License

Apache License 2.0

In applying this licence, ECMWF does not waive the privileges and immunities granted to it by virtue of its status as an intergovernmental organisation nor does it submit to any jurisdiction.

multiurl's People

Contributors

b8raoult, eddycmwf, floriankrb, iainrussell, sandorkertesz, tlmquintino

multiurl's Issues

fails to download ftp when size not understood

What happened?

multiurl fails to download from FTP servers if the SIZE command is not understood, as is the case when the FTP server is in ASCII mode. The size information is only used to construct a progress bar, so it is not essential for functionality.

(This may be a misconfiguration on the server side; however, failing to download because we can't build a progress bar is not ideal.)

What are the steps to reproduce the bug?

import multiurl
url="ftp://USERNAME:[email protected]/sl-c3s/Products/... .../dt_global_twosat_phy_l4_199301_vDT2021-M01.nc"
path = "temp.nc"

multiurl.download(url, path)

Please contact me offline for the full URL, USERNAME and PASSWORD.

Version

0.2.1

Platform (OS and architecture)

MacOS, Ubuntu

Relevant log output

---------------------------------------------------------------------------
error_perm                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 multiurl.download(url, path)

File ~/miniconda3/envs/CDS/lib/python3.10/site-packages/multiurl/downloader.py:111, in download(url, target, **kwargs)
    110 def download(url, target, **kwargs):
--> 111     return Downloader(url, **kwargs).download(target)

File ~/miniconda3/envs/CDS/lib/python3.10/site-packages/multiurl/base.py:119, in DownloaderBase.download(self, target)
    115     download = target
    117 LOG.info("Downloading %s", self.url)
--> 119 size, mode, skip, trust_size = self.estimate_size(download)
    121 with self.progress_bar(
    122     total=size,
    123     initial=skip,
    124     desc=self.title(),
    125 ) as pbar:
    126     with open(download, mode) as f:

File ~/miniconda3/envs/CDS/lib/python3.10/site-packages/multiurl/ftp.py:51, in FTPDownloaderBase.estimate_size(self, target)
     48 self.filename = os.path.basename(o.path)
     49 self.ftp = ftp
---> 51 return (ftp.size(self.filename), "wb", 0, True)

File ~/miniconda3/envs/CDS/lib/python3.10/ftplib.py:630, in FTP.size(self, filename)
    628 '''Retrieve the size of a file.'''
    629 # The SIZE command is defined in RFC-3659
--> 630 resp = self.sendcmd('SIZE ' + filename)
    631 if resp[:3] == '213':
    632     s = resp[3:].strip()

File ~/miniconda3/envs/CDS/lib/python3.10/ftplib.py:281, in FTP.sendcmd(self, cmd)
    279 '''Send a command and return the response.'''
    280 self.putcmd(cmd)
--> 281 return self.getresp()

File ~/miniconda3/envs/CDS/lib/python3.10/ftplib.py:254, in FTP.getresp(self)
    252     raise error_temp(resp)
    253 if c == '5':
--> 254     raise error_perm(resp)
    255 raise error_proto(resp)

error_perm: 550 SIZE not allowed in ASCII mode
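
One possible way to tolerate this, sketched under assumptions rather than taken from multiurl's code: ask for binary mode before issuing SIZE, and fall back to an unknown size if the server still refuses. The helper name `safe_ftp_size` is hypothetical; `ftp` is assumed to be a connected `ftplib.FTP` object.

```python
import ftplib


def safe_ftp_size(ftp, filename):
    """Return the remote file size in bytes, or None if SIZE is refused.

    Hypothetical helper for illustration only; not part of multiurl.
    """
    try:
        # Many servers only answer SIZE in binary (image) mode.
        ftp.voidcmd("TYPE I")
        return ftp.size(filename)
    except ftplib.error_perm:
        # e.g. "550 SIZE not allowed in ASCII mode": continue without a total.
        return None
```

A caller could then pass the result as the progress bar total, with `None` meaning the total is unknown, instead of aborting the download.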


Accompanying data

No response

Organisation

ECMWF

Publish wheel to PyPI

Is your feature request related to a problem? Please describe.

On PyPI, this package is published only as a source distribution, not as a Python wheel. Installing the package therefore requires running the build step locally. On platforms where this is not (easily) possible, e.g. inside a Pyodide environment, this limitation precludes using multiurl.

Describe the solution you'd like

This package is a pure Python package which does not depend on any platform-dependent C code. Therefore, a pure Python wheel (one that ends in *py3-none-any.whl) can be published to PyPI. This published wheel can then be downloaded to install multiurl without running the build script locally. This is useful, e.g. to use the package inside Pyodide.

This should be as simple as switching the build step from

pip install setuptools wheel twine
python setup.py sdist

to

pip install build twine
python -m build
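
As a quick sanity check after building, the wheel's filename tags can be inspected to confirm it is platform-independent. This is a minimal sketch based on the standard wheel naming scheme (`...-{python tag}-{abi tag}-{platform tag}.whl`); the helper name `is_pure_python_wheel` and the example filenames are illustrative assumptions.

```python
def is_pure_python_wheel(filename):
    """Return True if the wheel's ABI tag is 'none' and its platform tag is 'any'.

    Hypothetical helper for illustration only.
    """
    if not filename.endswith(".whl"):
        return False
    # Wheel names end with {python tag}-{abi tag}-{platform tag}.whl
    fields = filename[: -len(".whl")].split("-")
    if len(fields) < 5:
        return False
    _python_tag, abi_tag, platform_tag = fields[-3:]
    return abi_tag == "none" and platform_tag == "any"


# Illustrative filenames, not actual release artefacts.
print(is_pure_python_wheel("multiurl-0.2.4-py3-none-any.whl"))
print(is_pure_python_wheel("example-1.0-cp310-cp310-manylinux_2_17_x86_64.whl"))
```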

Describe alternatives you've considered

No response

Additional context

No response

Organisation

University of Helsinki

Fix the CI actions, tests and pypi config

What happened?

  • The unit tests hang because some URLs are unreachable
  • The PyPI release actions are very repetitive and inefficient
  • The versioning for PyPI releases is not correctly configured (should not require manual setting)
  • The tests run on old versions of Python

What are the steps to reproduce the bug?

Trigger the CI actions
Release a version

Version

0.2.4

Platform (OS and architecture)

NA

Relevant log output

No response

Accompanying data

No response

Organisation

No response

Conditional progress bar

Is your feature request related to a problem? Please describe.

The progress bar is always active, and there is no easy way to disable it, because the call to tqdm passes "disable=False". Omitting this argument seems to do exactly what I would like: "If set to None, disable on non-TTY".

It would be great if there were a way to disable the progress bar in non-TTY situations.
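
For illustration, the non-TTY check itself can be sketched with the standard library alone. The helper name `progress_disabled` is hypothetical and not part of multiurl or tqdm; it assumes the progress bar writes to stderr, which is tqdm's default output stream.

```python
import sys


def progress_disabled(stream=None):
    """Return True when `stream` is not attached to a terminal.

    Hypothetical helper for illustration only.
    """
    stream = sys.stderr if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    # Streams without an isatty() method are treated as non-TTY.
    return not (callable(isatty) and isatty())


# Possible use (assumption): tqdm(..., disable=progress_disabled())
print(progress_disabled())
```

Passing `disable=None` to tqdm is documented to have a similar effect, so a change on the multiurl side might only need to stop hard-coding `disable=False`.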

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

Organisation

No response
