Giter Club home page Giter Club logo

baidu_image's Introduction

百度图片爬虫

用于爬取一定类型的图片进行深度学习训练

Start

pip install -r requirement -i https://pypi.douban.com/simple

Use

python function.py

效果

使用文档

get_images_rul

class Crawler:

    @staticmethod
    def get_images_url(word: str, num: int, original: bool = True,
                       timeout: int = __CONCURRENT_TIMEOUT) -> (bool, bool, list):

参数

  • word:str:搜索关键词
  • num:int:搜索数量
  • original:bool, optional:是否下载原图,默认为False
  • timeout:int, optional:请求timeout,默认为60s

返回

  • net:bool:网络连接状态
  • num:bool:图片数量是否足够
  • urls:list:获取的urls,每项为一个dict,

download_images

class Crawler:

    @staticmethod
    def download_images(urls: list, rule: tuple = ('.png', '.jpg'),
                        path: str = 'download', timeout: int = __CONCURRENT_TIMEOUT,
                        concurrent: int = __CONCURRENT_NUM, command: bool = True) -> (int, int):

参数

  • urls: list: 需要爬的图片列表,格式与get_images_url返回的相同
  • rule: tuple, optional: 允许下载的格式,默认为('.png', '.jpg')
  • path: str, optional: 图片下载的路径,默认为'download'
  • timeout: int, optional: 请求 timeout, 默认为60(s)
  • concurrent: int, optional: 并行下载的数量,默认为100
  • command: bool, optional: 是否在控制台显示进度条,默认为True

返回

  • success: int: 下载成功的数量
  • failed: int: 下载失败的数量

baidu_image's People

Contributors

feng1909 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.