
ipproxypool's Introduction

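An IP proxy pool implemented in Golang

Collects free proxy resources and provides working IP proxies for crawlers.

Features

  • Automatically crawls free proxy IPs that are publicly available on the internet
  • Periodically verifies that the proxy IPs still work
  • Provides an HTTP interface for retrieving usable IPs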

System architecture

[Image: architecture diagram]

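Proxy pool design

The proxy pool consists of four parts:

  • Fetcher:

The proxy-fetching interface. Several free proxy sources are currently supported; each call scrapes the latest proxies from those sites and puts them into the Channel. You can add your own fetchers.

  • Channel:

Temporarily holds the scraped proxies. Each proxy is validated by using it to request a stable website; proxies that pass are stored in the database.

  • Schedule:

A scheduled task periodically checks whether the proxy IPs in the database are still usable and deletes those that are not. It also calls the Fetcher to pull in fresh proxies.

  • Api:

The access interface of the proxy pool. It exposes get endpoints that return JSON, so crawlers can consume the output directly.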
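
The Fetcher, Channel, and validation flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names Proxy, Fetcher, CheckFunc, and runOnce are hypothetical, and the validity check is passed in as a function so the sketch runs without any network access.

```go
package main

import (
	"fmt"
	"sync"
)

// Proxy is a hypothetical record for one candidate proxy.
type Proxy struct {
	Host string // e.g. "10.0.0.1:8080"
	Type string // "http" or "https"
}

// Fetcher is a hypothetical proxy source: each call returns fresh candidates.
type Fetcher func() []Proxy

// CheckFunc reports whether a candidate proxy is usable.
type CheckFunc func(Proxy) bool

// runOnce runs every fetcher, pushes the results through a channel,
// validates them with a pool of workers, and returns the proxies that pass.
func runOnce(fetchers []Fetcher, check CheckFunc, workers int) []Proxy {
	if workers < 1 {
		workers = 1
	}
	candidates := make(chan Proxy, 64)

	// Producers: one goroutine per fetcher feeds the channel.
	var fetchWG sync.WaitGroup
	for _, f := range fetchers {
		fetchWG.Add(1)
		go func(f Fetcher) {
			defer fetchWG.Done()
			for _, p := range f() {
				candidates <- p
			}
		}(f)
	}
	// Close the channel once every fetcher has finished.
	go func() {
		fetchWG.Wait()
		close(candidates)
	}()

	// Consumers: workers drain the channel and keep the valid proxies.
	var (
		mu      sync.Mutex
		valid   []Proxy
		checkWG sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		checkWG.Add(1)
		go func() {
			defer checkWG.Done()
			for p := range candidates {
				if check(p) {
					mu.Lock()
					valid = append(valid, p)
					mu.Unlock()
				}
			}
		}()
	}
	checkWG.Wait()
	return valid
}

func main() {
	// A stand-in fetcher and check; a real pool would scrape websites
	// and test each proxy against a stable URL.
	static := func() []Proxy {
		return []Proxy{{Host: "10.0.0.1:8080", Type: "http"}, {Host: "", Type: "http"}}
	}
	nonEmpty := func(p Proxy) bool { return p.Host != "" }

	valid := runOnce([]Fetcher{static}, nonEmpty, 4)
	fmt.Println("valid proxies:", len(valid))
}
```

In a real pool the slice returned by runOnce would be written to the database instead, and a scheduled task would call runOnce again at a fixed interval.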

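Currently supported proxy sources

The fetchers currently scrape free proxies from a handful of websites; you can also extend the pool with your own fetchers.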

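Installation and usage

Install from source

# Clone the project
git clone https://github.com/wuchunfu/IpProxyPool.git

# Change into the project directory
cd IpProxyPool

# Edit the database settings
vi conf/config.yaml

host: 127.0.0.1
dbName: IpProxyPool
username: IpProxyPool
password: IpProxyPool

# Run the SQL script to create the database tables (from inside the mysql client)
source docs/db/mysql.sql

# Install the Go dependencies
go mod tidy

# Build
go build IpProxyPool.go

# Make the binary executable
chmod +x IpProxyPool

# Run
./IpProxyPool proxy-pool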

Install with Docker

Install Docker yourself first. Once Docker is installed, check that docker-compose is installed as well by running docker-compose --version.

# Clone the project
git clone https://github.com/wuchunfu/IpProxyPool.git

# Change into the project directory
cd IpProxyPool

# Start the services
docker-compose -f docker-compose.yaml up -d

# Stop the services
docker-compose -f docker-compose.yaml down

Access

# Web access
http://127.0.0.1:3000

# or
# Return a random usable proxy
curl http://127.0.0.1:3000/all

# Return a random HTTP proxy
curl http://127.0.0.1:3000/http

# Return a random HTTPS proxy
curl http://127.0.0.1:3000/https
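
A crawler written in Go could call the same endpoints along these lines. This is only a sketch: the helper names endpointURL and fetchProxyJSON are hypothetical, and because the JSON field layout is not documented here, the response is returned as raw JSON rather than decoded into a struct.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// endpointURL builds the URL for one of the three documented endpoints:
// "all", "http", or "https".
func endpointURL(base, kind string) (string, error) {
	switch kind {
	case "all", "http", "https":
		return strings.TrimRight(base, "/") + "/" + kind, nil
	default:
		return "", fmt.Errorf("unknown endpoint %q", kind)
	}
}

// fetchProxyJSON requests one endpoint and returns the body as raw JSON.
// The field layout is left to the caller because it is an unknown here.
func fetchProxyJSON(client *http.Client, base, kind string) (json.RawMessage, error) {
	u, err := endpointURL(base, kind)
	if err != nil {
		return nil, err
	}
	resp, err := client.Get(u)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	// Cap the read at 1 MiB so a misbehaving server cannot exhaust memory.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
	if err != nil {
		return nil, err
	}
	if !json.Valid(body) {
		return nil, fmt.Errorf("response is not valid JSON")
	}
	return json.RawMessage(body), nil
}

func main() {
	// Always set a timeout so a stalled pool cannot hang the crawler.
	client := &http.Client{Timeout: 5 * time.Second}

	raw, err := fetchProxyJSON(client, "http://127.0.0.1:3000", "https")
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Println(string(raw))
}
```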

Scheduled tasks

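Acknowledgements

  • First of all, thank you for using this project. If you find it useful and it solves a real problem for you, please consider giving it a star to encourage further work. Thanks!
  • If you have any suggestions or feedback, you are welcome to open an issue.
  • And if you would like to contribute code and improve the project together, even better.

Notice

This repository is for study and research only. Do not use it for illegal purposes; the author accepts no responsibility for any legal issues that arise from such use.

Contact

Follow the 全栈公园 WeChat official account. If you have questions, send the keyword 开源交流 to the 全栈公园 account.

[Image: 全栈公园]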


ipproxypool's Issues

Docker build fails

=> ERROR [builder 5/6] RUN xz -d -c /usr/local/upx-3.96-amd64_linux.tar.xz | tar -xOf - upx-3.96-amd64_linux/upx > /bin/upx && chmod a+x /bin/upx 0.4s

[builder 5/6] RUN xz -d -c /usr/local/upx-3.96-amd64_linux.tar.xz | tar -xOf - upx-3.96-amd64_linux/upx > /bin/upx && chmod a+x /bin/upx:
#14 0.418 /bin/sh: xz: not found
#14 0.418 tar: short read
#14 0.418 tar: upx-3.96-amd64_linux/upx: not found in archive


executor failed running [/bin/sh -c xz -d -c /usr/local/upx-3.96-amd64_linux.tar.xz | tar -xOf - upx-3.96-amd64_linux/upx > /bin/upx && chmod a+x /bin/upx]: exit code: 1
ERROR: Service 'proxypool' failed to build : Build failed

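Built with Docker; stops responding after running for a while

It works right after deployment, but stops responding after it has been running for some time.

The logs contain only checkIP messages such as: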

time="2022-05-16T22:23:11+08:00" level=warning msg="testIp: http://213.6.98.169:8080, testUrl: http://httpbin.org/get: error msg: Get \"http://httpbin.org/get\": proxyconnect tcp: dial tcp 213.6.98.169:8080: i/o timeout" func=github.com/wuchunfu/IpProxyPool/middleware/storage.CheckIp file="github.com/wuchunfu/IpProxyPool/middleware/storage/filter.go:73"

At this point, requests to :3000/all block and never return.

Panic

INFO[0006]/Users/allen/go/src/github.com/IpProxyPool/fetcher/fetcher.go:17 github.com/wuchunfu/IpProxyPool/fetcher.Fetch() Fetch url: http://www.66ip.cn/100.html
INFO[0006]/Users/allen/go/src/github.com/IpProxyPool/fetcher/ip66/ip66.go:47 github.com/wuchunfu/IpProxyPool/fetcher/ip66.Ip66() [66ip] fetch done
INFO[0006]/Users/allen/go/src/github.com/IpProxyPool/run/run.go:64 github.com/wuchunfu/IpProxyPool/run.run() All getters finished.
ERRO[0007]/Users/allen/go/src/github.com/IpProxyPool/models/ipModel/ipModel.go:44 github.com/wuchunfu/IpProxyPool/models/ipModel.GetIpByProxyHost() get ip: , error msg: Error 1146: Table 'proxypool.proxy_ip' doesn't exist
ERRO[0007]/Users/allen/go/src/github.com/IpProxyPool/models/ipModel/ipModel.go:44 github.com/wuchunfu/IpProxyPool/models/ipModel.GetIpByProxyHost() get ip: , error msg: Error 1146: Table 'proxypool.proxy_ip' doesn't exist
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x15347b2]

goroutine 75 [running]:
github.com/wuchunfu/IpProxyPool/models/ipModel.SaveIp(0xc000496680)
/Users/allen/go/src/github.com/IpProxyPool/models/ipModel/ipModel.go:26 +0x72
github.com/wuchunfu/IpProxyPool/middleware/storage.CheckProxy(0xc000496680)
/Users/allen/go/src/github.com/IpProxyPool/middleware/storage/filter.go:21 +0x4a
github.com/wuchunfu/IpProxyPool/run.Task.func2(0xc000297920)
/Users/allen/go/src/github.com/IpProxyPool/run/run.go:26 +0x4c
created by github.com/wuchunfu/IpProxyPool/run.Task
/Users/allen/go/src/github.com/IpProxyPool/run/run.go:24 +0x89
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x15347b2]

goroutine 69 [running]:
github.com/wuchunfu/IpProxyPool/models/ipModel.SaveIp(0xc000323580)
/Users/allen/go/src/github.com/IpProxyPool/models/ipModel/ipModel.go:26 +0x72
github.com/wuchunfu/IpProxyPool/middleware/storage.CheckProxy(0xc000323580)
/Users/allen/go/src/github.com/IpProxyPool/middleware/storage/filter.go:21 +0x4a
github.com/wuchunfu/IpProxyPool/run.Task.func2(0xc000297920)
/Users/allen/go/src/github.com/IpProxyPool/run/run.go:26 +0x4c
created by github.com/wuchunfu/IpProxyPool/run.Task
/Users/allen/go/src/github.com/IpProxyPool/run/run.go:24 +0x89

Keep it up 💪🏻

mv docker-compose.yaml docker-compose.yml
docker-compose up -d
docker-compose down
