squid-in-a-can's Introduction

Transparent Squid in a container

This is a trivial Dockerfile to build a proxy container. It will use the famous Squid proxy, configured to work in transparent mode.

Why?

If you build a lot of containers, and have a not-so-fast internet link, you might be spending a lot of time waiting for packages to download. It would be nice if all those downloads could be automatically cached, without tweaking your Dockerfiles, right?

Or, maybe your corporate network forbids direct outside access and requires you to use a proxy. Then you can edit this recipe so that it cascades to the corporate proxy. Your containers will use the transparent proxy, which itself will pass along to the corporate proxy.

How?

You can use the squid proxy directly via docker and iptables rules. There is also a docker-compose.yml for convenience, so you can launch the system with the docker-compose up command. For more information on tuning parameters, see below.

Using Docker and iptables directly

You can manually run these commands:

docker run --net host -d jpetazzo/squid-in-a-can
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w

After you stop the container, you will need to clean up the iptables rule:

iptables -t nat -D PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w
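
If you suspect the rule was not added, or was added more than once (for example after a crash), you can list the NAT PREROUTING chain with rule numbers before cleaning up:

iptables -t nat -L PREROUTING -n --line-numbers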

Using Compose

There is a docker-compose.yml file to enable launching via docker-compose, together with a separate container which will set up the iptables rules for you. To use this you will need a local checkout of this repo, and docker and docker-compose installed.

Run the following command in the same directory as the docker-compose.yml file:

docker-compose up
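
Once it is up, the usual Compose commands apply (service names depend on the docker-compose.yml in your checkout); for example:

docker-compose ps
docker-compose stop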

Result

That's it. Now all HTTP requests going through your Docker host will be transparently routed through the proxy running in the container.

If your tproxy instance goes down hard without cleaning up, use the following command:

iptables -t nat -D PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w

Note: it will only affect HTTP traffic on port 80.

Note: traffic originating from the host will not be affected, because the PREROUTING chain is not traversed by packets originating from the host.

Note: if your Docker host is also a router for other things (e.g. if it runs various virtual machines, or is a VPN server, etc), those things will also see their HTTP traffic routed through the proxy. They have to use internal IP addresses, though.

Note: if you plan to run this on EC2 (or any kind of infrastructure where the machine has an internal IP address), you should probably tweak the ACLs, or make sure that outside machines cannot access ports 3128 and 3129 on your host.

Note: it will also be available as a regular proxy on port 3128 on your local machine, if you would like to set up local proxies yourself.
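
For example, to point a single command at the proxy explicitly (assuming the container is running on the local host), something like this should work:

curl -x http://127.0.0.1:3128 http://example.com/

or, for a whole shell session:

export http_proxy=http://127.0.0.1:3128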

What?

The jpetazzo/squid-in-a-can container runs a really basic Squid3 proxy. Rather than writing my own configuration file, I patch the default Debian configuration. The main thing is to enable interception on another port (here, 3129). To update the iptables rules for the interception, the container doing so needs the --privileged flag.
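
As a rough sketch (the actual patch in this repo may differ in detail), the change amounts to adding an intercept listener next to the standard one, e.g.:

echo "http_port 3129 intercept" >> /etc/squid3/squid.conf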

Then, this container should be started using the network namespace of the host (that's what the --net host option is for). Another strategy would be to start the container with its own namespace; then, the HTTP traffic could be directed to it with a DNAT rule. The problem with this approach is that Squid will "see" the traffic as being directed to its own IP address, instead of the destination HTTP server's IP address; and since Squid 3.3, it refuses to honor such requests.

(The reasoning is that it would then have to trust the HTTP Host: header to know where to send the request. You can check CVE-2009-0801 for details.)

Tuning

The docker image can be tuned using environment variables.

MAX_CACHE_OBJECT

Squid has a maximum cached object size. When caching Debian packages rather than standard web content, it is often valuable to increase this size. Use -e MAX_CACHE_OBJECT=1024 to set the maximum object size (in MB).

DISK_CACHE_SIZE

The squid disk cache size can also be tuned. Use -e DISK_CACHE_SIZE=5000 to set the disk cache size (in MB).
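
For example, a run combining both tuning knobs (the values here are illustrative, not recommendations):

docker run --net host -d \
    -e MAX_CACHE_OBJECT=1024 \
    -e DISK_CACHE_SIZE=5000 \
    jpetazzo/squid-in-a-can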

SQUID_DIRECTIVES_ONLY

When this variable is set, the contents of squid.conf will be only what is defined in SQUID_DIRECTIVES, giving the user full control of Squid.

SQUID_DIRECTIVES

This will append the contents of the environment variable to squid.conf. It is expected that you will use a multi-line block quote for the contents.

Here is an example:

docker run -d \
    -e SQUID_DIRECTIVES="
    # hi ho hi ho
    # we're doing block I/O
    # hi ho hi ho
    " jpetazzo/squid-in-a-can

Persistent Cache

Since this is Docker, the cached content goes away as soon as the container stops. To avoid this, you can use a mounted volume. The cache location is /var/cache/squid3, so if you mount that as a volume you get persistent caching. Use -v /home/user/persistent_squid_cache:/var/cache/squid3 in your command line to enable persistent caching.

If you do that, make sure that the persistent_squid_cache directory is writable by the right user. As I write these lines, the squid process runs as user and group proxy, and their UID and GID are both 13; so make sure that the directory is writable by UID 13, or by GID 13, or (if you really can't do otherwise) world-writable (but please don't).
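
For instance, on the host (using the example path above), one way to prepare the directory is:

mkdir -p /home/user/persistent_squid_cache
sudo chown 13:13 /home/user/persistent_squid_cache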

Note that if you're using Docker for Mac, all volume I/O is handled by the Docker for Mac application, which runs as an ordinary process; so you won't have to deal with permissions as long as you have read/write access to the volume.

Notes

Ideas for improvement:

  • easy chaining to an upstream proxy (a sketch using SQUID_DIRECTIVES follows below)
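
In the meantime, chaining can be approximated with SQUID_DIRECTIVES and the cache_peer directive (parentcache.foo.com and port 3128 below are placeholders for your upstream proxy):

docker run --net host -d \
    -e SQUID_DIRECTIVES="
    cache_peer parentcache.foo.com parent 3128 0 no-query default
    never_direct allow all
    " jpetazzo/squid-in-a-can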

HTTPS support

It has been asked whether this could support HTTPS. HTTPS is designed to prevent man-in-the-middle attacks, and a transparent proxy is effectively a MITM. If you want to use squid for transparent HTTPS proxying, you need to set up a private CA certificate and push it to all your users so they trust the proxy. An example of how to set this up can be found here.

Without a CA certificate configured, the default behavior is to tunnel HTTPS traffic using the CONNECT method. Squid makes the request on behalf of the client but cannot decrypt or cache the requests or responses.

squid-in-a-can's People

Contributors

drmoose, fkautz, hayderimran7, jpetazzo, kscherer, ruffsl, terje2001, tfoote


squid-in-a-can's Issues

Permission problems with persistent cache

I was trying to mount a cache volume into my container, but got this error:

squid_1 | 2014/12/01 06:13:12 kid1| Making directories in /var/cache/squid3/00
squid_1 | FATAL: Failed to make swap directory /var/cache/squid3/00/00: (13) Permission denied

The directories get permissions of my user (on my host) and docker can't create the subdirectories...

$ ls -la tmp/squid3/
total 0
drwxrwxrwx+ 3 tobias  staff  102 Dec  1 23:52 .
drwxr-xr-x+ 3 tobias  staff  102 Dec  1 23:43 ..
drwxr-xr-x+ 2 tobias  staff   68 Dec  1 23:52 00

Running Docker 1.3.1 on OS X 10.10.

[addon]
I was able to get it running on OS X (via docker host vm) by manually creating /var/cache/squid3/00 to /var/cache/squid3/0F and running chmod -R 777 path/to/volume.

You cannot link containers when using net=host

  1. You cannot link containers when using net=host
  2. Also, deploy_squid.py should be chmod +x inside the Dockerfile, otherwise you will get a permission denied error
  3. After all those changes I cannot download anything; I get the following error on the squid side:
    TCP_MISS_ABORTED/000 0 GET http://ftp.freepark.org/pub/linux/distributions/centos/6.6/updates/x86_64/Packages/bind-libs-9.8.2-0.30.rc1.el6_6.1.x86_64.rpm

If I set the env var http_proxy inside the container, it works like a charm.

Thanks,

Starting the container with its own namespace

In theory, policy routing could be used (instead of DNAT), and squid should not have problems with this (it would not refuse such requests). This solution is described here:

I tried to implement it but could not make it work. The solution seems simple, but it becomes too complicated when applied to Docker containers, because Docker makes the host's iptables a mess.
Finally I gave up and concluded that there is no benefit to starting the container with its own namespace, especially because the implementation is much more complicated.

squid-in-a-can does not work on Docker 1.4

After upgrading docker to 1.4, I get the following message:

$ fig up -d squid && fig run tproxy
Creating squidinacangit_squid_1...
Conflicting options: --net=host can't be used with links. This would result in undefined behavior.

ufs is blocking

I've been playing around with this (good work btw) and optimised a version of it for caching Steam's CDN downloads. I've found that UFS is a bit pants (it blocks execution of the main squid process when performing IO). I would suggest that you use aufs instead (I did and it more-than-doubled the throughput for the proxy).

pip installs not cached

Hi, love this docker container; it has definitely saved me lots of bandwidth during development. Unfortunately, when the requirements.txt file is processed during builds, 'pip' installs do not seem to be cached.
Can someone please let me know if it is possible to enable this?
Many thanks :)

HTTPS

Is it possible to enable this with https too?

Heaps of docker containers are trying to connect to things via https. So with a corporate proxy they become somewhat useless.

Can't start and configure the squid dockerized container on Ubuntu 14.04

I'm trying to run your dockerized solution for caching with a transparent squid, but I can't get it to work. I'm trying to understand how it works, and I have some questions.

$ sudo docker info
Containers: 0
Images: 177
Storage Driver: aufs
 Root Dir: /home/docker/aufs
 Dirs: 177
Execution Driver: native-0.2
Kernel Version: 3.13.0-43-generic
Operating System: Ubuntu 14.04.1 LTS
CPUs: 2
Total Memory: 7.685 GiB
Name: dv-laptop
ID: DOAN:37FJ:5QLG:PHOW:OHCF:6OOS:2OGN:XSZK:5Q3C:W2FV:EI4X:NWB5
WARNING: No swap limit support
$ sudo docker version 
Client version: 1.4.1
Client API version: 1.16
Go version (client): go1.3.3
Git commit (client): 5bc2ff8
OS/Arch (client): linux/amd64
Server version: 1.4.1
Server API version: 1.16
Go version (server): go1.3.3
Git commit (server): 5bc2ff8

I understand that the squid container works in the host network namespace and that port 3129 of the container becomes available on the host machine for transparently proxying HTTP traffic for all containers. All container HTTP traffic is prerouted through the caching proxy by an iptables rule.

I'm using the simpler way, without the additional container for changing the host's iptables rules (i.e. I run the "squid-in-a-can" container and directly update the iptables PREROUTING rule).

But I see a discrepancy in ports: nmap of 127.0.0.1 shows my squid on port 3128, while the PREROUTING rule targets port 3129. I read your comment about patching the Debian config and I saw port 3129 being added to the squid config in the Dockerfile, but nmap can't be mistaken either.

And my new simple container can't get HTTP traffic through when the squid container is running and the PREROUTING rule is configured. I tried replacing port 3129 with 3128 in the PREROUTING rule, but with no result.

My steps:

Run squid container:

sudo docker run \
    --rm \
    --name squid \
    -e DISK_CACHE_SIZE:5120 \
    -e MAX_CACHE_OBJECT:1024 \
    -v /path/to/my/cache:/var/cache/squid3 \
    --net host \
    jpetazzo/squid-in-a-can

My nmap localhost result

$ sudo nmap localhost

Starting Nmap 6.40 ( http://nmap.org ) at 2015-01-14 11:07 EET
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000015s latency).
Not shown: 996 closed ports
PORT     STATE SERVICE
53/tcp   open  domain
631/tcp  open  ipp
3128/tcp open  squid-http
9091/tcp open  xmltec-xmlmail

Nmap done: 1 IP address (1 host up) scanned in 2.42 seconds

I see this situation in iptables:

$ sudo iptables --list -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL
REDIRECT   tcp  --  anywhere             anywhere             tcp dpt:http redir ports 3129

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere            !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  172.17.0.0/16        anywhere            

Chain DOCKER (2 references)
target     prot opt source               destination

I run a simple container and try to update the apt cache:

$ sudo docker run --rm -it tianon/debian-debootstrap:jessie /bin/bash
root@6d159e962991:/# apt-get update
Err http://http.debian.net jessie InRelease                                                           

Err http://http.debian.net jessie-updates InRelease                                                   

Err http://http.debian.net jessie Release.gpg                                                         
  Cannot initiate the connection to http.debian.net:80 (2001:41c8:1000:21::21:35). - connect (101: Network is unreachable) [IP: 2001:41c8:1000:21::21:35 80]
Err http://http.debian.net jessie-updates Release.gpg  
  Cannot initiate the connection to http.debian.net:80 (2001:41c8:1000:21::21:35). - connect (101: Network is unreachable) [IP: 2001:41c8:1000:21::21:35 80]
0% [Connecting to security.debian.org (212.211.132.32)]

The internet connection inside the container works:

root@6d159e962991:/# ping google.com -c 4
PING google.com (173.194.116.232): 56 data bytes
64 bytes from 173.194.116.232: icmp_seq=0 ttl=55 time=49.587 ms
64 bytes from 173.194.116.232: icmp_seq=1 ttl=55 time=49.346 ms
64 bytes from 173.194.116.232: icmp_seq=2 ttl=55 time=50.718 ms
64 bytes from 173.194.116.232: icmp_seq=3 ttl=55 time=49.451 ms
--- google.com ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 49.346/49.776/50.718/0.551 ms

but nothing works over HTTP.

What am I doing wrong?

Add a few lines in squid.conf

Readme says:

SQUID_DIRECTIVES
This will append any contents of the environment variable to squid.conf. It is expected that you will use multi-line block quote for the contents.

And I want to redirect squid to another squid (http://wiki.squid-cache.org/Features/CacheHierarchy) which translates to something like:

cache_peer parentcache.foo.com parent 3128 0 no-query default
never_direct allow all

I've tried setting SQUID_DIRECTIVES with double quotes, single quotes, with \n, ... but it doesn't work with multi-line content; it always stores the variable in the squid configuration file as a single line:

docker run --name squid-in-a-can --net host -d -e="cache_peer parentcache.foo.com parent 3128 0 no-query default \n never_direct allow all" jpetazzo/squid-in-a-can

Caching for the host

If one needs HTTP caching for the host (I have needed it badly many times but could not find a suitable solution), it can be enabled by this command:

iptables -t nat -A OUTPUT -p tcp --dport 80 -j REDIRECT --to 3129 -w

It can be disabled with this one (replacing -A with -D):

iptables -t nat -D OUTPUT -p tcp --dport 80 -j REDIRECT --to 3129 -w

issues when the host is running SELinux

Not advocating SELinux here, but

$ sudo docker run --net host -v /var/squid3/cache:/var/squid3/cache jpetazzo/squid-in-a-can 
squid3: error while loading shared libraries: cannot restore segment prot after reloc: Permission denied
Setting MAXIMUM_OBJECT_SIZE 1024
Setting DISK_CACHE_SIZE 5000
Traceback (most recent call last):
  File "/tmp/deploy_squid.py", line 72, in <module>
    sys.exit(main())
  File "/tmp/deploy_squid.py", line 54, in main
    subprocess.check_call(build_cmd, shell=True)
  File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'squid3 -z' returned non-zero exit status 127

Setting permissive mode on the host lets the container start up.

Default rule in Fedora 20 rejects with icmp-host-prohibited

While using this "awesome" container to ease my development, I stumbled upon a problem that I could only solve by commenting out these rules on my Fedora 20 (I haven't touched iptables at all, so they should be there by default):

-A FORWARD -j REJECT --reject-with icmp-host-prohibited
-A INPUT -j REJECT --reject-with icmp-host-prohibited

(Not sure which one of those, as I'm not an expert in networking or iptables.)

[jmorales@localhost Desktop]$ docker version
Client version: 1.3.0
Client API version: 1.15
Go version (client): go1.3.3
Git commit (client): c78088f/1.3.0
OS/Arch (client): linux/amd64
Server version: 1.3.0
Server API version: 1.15
Go version (server): go1.3.3
Git commit (server): c78088f/1.3.0

Unable to restart/stop squid3

When doing service stop squid3 or service restart squid3, I get the error

[FAIL] Stopping Squid HTTP Proxy 3.x: squid3[....]  Waiting.......................................................................... failed.

Squid3 is running on a server with 2GB of free swap space. Why is this happening?

use this as local proxy to filter ads

My knowledge about proxy configuration is limited, but I wonder if this concept can be used to make the internet browser connect through this proxy and filter out all kinds of nasty ads and such?
Thank you!

I need further clarification as to how squid in a can operates with the default settings

I have a couple of containers running on a server; to my knowledge the containers are using bridge-mode networking. If I start squid-in-a-can with the default options, will those containers use the proxy?

Should I move these containers to their own 'private network' and add squid-in-a-can to that network as well as the host, or is that impossible? I'd like to do that to ensure that they cannot get to the internet except through the proxy.

Also, what is the default size of the cache? Disk space is low on our server and I'd really like to implement this in a secure way that doesn't chew up too much disk space, so I can stop hammering our sources while debugging our scripts. (EDIT: just re-read the documentation and noticed it's 5000 MB.)

Does "iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w" get executed on the 'client' containers?

Any help/info appreciated, and if I'm being an idiot please feel free to tell me! :-)

Re: easy chaining to an upstream proxy

Hi,

Not an issue, but I saw your comment about a potential improvement.
Launching with the following command forces traffic to round-robin between a pair of upstream proxies:

Jay

docker run --net host -d \
    -e SQUID_DIRECTIVES="
    prefer_direct off
    nonhierarchical_direct off
    cache_peer 10.0.0.6 parent 8080 0 no-query default no-digest weight=1
    cache_peer 10.0.0.7 parent 8080 0 no-query default no-digest weight=1
    " jpetazzo/squid-in-a-can

DockerHub builds are failing

Looks like the project structure has changed, so for the past 2 years builds have not been working.
So I assume that the image I can get from Docker Hub is 3 years old, right?

Best
Lovato

more elaborative working example for noobs, please!

Thanks for the brilliant concept for saving very expensive Internet bandwidth.

But I have a big issue: since I am a noob to docker, I could not set up your idea properly.

I started by running: sudo docker run --net host jpetazzo/squid-in-a-can

2015/05/30 09:16:25 kid1| Set Current Directory to /var/spool/squid3
2015/05/30 09:16:25 kid1| Creating missing swap directories
2015/05/30 09:16:25 kid1| /var/cache/squid3 exists
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/00
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/01
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/02
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/03
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/04
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/05
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/06
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/07
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/08
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/09
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/0A
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/0B
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/0C
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/0D
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/0E
2015/05/30 09:16:25 kid1| Making directories in /var/cache/squid3/0F

and I have no room to run: iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w

I wonder whether this is a docker-related issue, but I thought I'd start by asking here. :)

So I cordially ask you to provide a more noob-friendly guide here; many noobs like me will benefit from your brilliant concept.

Many Thanks

Docker v1.11 networking breaking use of the iptables method?

I'm using docker-compose up to launch the transparent squid proxy, but as soon as I do, I lose access to the outside net through port 80, i.e. apt-get update hangs. Reverting the iptables changes using

iptables -t nat -D PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w 

also does not fix the issue, nor does restarting the docker service. So far I've only found that rebooting reverts this behavior. I had multiple appends to the same chain because the tproxy script encountered a hard stop.

$ docker --version 
Docker version 1.11.1, build 5604cbe


Bower-Npm Behind Corporate Proxy

I could not get more information on google group, stackoverflow. You are my last resource.
https://groups.google.com/forum/#!topic/docker-user/vc3PF0aRjYQ

I am behind a corporate proxy. I have a squid running on 3128. Locally - on the host - if I run npm or bower, everything goes through the proxy and works as expected.

I want to run bower update or npm update inside the running container. Basically the traffic should go through docker, which should in turn go via the squid proxy.

I looked at https://docs.docker.com/installation/ubuntulinux/ and, from reading forums, the instruction is to update /etc/default/docker to export the proxy setup.

 export http_proxy="http://127.0.0.1:3128/"
 export https_proxy="http://127.0.0.1:3128/"
 export HTTP_PROXY="http://127.0.0.1:3128/"
 export HTTPS_PROXY="http://127.0.0.1:3128/"

Then we restart/start docker

 sudo service docker start

Inside a container, if I run apt-get, npm install, or bower install, I can't get through.
I think this docker container should fix my problem, but I have not been successful so far.

Thanks - Santiago

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.