Introduction

Fast browsing

Squid Architectures

Figure 1

 

No squid

The first example shows a setup without a squid proxy. Every object needs to be fetched from the net, or from the local browser cache. Because every cache miss pays the full network latency, this tends to be the slowest of the bunch.

One squid

A single squid allows faster browsing because the network latency is absorbed by the squid host, so to speak. If squid can pipeline the information from the remote host, remote network latency is minimized. Even better results are achieved by running an adzapper on the proxy; the adzapper removes advertisements, freeing up bandwidth for more useful things such as content.

Prefetching squid

The variation here is to create a prefetching squid. In this architecture, there are two squid proxies: squid_main, which is the squid proxy that the local browsers connect to, and squid_back, which performs all the back-end processing.

When a browser requests a URL from squid_main, the resource(s) pointed to by that URL are preloaded into squid_back by a squid redirector process. The redirector in this case is a script that runs wget on the URL in question, and wget pulls down all the page dependencies through squid_back. In the ideal case, this means that every object needed to display the web page will be in squid_back before squid_main begins processing the URL.

squid_main then loads those preloaded objects from squid_back via a cache_peer parent relationship.

Just like in the previous case, an adzapper can be run on squid_back, improving response even more.

Unofficial tests on the prefetching squid seem to show that it's perceptibly faster than a single squid, though I haven't done much formal testing on it.

Configuration

For the prefetching squid configuration, you need to build two squids. You can't share binaries between them because of some kind of hard-coding in the source. Create two copies of the source distribution, named squid_main and squid_back. For best results, create a user and group named squid and chown all the appropriate binaries and directories to them.

You will also need wget installed!

Optionally, try using the user-agent hacked redirection code located here. This allows wget to use the browser's user-agent instead of WGET.

cd source
gunzip -c squid-2.4.STABLE6-src.tar.gz | tar -xvf -
mv squid-2.4.STABLE6 squid_main
cp -r squid_main squid_back

Then use the following to configure them. Be sure to adjust the directory paths to your particular setup. I use /opt/servers for server applications, and /opt/data for runtime or variable data.

squid_back
cd squid_back
./configure  --prefix=/opt/servers/squid_back --sharedstatedir=/opt/data/squid_back \
  --localstatedir=/opt/data/squid_back --enable-dlmalloc --enable-async-io \
  --disable-http-violations --disable-ident-lookups
make install

edit /opt/servers/squid_back/etc/squid.conf
set the http_port to 3129, or some other port
set the icp_port to 3130, or some other port
set your acls as normal
add adzap to the redirector
set squid config options as normal
[ optional: install adzapper ]
create cache dirs using 'squid -z'
start squid
test

squid_main
cd ../squid_main
./configure  --prefix=/opt/servers/squid_main --sharedstatedir=/opt/data/squid_main \
  --localstatedir=/opt/data/squid_main --enable-dlmalloc --enable-async-io \
  --disable-http-violations --disable-ident-lookups
make install

edit /opt/servers/squid_main/etc/squid.conf
set the http_port to 3128, or some other port
set the icp_port to 3131; it must be different from squid_back's
set the cache_peer to squid_back:
cache_peer 10.163.3.1 parent 3129 3130 default proxy-only
# default, so all accesses go through squid_back
# proxy-only, because squid_back has the objects
# parent, because this always gets from squid_back

set your acls as normal
set squid config options as normal
then set the redirector program:
redirect_program /opt/servers/squid_main/main_redirector
redirect_children 15
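
Since the two proxies must not share ports, a quick sanity check can save some head-scratching. The following is only a sketch: conf_value and ports_differ are hypothetical helper names, and the check assumes both directives are written explicitly in each squid.conf (it does not know about squid's built-in defaults).

```shell
#!/bin/sh
# Print the first value of a directive (e.g. http_port) in a squid.conf-style
# file. Commented-out lines are skipped because their first field is not
# the bare directive name.
conf_value() {
    awk -v d="$2" '$1 == d { print $2; exit }' "$1"
}

# Succeed only if the two files use different http_port and icp_port values.
# A directive missing from both files counts as a clash.
ports_differ() {
    for directive in http_port icp_port; do
        a=$(conf_value "$1" "$directive")
        b=$(conf_value "$2" "$directive")
        if [ "$a" = "$b" ]; then
            echo "$directive clash: $a" >&2
            return 1
        fi
    done
    return 0
}

# Example (paths follow the prefixes used above; not run here):
# ports_differ /opt/servers/squid_main/etc/squid.conf \
#              /opt/servers/squid_back/etc/squid.conf && echo "ports ok"
```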

main_redirector
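#!/usr/bin/perl
#
# Squid redirector: prefetch each requested page into squid_back via preget.

$| = 1;    # unbuffered output, so squid sees each reply immediately

# URLs ending in one of these extensions are not prefetched.
@endings = qw/ \.gif$ \.jpg$ \.jpeg$ \.png$ \.gz$ \.img$ \.tgz$ \.sit$ \.hqx$ \.image$ \.bin$ \.Z$ \.TAZ$ \.exe$ \.zip$ \.avi$ \.mov$ \.mp3$ \.mpg$ \.mpeg$ \.pkg$ \.wav$ \.dmg$ \.aiff$ \.pdf$ \.iso$ \.toast$ \.bz2$ \.js$ \.css$ /;
while (<>)
{
    # input line from squid: URL client_ip/fqdn ident method
    ($url, $addr, $fqdn, $ident, $method) = m:(\S*) (\S*)/(\S*) (\S*) (\S*):;
    if ($method eq "GET")
    {
        $stop = 0;
        foreach $ending (@endings)
        {
            if ($url =~ /$ending/i)
            {
                $stop = 1;
                last;    # Perl's loop exit; "break" is not a Perl statement
            }
        }
        if ($stop == 0)
        {
            # list form passes the URL as a single argument, bypassing the
            # shell, so characters like & ; ? in the URL are not interpreted
            system("/opt/servers/squid_main/preget", $url);
        }
    }
    print "\n";    # blank reply: tell squid to leave the URL unchanged
}
# EOF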
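
To see which requests the redirector would prefetch without wiring it into squid, the same decision can be sketched in shell. This is a rough approximation, not the redirector itself: would_prefetch is a hypothetical name, the extension list is abbreviated, and the sample lines assume the "URL client_ip/fqdn ident method" input format that the Perl script parses.

```shell
#!/bin/sh
# Return 0 if a request would be prefetched, 1 otherwise.
# $1 = URL, $2 = HTTP method
would_prefetch() {
    url=$1 method=$2
    # Only GET requests are prefetched.
    [ "$method" = "GET" ] || return 1
    # Lower-case the URL so the suffix match is case-insensitive.
    lower=$(printf '%s' "$url" | tr 'A-Z' 'a-z')
    case $lower in
        # Abbreviated list; the Perl script above carries the full one.
        *.gif|*.jpg|*.jpeg|*.png|*.gz|*.zip|*.pdf|*.js|*.css) return 1 ;;
    esac
    return 0
}

# Feed a few sample redirector input lines through the check.
while read -r url client ident method; do
    if would_prefetch "$url" "$method"; then
        echo "prefetch $url"
    else
        echo "skip     $url"
    fi
done <<'EOF'
http://www.example.com/index.html 10.0.0.5/- - GET
http://www.example.com/logo.GIF 10.0.0.5/- - GET
http://www.example.com/form 10.0.0.5/- - POST
EOF
```

Like the Perl version, this matches only on the end of the URL, so an image URL with a query string appended would still be prefetched.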

preget
#!/bin/sh
#
# note: change 10.163.3.1:3129 to the IP and port of squid_back!
# also /tmp/targ must exist so wget stuff can be dropped there.
#
http_proxy="http://10.163.3.1:3129/"
export http_proxy
#echo "pregetting $1" >> /opt/data/squid_main/logs/myurls
wget -q --level=1 -nd  --directory-prefix=/tmp/targ/ -p "$1"
rm -rf /tmp/targ/* &
# EOF
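
One thing to watch with a shared /tmp/targ: with several redirector children running, the background rm from one request can delete files that another request's wget is still writing. A possible variant, sketched below, gives each request its own scratch directory instead. preget_isolated, PROXY and FETCH are hypothetical names, mktemp -d is assumed to be available, and FETCH exists only so the function can be dry-run without wget.

```shell
#!/bin/sh
# Assumed defaults; change PROXY to the IP and port of squid_back.
PROXY=${PROXY:-http://10.163.3.1:3129/}
FETCH=${FETCH:-wget}

# Prefetch one URL through squid_back using a private scratch directory.
# $1 = URL to prefetch
preget_isolated() {
    # A fresh directory per request, so cleanup cannot touch the files
    # of another request that is still in progress.
    dir=$(mktemp -d "${TMPDIR:-/tmp}/preget.XXXXXX") || return 1
    # Same wget options as preget, pointed at the private directory.
    http_proxy=$PROXY "$FETCH" -q --level=1 -nd -p \
        --directory-prefix="$dir" "$1"
    status=$?
    # The downloaded files are not needed; squid_back now holds the objects.
    rm -rf "$dir"
    return $status
}

# Example (not run here):
# preget_isolated http://www.example.com/
```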

squid -z to create the cache

install wget, and make sure it's in the path!
create /tmp/targ for the temporary wget files
test squid
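
For the test step, one rough approach is to fetch a page through each proxy in turn with wget. The sketch below assumes both squids run on 10.163.3.1 with the ports chosen above; proxy_url and fetch_via are hypothetical helper names, and www.example.com stands in for whatever page you want to try.

```shell
#!/bin/sh
# Build the proxy URL for a host and port.
proxy_url() {
    printf 'http://%s:%s/\n' "$1" "$2"
}

# Fetch a URL through the given proxy and discard the body.
# $1 = proxy host, $2 = proxy port, $3 = URL; returns wget's exit status.
fetch_via() {
    http_proxy=$(proxy_url "$1" "$2") wget -q -O /dev/null "$3"
}

# Example (not run here): test squid_back first, then squid_main.
# fetch_via 10.163.3.1 3129 http://www.example.com/ && echo "squid_back ok"
# fetch_via 10.163.3.1 3128 http://www.example.com/ && echo "squid_main ok"
```

If the squid_main fetch works, squid_back's access.log should show both the wget prefetch and the request relayed by squid_main.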