It’s been a while since I last posted in the “hardcores or masochists” series. This is due to lack of time, since an article like this one requires 4-5 hours of research and coding. But here I am again with a simple one. You must have heard of a “proxy server”. It is exactly what the name suggests: it stands between you and the rest of the world for various protocols (HTTP, FTP, SSL etc). In this small tutorial we will see how an HTTP proxy works, along with a small example program doing just that.
But let’s see in detail how the HTTP proxy should work. Below is the architecture of a proxy system.
The browser first has to be configured to send all its traffic through an HTTP proxy. Once that is done, the browser makes almost the same request it would send to the end server, with a small change. Below is the dump of a Firefox request to google.gr.
GET http://google.gr/ HTTP/1.1
Host: google.gr
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Cookie: PREF=ID=6d527a9e2768fc73:TM=1213705662:LM=1213705662:S=pSyveN4XTqCHejq_
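As you can see, the request is almost the same as the one we saw in a previous article, with a small difference: the “Proxy-Connection: keep-alive” header. Note also that the request line carries the full URL (http://google.gr/) instead of just the path. Following is an example of an IE request.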
GET http://google.gr/ HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-ms-application, application/vnd.ms-xpsdocument, application/xaml+xml, application/x-ms-xbap, application/x-shockwave-flash, application/vnd.ms-excel,application/vnd.ms-powerpoint, application/msword, */*
Accept-Language: el
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)
Host: google.gr
Proxy-Connection: Keep-Alive
It is basically the same. The “Proxy-Connection: Keep-Alive” header is still there. This header is what tells us that the request is going through a proxy and not directly to the server.
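To make the two dumps above more concrete, here is a minimal sketch of how a program could tell a proxy-style request from a direct one. It checks the two hints visible in the dumps: the full URL in the request line and the “Proxy-Connection” header. The class and method names (ProxyRequestCheck, looksProxied) are made up for this illustration, and the check is a heuristic, not a complete HTTP parser.

```java
import java.util.Locale;

public class ProxyRequestCheck {

    // Returns true when the raw request text looks like it was addressed to a proxy.
    // Assumption: the request line and headers are separated by CRLF or LF.
    public static boolean looksProxied(String rawRequest) {
        String[] lines = rawRequest.split("\r?\n");

        // Request line has the form: METHOD SP TARGET SP VERSION
        String[] parts = lines[0].trim().split("\\s+");
        boolean absoluteTarget = parts.length >= 2
                && parts[1].toLowerCase(Locale.ROOT).startsWith("http://");

        boolean proxyHeader = false;
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                break; // blank line marks the end of the headers
            }
            if (lines[i].toLowerCase(Locale.ROOT).startsWith("proxy-connection:")) {
                proxyHeader = true;
            }
        }
        return absoluteTarget || proxyHeader;
    }

    public static void main(String[] args) {
        String viaProxy = "GET http://google.gr/ HTTP/1.1\r\n"
                + "Host: google.gr\r\n"
                + "Proxy-Connection: keep-alive\r\n\r\n";
        String direct = "GET / HTTP/1.1\r\n"
                + "Host: google.gr\r\n\r\n";

        System.out.println(looksProxied(viaProxy)); // true
        System.out.println(looksProxied(direct));   // false
    }
}
```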
So if we wanted to create a small proxy server the basic procedure would be this:
A small Java example playing the role of a simple proxy server would be like the one below:
There are a few things to notice about the above code. First of all, the “Proxy-Connection: keep-alive” header was not removed. This is because the code is a rough sketch that only demonstrates how a proxy works. If you want to build a good proxy, you need to remove it. Even better, construct a new request from scratch, which you can do by parsing the headers and rebuilding the request. The second thing to notice is that step 6 from the above list is completely missing, for the same reason: the code is merely an example. A good proxy server, Squid for instance, should filter the incoming data and most probably cache it.
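As an illustration of the “parse the headers and rebuild the request” idea, here is a minimal sketch. It is not the code from this article. The class name RequestRewriter and its rewrite method are hypothetical, and turning the full URL into a plain path for the end server is an assumption about what a cleaned-up request should look like. It only handles the request line and headers, not a request body.

```java
import java.net.URI;

public class RequestRewriter {

    private static final String PROXY_HEADER = "Proxy-Connection:";

    // Rebuilds a browser-to-proxy request into one that can be sent to the end server.
    public static String rewrite(String rawRequest) {
        String[] lines = rawRequest.split("\r?\n");
        String[] parts = lines[0].trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + lines[0]);
        }

        // Reduce "http://host/path?query" to "/path?query".
        URI target = URI.create(parts[1]);
        String path = target.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        if (target.getRawQuery() != null) {
            path += "?" + target.getRawQuery();
        }

        StringBuilder out = new StringBuilder();
        out.append(parts[0]).append(' ').append(path).append(' ').append(parts[2]).append("\r\n");

        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            if (line.isEmpty()) {
                break; // end of headers
            }
            // Skip the proxy-only header (case-insensitive match on the header name).
            if (line.regionMatches(true, 0, PROXY_HEADER, 0, PROXY_HEADER.length())) {
                continue;
            }
            out.append(line).append("\r\n");
        }
        out.append("\r\n"); // blank line that terminates the header section
        return out.toString();
    }

    public static void main(String[] args) {
        String fromBrowser = "GET http://google.gr/ HTTP/1.1\r\n"
                + "Host: google.gr\r\n"
                + "Accept-Encoding: gzip,deflate\r\n"
                + "Proxy-Connection: keep-alive\r\n\r\n";
        System.out.print(rewrite(fromBrowser));
    }
}
```

Running main prints the request with “GET / HTTP/1.1” as the first line and without the “Proxy-Connection” header, while the other headers are copied through unchanged.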
One small pointer for all the adventurous who will try to code a small proxy. Beware of what the “Accept-Encoding:” header contains, because if it lists “gzip” and the server supports gzip encoding, then the data you get back will be compressed. So, do not try to echo it out on the console, because you will get a lot of strange symbols.
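If you do want to look at a compressed response, one option is to decompress it before printing. The sketch below assumes you have already separated the response body from the headers and read the value of the server’s “Content-Encoding” response header. The names BodyDecoder, decode and gzip are made up for this example, and the gzip helper exists only to produce test data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class BodyDecoder {

    // Returns the body as-is unless the server marked it as gzip-compressed.
    public static byte[] decode(byte[] body, String contentEncoding) {
        if (contentEncoding == null || !contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return body;
        }
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(body));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo only: compresses data the way a gzip-enabled server would.
    public static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] compressed = gzip("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        byte[] plain = decode(compressed, "gzip");
        System.out.println(new String(plain, StandardCharsets.UTF_8)); // <html>hello</html>
    }
}
```

A simpler alternative for a toy proxy is to drop “gzip” from the “Accept-Encoding” header before forwarding the request, so the server is not asked for compressed data in the first place.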
All in all, a proxy server is just an intermediary piece of software that forwards requests. Here we discussed the HTTP proxy, but proxies exist for FTP, SSL etc. I hope this small tutorial made things clear and gave you a good idea of what a proxy is.
cool
I like using proxies
but many sites block them
Well, this way you can hide the fact that a proxy is in the way. I mean, with the simple proxy I showed above you can construct a request just the same as a browser would…