TCP proxy with netcat

From Noah.org

Latest revision as of 02:15, 1 June 2014


This article uses netcat to demonstrate a few simple proxies.


This is a simple proxy for HTTP. It is a transparent proxy: it does not follow the HTTP/1.1 CONNECT method spec for proxies, and it just bounces text back and forth. Many protocols will not work properly when treated like this. True web proxies cooperate with the web server and web browser to pass requests back and forth automatically and cleanly. You might be suspicious that the following command works at all, because the FIFO only relays a stream of bytes and there are no provisions for handling the state of the socket. If either connection breaks, the entire command pipeline simply exits back to the shell.

mkfifo /tmp/fifo
nc -lk -p 8080 </tmp/fifo | nc www.noah.org 80 >/tmp/fifo

Note that this simple proxy will not work with virtual web sites. Web servers use the Host request header field to determine which virtual web site to serve. If Host is not set correctly, the web server will return an error like this:

Site Temporarily Unavailable
We apologize for the inconvenience. Please contact the webmaster/ tech support immediately to have them rectify this.
error id: "bad_httpd_conf"
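To make the role of the Host header concrete, here is a minimal sketch that prints a raw HTTP/1.1 request by hand. The function name make_request is hypothetical and exists only for illustration; the request is deliberately bare-bones and is not what a real browser would send.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original article): print a minimal
# HTTP/1.1 GET request whose Host header is set to the first argument.
make_request() {
    # HTTP header lines end in CRLF; an empty line ends the header block.
    printf 'GET / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$1"
}

# Print the request so the Host line can be inspected.
make_request www.noah.org

# To send it to a real server (requires network access), pipe it into netcat:
#   make_request www.noah.org | nc www.noah.org 80
```

A browser pointed at the proxy sends the proxy's own name and port in the Host header (for example localhost:8080), which is why a server hosting several virtual web sites may not know which one to serve.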

The simple transparent proxy is not smart enough to handle HTTP traffic. The following HTTP proxy rewrites the Host: field in the HTTP request header to support virtual web sites. This version also logs the client request and the server response. Note that it does not rewrite HTML responses, so the links in the web page will still point to the original web site; subsequent requests made by clicking those links will not go through the proxy connection.

mkfifo /tmp/fifo
nc -lk -p 8080 </tmp/fifo | sed -u -e 's/^Host.*/Host: www.noah.org/' | tee -a http_request.log | nc www.noah.org 80 | tee -a http_response.log >/tmp/fifo
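To see what the sed stage does to a request without opening any sockets, the sketch below feeds it a canned request. The function name rewrite_host and the sample request are assumptions made for illustration; the sed expression is the one used in the pipeline above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sed expression from the proxy pipeline.
rewrite_host() {
    sed -u -e 's/^Host.*/Host: www.noah.org/'
}

# A canned request resembling what a browser might send to the proxy.
# Plain LF line endings are used here to keep the output easy to read.
printf 'GET / HTTP/1.1\nHost: localhost:8080\nAccept: */*\n\n' | rewrite_host
```

Only the Host line changes; the request line and the other headers pass through untouched. One detail worth knowing: because `.*` matches to the end of the line, it also swallows the carriage return that a real request carries before each newline, so the rewritten Host line ends in a bare LF. Many servers appear to tolerate this, but it is not guaranteed.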

This version attempts a very unsophisticated rewrite of the HTML so that subsequent requests will continue to come back through the proxy (note that the URL is rewritten to the value of ${HOSTNAME}, plus the proxy port 8080). The URL rewriting attempts to handle URLs with and without the www. prefix simply by assuming they both map to the same proxy. It also deletes request headers that would normally affect proxies. It deletes the Accept-Encoding request header to prevent compression of the response by the server (most web servers will gzip responses). It deliberately circumvents the normal headers used to control proxy connections, so this is an improper HTTP proxy. It is also not very reliable. It tends to hang and get stuck, or quit, when either the client or the server closes its end of the connection. I believe this is caused by the FIFO, which cannot pass along TCP control signals, so after a while the two sides get out of sync. At this point things are getting pretty sketchy, and it's amazing that this even works at all.

mkfifo /tmp/fifo
nc -q -1 -l -p 8080 </tmp/fifo \
    | sed -u -e "s/^Host:.*/Host: www.noah.org/" -e "/^Accept-Encoding:.*/d" -e "/^Connection:.*/d" -e "/^If-None-Match:.*/d" -e "/^If-Modified-Since:.*/d" \
    | tee -i -a http_request.log \
    | nc -q -1 www.noah.org 80 \
    | sed -u -e "s/www\.noah\.org/${HOSTNAME}:8080/ig" \
    | sed -u -e "s/noah\.org/${HOSTNAME}:8080/ig" \
    | tee -i -a http_response.log >/tmp/fifo
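Both sed stages can be tried offline on canned input. This is a sketch under stated assumptions: proxy_host is a hypothetical stand-in for ${HOSTNAME} so the output is predictable, the function names filter_request and rewrite_response are invented for illustration, and the `i` flag on the `s` command assumes GNU sed. The dots in the hostname are escaped so they match only a literal dot.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for ${HOSTNAME}; any resolvable proxy name would do.
proxy_host="proxy.example"

# Request side: force the Host header and drop the headers the proxy removes.
filter_request() {
    sed -u -e "s/^Host:.*/Host: www.noah.org/" \
           -e "/^Accept-Encoding:.*/d" \
           -e "/^Connection:.*/d" \
           -e "/^If-None-Match:.*/d" \
           -e "/^If-Modified-Since:.*/d"
}

# Response side: point links at the proxy, with or without the www. prefix.
rewrite_response() {
    sed -u -e "s/www\.noah\.org/${proxy_host}:8080/ig" \
        | sed -u -e "s/noah\.org/${proxy_host}:8080/ig"
}

# Canned request: only the request line, the rewritten Host line, and the
# terminating blank line should come out the other side.
printf 'GET / HTTP/1.1\nHost: localhost:8080\nAccept-Encoding: gzip\nConnection: keep-alive\n\n' \
    | filter_request

# Canned response body: both links should now point at the proxy on port 8080.
printf '<a href="http://www.noah.org/wiki/">wiki</a> <a href="http://noah.org/">home</a>\n' \
    | rewrite_response
```

Two caveats apply to this sketch. The header patterns are case sensitive, so a client that sends a header in a different case (for example accept-encoding:) would slip through. Rewriting the body also changes its length without touching any Content-Length header the server sent, which may be one more reason a browser gets confused by this proxy.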