Help with HTTP Intercepting Proxy in Ruby?

I have the beginnings of an HTTP Intercepting Proxy written in Ruby:
require 'socket' # Get sockets from stdlib

server = TCPServer.open(8080) # Socket to listen on port 8080
loop { # Servers run forever
  Thread.start(server.accept) do |client|
    puts "** Got connection!"
    output = ""
    host = ""
    port = 80
    while line = client.gets
      line.chomp!
      if line =~ /^(GET|CONNECT) .*(\.com|\.net):(.*) (HTTP\/1.1|HTTP\/1.0)$/
        port = $3
      elsif line =~ /^Host: (.*)$/ && host == ""
        host = $1
      end
      print line + "\n"
      output += line + "\n"
      # This *may* cause problems with not getting full requests,
      # but without this, the loop never returns.
      break if line == ""
    end
    if host != ""
      puts "** Got host! (#{host}:#{port})"
      out = TCPSocket.open(host, port)
      puts "** Got destination!"
      out.print(output)
      while line = out.gets
        line.chomp!
        if line =~ /^<proxyinfo>.*<\/proxyinfo>$/
          # Logic is done here.
        end
        print line + "\n"
        client.print(line + "\n")
      end
      out.close
    end
    client.close
  end
}
This simple proxy that I made parses the destination out of the HTTP request, then reads the HTTP response and performs logic based on special HTML tags. The proxy works for the most part, but seems to have trouble dealing with binary data and HTTPS connections.
How can I fix these problems?

First, you would probably be better off building on an existing Ruby HTTP proxy implementation. One such is already available in the Ruby standard library, namely WEBrick::HTTPProxyServer. See for example this related question for an implementation based on that same class: Webrick transparent proxy.
Regarding proxying HTTPS, you can't do much more than just pass the raw bytes. As HTTPS is cryptographically protected, you cannot inspect the contents at the HTTP protocol level. It is just an opaque stream of bytes.
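For what it's worth, the HTTPS part of a proxy boils down to honoring the CONNECT verb and then shuttling bytes blindly in both directions. Here is a minimal sketch of that relay (written in Python for brevity rather than Ruby; handle_connect would be called with the host and port parsed from the CONNECT line):
import socket
import threading

def pipe(src, dst):
    # Copy bytes until one side closes; the proxy never interprets them.
    while True:
        chunk = src.recv(4096)
        if not chunk:
            break
        dst.sendall(chunk)
    dst.close()

def handle_connect(client, host, port):
    upstream = socket.create_connection((host, port))
    client.sendall(b"HTTP/1.1 200 Connection Established\r\n\r\n")
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    pipe(upstream, client)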

WEBrick uses blocking I/O, which means it cannot stream responses. For example, if you open a YouTube page to watch a video, the stream will not be forwarded to your browser until the proxy has downloaded the entire video content.
If you want the video to play in your browser while it downloads, you have to look at a non-blocking I/O solution such as EventMachine.
For HTTPS the solution is a little more complicated, since you have to build a man-in-the-middle proxy.

This was an old question, but for the sake of completeness here goes another answer.
I've implemented an HTTP/HTTPS interception proxy in Ruby; the project is hosted on GitHub.
The HTTP case is obvious; HTTPS interception is accomplished via an HTTPS server that acts as a reverse proxy (and handles the TLS handshake). I.e.
Client(e.g. Browser) <--> Proxy1 <--> HTTPS Reverse Proxy <--> Target Server
As Valko mentioned, when a client connects to an HTTPS server through a proxy, you'll see a stream of encrypted bytes (since SSL provides end-to-end encryption). But not everything is encrypted: the proxy needs to know where the stream of bytes should be forwarded, so the client issues a CONNECT host:port request (the body of the request being the SSL stream).
The trick here is that the first proxy will forward this request to the HTTPS Reverse Proxy instead of the real target server. This reverse proxy will handle the SSL negotiation with the client, have access to the decrypted requests, and send copies (optionally altered versions) of these requests to the real target server by acting as a normal client. It will get the responses from the target server, (optionally) alter the responses, and send them back to the client.
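To make the reverse-proxy role concrete, here is a heavily simplified sketch of it (this is not code from the GitHub project above; the certificate paths and target host are placeholders). It terminates TLS with its own certificate, which the client must have been made to trust, inspects the decrypted request, and replays it against the real server:
import socket
import ssl
import http.client

LISTEN_ADDR = ("127.0.0.1", 4443)     # where Proxy1 forwards CONNECT traffic
TARGET_HOST = "target.example.com"    # the server the client meant to reach

# TLS context using a certificate the client trusts (the MITM part)
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain("proxy-cert.pem", "proxy-key.pem")

with socket.create_server(LISTEN_ADDR) as srv:
    raw_sock, _ = srv.accept()
    with ctx.wrap_socket(raw_sock, server_side=True) as tls:
        # Handshake done; from here the request is visible in cleartext.
        request = tls.recv(65536).decode("iso-8859-1")
        method, path, _ = request.split("\r\n", 1)[0].split(" ", 2)

        # Act as a normal client toward the real server (the request
        # could be altered at this point).
        upstream = http.client.HTTPSConnection(TARGET_HOST)
        upstream.request(method, path)
        resp = upstream.getresponse()
        body = resp.read()

        # Relay (an optionally altered copy of) the response back.
        head = "HTTP/1.1 %d %s\r\nContent-Length: %d\r\nConnection: close\r\n\r\n" % (
            resp.status, resp.reason, len(body))
        tls.sendall(head.encode("iso-8859-1") + body)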

Related

Varnish: Multiple IPs compared to an ACL using tilde

What would happen in Varnish if multiple IPs are in an X-Forwarded-For header that is compared to an ACL using the tilde operator?
Dummy example:
The request has the following HTTP header:
X-Forwarded-For: 160.12.34.56, 10.10.10.10
The Varnish config looks like this:
acl internal {
    "10.10.10.10";
}

if (std.ip(req.http.X-Forwarded-For, "0.0.0.0") ~ internal) {
    # THIS CODE
} else {
    # OR THIS CODE
}
Which code block is executed?
Also, does the order of the IPs in the X-Forwarded-For header matter?
Does it change if there are two X-Forwarded-For headers, each with one of the two IPs?
Will it work?
The short answer to your question is no, it won't work.
std.ip() expects to receive a single IP address, not a collection. The conversion will fail, and the fallback value (second argument of the function) will be returned.
Here's a quick test script that illustrates this:
vcl 4.0;

import std;

backend default none;

sub vcl_recv {
    set req.http.x-f = "1.2.3.4, 5.6.7.8";
    return (synth(200, std.ip(req.http.x-f, "0.0.0.0")));
}
This example will return 0.0.0.0.
Does X-Forwarded-For need multiple IP addresses?
It does make sense to ask whether your X-Forwarded-For header needs multiple IP addresses at all.
The idea is to indicate to the origin server what the IP address of the original client was.
In your case there is more than 1 proxy in front of the webserver, so a natural reaction is to chain the IP addresses in the X-Forwarded-For header.
A better solution would be to figure out what the IP address of the original client was, and set that value in X-Forwarded-For.
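For illustration, "figuring out the original client" from an already-chained header usually means taking the left-most entry, since each proxy appends its peer to the right. A trivial sketch (plain Python, not VCL):
def original_client(xff):
    # "160.12.34.56, 10.10.10.10" -> "160.12.34.56"
    return xff.split(",")[0].strip()
Keep in mind that the left-most entry is client-supplied and therefore spoofable, which is another argument for the PROXY protocol approach below.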
The best way to get this done is by leveraging the PROXY protocol, which Varnish supports.
Leverage the PROXY protocol
The PROXY protocol has the capability of transporting the HTTP protocol while additionally keeping track of the connection parameters of the original client.
Varnish supports this and allows you to set an extra listening port that listens for PROXY requests.
Here's an example of how you can start varnishd with PROXY support:
varnishd -a :80 -a :8443,PROXY -f /etc/varnish/default.vcl -s malloc,256m
As you can see, port 80 is still available for regular HTTP, but port 8443 was allocated for PROXY support.
If the proxy servers in front of Varnish support PROXY, Varnish will take the value from the original client and automatically set X-Forwarded-For with that value.
This way you always know who the client was, and you can safely perform your ACL check.
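For reference, a PROXY protocol v1 preamble is a single text line sent before the ordinary HTTP bytes. Here is a sketch of what a proxy in front of Varnish would send to the PROXY-enabled port (the hostnames and addresses are made up):
import socket

CLIENT_IP, CLIENT_PORT = "203.0.113.7", 51234   # the original client

# v1 preamble: PROXY <proto> <src-ip> <dst-ip> <src-port> <dst-port>\r\n
preamble = "PROXY TCP4 %s 10.0.0.5 %d 8443\r\n" % (CLIENT_IP, CLIENT_PORT)

with socket.create_connection(("varnish.internal", 8443)) as s:
    s.sendall(preamble.encode("ascii"))
    s.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
    print(s.recv(4096).decode("iso-8859-1", "replace"))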
Additionally, there's also a PROXY module for Varnish, which can give you information about any TLS termination that took place in front of Varnish.

Python 3.4 Sockets sendall function

import socket

def functions():
    print("hello")

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_address = ('192.168.137.1', 20000)
sock.bind(server_address)
sock.listen(1)
conn, addr = sock.accept()
print('Connected by', addr)
conn.sendall(b"Welcome to the server")
My question is: how do I send a function to the client?
I know that conn.sendall(b"Welcome to the server") will send data to the client, which can be decoded.
I would like to know how to send a function to a client, like:
conn.sendall(function()) - this does not work
I would also like to know what function would allow the client to receive the function I am sending.
I have looked on the Python website for a function that could do this, but I have not found one.
The functionality you are requesting is fundamentally impossible unless it is explicitly coded on the client side. If it were possible, one could write a virus that easily spreads to any remote machine. Instead, it is the client's own responsibility to decode incoming data in whatever manner it sees fit.
Considering the case where a client really does want to receive code to execute, the issue is that the code must be represented in a form which, at the same time,
is detached from the server context and its specifics, and can be serialized and executed anywhere, and
allows secure execution in a kind of sandbox, because very few clients will allow arbitrary server code to do anything on the client side.
The latter is an extremely complex topic; read the security history of any web browser - most of the closed vulnerabilities are issues in exactly this kind of sandboxing.
(There are environments where such execution is allowed and desired, e.g. an Erlang cookie-based peering cluster. But in such a cluster, side B is also allowed to execute anything at side A.)
You should start by searching for an execution environment (a high-level virtual machine) that meets your needs for functionality and security. For Python, look at the multiprocessing module: its implementation of worker pools doesn't pass the code itself, but simplifies passing data for execution requests. Passing arbitrary Python data (without functions) is covered by the marshal and pickle modules, as sketched below.
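To sketch the safe pattern: send plain data describing what to run, and let the client map that onto functions it already has. A minimal, hypothetical example using pickle (the message vocabulary and function table are invented for illustration):
import pickle

# Client side: a table of operations the client itself chose to expose.
ALLOWED = {"greet": lambda name: print("hello,", name)}

def handle(conn):
    msg = pickle.loads(conn.recv(4096))   # e.g. {"cmd": "greet", "arg": "bob"}
    func = ALLOWED.get(msg["cmd"])        # the client decides what may run
    if func is not None:
        func(msg["arg"])
    # Note: only unpickle data from peers you trust; a crafted pickle
    # payload can itself execute code while being loaded.

# Server side would do:
#   conn.sendall(pickle.dumps({"cmd": "greet", "arg": "bob"}))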

WebSocket permessage-deflate in Chrome with no context takeover

I have this problem with compression, and I am not sure if it is a bug. My WebSocket server does not support context takeover, and I am having problems sending messages, but not receiving.
The browser issues a request like this:
GET /socket HTTP/1.1
Host: thirdparty.com
Origin: http://example.com
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits, x-webkit-deflate-frame
If the server does not specify any option about context takeover:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Access-Control-Allow-Origin: http://example.com
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Extensions: permessage-deflate
I can read and write the first message, but cannot do subsequent reads or writes, because Chrome expects the server to keep the compression context.
So my server provides this answer:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Access-Control-Allow-Origin: http://example.com
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Extensions: permessage-deflate; client_no_context_takeover; server_no_context_takeover
And now I can receive messages without problems, but again I can only send the first message; the second message fails, and I see an error in Chrome saying that it failed to inflate the frame. I tried sending two identical strings, and I can see the server sending the same data twice, but the client fails to decompress it the second time.
So it seems that Chrome accepts the client_no_context_takeover parameter, which specifies that the client won't reuse the same compression context across messages when compressing, but ignores server_no_context_takeover, which indicates that the server won't reuse its context.
Is this a bug in Chrome? I am also not clear on whether I can send back options that have not been offered/requested by the client.
Is there any other option I can use to disable the client context takeover?
UPDATE:
In WebSocketPerMessageDeflate.cpp in the Chromium source code, I can see:
if (clientNoContextTakeover != parameters.end()) {
    if (!clientNoContextTakeover->value.isNull()) {
        m_failureReason = "Received invalid client_no_context_takeover parameter";
        return false;
    }
    mode = WebSocketDeflater::DoNotTakeOverContext;
    ++numProcessedParameters;
}
But also:
if (serverNoContextTakeover != parameters.end()) {
    if (!serverNoContextTakeover->value.isNull()) {
        m_failureReason = "Received invalid server_no_context_takeover parameter";
        return false;
    }
    ++numProcessedParameters;
}
In the first snippet it sets the "mode" variable, but in the second one it does nothing, so it seems to be basically ignoring the parameter.
Cheers.
A server must send a server_no_context_takeover parameter in a response only if the client requested the "no context takeover". In essence, the server acknowledges the client's request.
If a server decides to do "no context takeover" for its own sending (without the client having requested it), that's fine. In that case, no parameter is sent by the server.
A deflate sender can always, on its own, drop the compression context and/or reduce the compression window size. There is no need to tell the receiver: the deflate wire format carries enough information for the receiver to cope with that.
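You can see this asymmetry with zlib directly. In the sketch below (Python, raw deflate with a negative window-bits value, as permessage-deflate uses), the sender drops its context on every message by creating a fresh compressor, yet a receiver that keeps a single inflater still decodes everything, because Z_SYNC_FLUSH emits blocks with BFINAL = 0:
import zlib

def compress_no_context_takeover(message):
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)    # fresh context per message
    data = comp.compress(message) + comp.flush(zlib.Z_SYNC_FLUSH)
    return data[:-4]    # permessage-deflate strips the trailing 00 00 ff ff

inflater = zlib.decompressobj(-15)                     # one context kept forever

def decompress(payload):
    return inflater.decompress(payload + b"\x00\x00\xff\xff")

for msg in (b"hello hello hello", b"hello hello hello"):
    print(decompress(compress_no_context_takeover(msg)))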
Here is how the configuration and handshake look with Crossbar.io.
I finally found the problem.
https://datatracker.ietf.org/doc/html/draft-ietf-hybi-permessage-compression-17#section-8.2.3
8.2.3.4. Using a DEFLATE Block with BFINAL Set to 1
Going through the examples in the draft, I found my server was sending slightly different payloads. It turned out the problem was the BFINAL bit: I needed to set it to 0, by adding a 0 byte at the end.
Now it works.

Game + Web Server using ExpressJS

I'm currently trying to develop a simple Flash game which talks to a node.js server.
My question is this:
How might I go about making a server which differentiates web requests from game requests?
Here are the details of what I've done:
Previously, I used the net and static modules to handle requests from the game client and the browser, respectively.
TwoServers.js
// Web server
var file = new staticModule.Server('./public');
http.createServer(function(req, res) {
    req.addListener('end', function() {
        file.serve(req, res, function(err, result) {
            // do something
        });
    });
}).listen(port1, "127.0.0.1");
// Game Server
var server = net.createServer(function(socket) {
    // handle messages to/from Flash client
    socket.setEncoding('utf8');
    socket.write('foo');
    socket.on('data', onMessageReceived);
});
server.listen(port2, "127.0.0.1");
I'd like to do the above with just an Express server listening in on a single port, but I'm not sure how to go about doing that.
Here's what I'm thinking it might look like (doesn't actually work):
OneServer.js
var app = express();
app.configure(function() {
    // ...
    app.use('/', express.static(path.join(__dirname, 'public'))); // The static server
});
app.get('/', function(req, res) { // This is incorrect (expects HTTP requests)
    // Handle messages to/from Flash client
    var socket = req.connection;
    socket.setEncoding('utf8');
    socket.write('foo');
    socket.on('data', onMessageReceived);
});
app.listen(app.get('port')); // Listen in on a single port
But I'd like to be able to differentiate between web page requests and requests from the game.
Note: Actionscript's XMLSocket makes TCP requests, so using app.get('/') is incorrect for two reasons:
When Flash writes to the socket, it isn't using the http protocol, so app.get('/') will not be fired when the game tries to connect.
Since I don't have access to the correct net.Socket object, I cannot expect to be reading from or writing to the correct socket. Instead, I'll be reading/writing from/to the socket associated with the web page requests.
Any help on this would be much appreciated (especially if I'm reasoning about this the wrong way).
When a TCP connection is opened to a given port, the server (Node + Express) has no way of telling who made that connection (whether it's a browser or your custom client).
Therefore, your custom client must speak HTTP if it wishes to communicate with the Express server sitting on port 80. Otherwise, the data you send over a freshly opened socket (in your custom protocol) will just look like garbage to Express, and it will close the connection.
However, this doesn't mean you can't get a TCP stream to speak a custom protocol over – you just have to speak HTTP first and ask to switch protocols. HTTP provides a mechanism exactly to accomplish this (the Upgrade header), and in fact it is how WebSockets are implemented.
When your Flash client first opens a TCP connection to your server, it should send: (note line breaks MUST be sent as CRLF characters, aka \r\n)
GET /gamesocket HTTP/1.1
Upgrade: x-my-custom-protocol/1.0
Host: example.com
Cache-Control: no-cache
(blank line)
The value of Upgrade is your choice, Host MUST be sent with all HTTP/1.1 requests, and the Cache-Control header ensures that no intermediate proxy serves this request from cache. Notice the blank line, which indicates the request is complete.
The server responds:
HTTP/1.1 101 Switching Protocols
Upgrade: x-my-custom-protocol/1.0
Connection: Upgrade
(blank line)
Again, a blank line indicates the headers are complete, and after that final CRLF you are free to send any data you like, in any format, over the TCP connection.
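For illustration, here is roughly what the client side of this exchange looks like, sketched in Python (a Flash client would do the equivalent over its XMLSocket; the path and Upgrade value are the invented ones from above):
import socket

sock = socket.create_connection(("example.com", 80))
sock.sendall(
    b"GET /gamesocket HTTP/1.1\r\n"
    b"Upgrade: x-my-custom-protocol/1.0\r\n"
    b"Host: example.com\r\n"
    b"Cache-Control: no-cache\r\n"
    b"\r\n"
)

response = b""
while b"\r\n\r\n" not in response:   # read until the response headers end
    response += sock.recv(1024)

if b" 101 " in response.split(b"\r\n", 1)[0]:
    sock.sendall(b"anything goes from here on")   # raw custom protocol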
To implement the server side of this:
app.get('/gamesocket', function(req, res) {
    if (req.get('Upgrade') == 'x-my-custom-protocol/1.0') {
        res.writeHead(101, { Upgrade: req.get('Upgrade'), Connection: 'Upgrade' });
        // `req.connection` is the raw net.Socket object
        req.connection.removeAllListeners(); // make sure Express doesn't listen to the data anymore... we've got it from here!
        // now you can do whatever with the socket
        req.connection.setEncoding('utf8');
        req.connection.write('foo');
        req.connection.on('data', onMessageReceived);
    } else res.send(400); // bad request
});
Of course, remember that TCP is not a message-based protocol; it only provides a stream. The data events of a Socket can therefore fragment a single logical message across multiple events, or pack several logical messages into a single event. Be prepared to buffer data manually.
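A common way to handle that, sketched here in Python rather than ActionScript or JavaScript, is to length-prefix every logical message so the receiver can reassemble it no matter how TCP chunks the stream:
import struct

def send_message(sock, payload):
    # 4-byte big-endian length prefix, then the payload
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:              # a single recv may return less than asked
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return buf

def recv_message(sock):
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)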
Your other option here is to use socket.io, which implements a WebSockets server plus its own protocol on top of the WebSockets protocol. The WebSockets protocol is message-based; it mostly works just like what I've outlined here, except that after the HTTP negotiation it adds a message framing layer on top of the TCP connection, so that the application doesn't have to worry about the data stream. (Using WebSockets also opens the possibility of connecting to your server from an HTML page if necessary.)
There is a Flash socket.io client available.

Server response gets cut off half way through

I have a REST API that returns json responses. Sometimes (and what seems to be at completely random), the json response gets cut off half-way through. So the returned json string looks like:
...route_short_name":"135","route_long_name":"Secte // end of response
I'm pretty sure it's not an encoding issue, because the cut-off point keeps changing position depending on the JSON string that's returned. I haven't found a particular response size for which the cut-off happens either (I've seen 65 kB not get cut off, whereas 40 kB would).
Looking at the response header when the cut off does happen:
{
    "Cache-Control" = "must-revalidate, private, max-age=0";
    Connection = "keep-alive";
    "Content-Type" = "application/json; charset=utf-8";
    Date = "Fri, 11 May 2012 19:58:36 GMT";
    Etag = "\"f36e55529c131f9c043b01e965e5f291\"";
    Server = "nginx/1.0.14";
    "Transfer-Encoding" = Identity;
    "X-Rack-Cache" = miss;
    "X-Runtime" = "0.739158";
    "X-UA-Compatible" = "IE=Edge,chrome=1";
}
Doesn't ring a bell either. Anyone?
I had the same problem:
Nginx cut off some responses from the FastCGI backend. For example, I couldn't generate a proper SQL backup from PhpMyAdmin. I checked the logs and found this:
2012/10/15 02:28:14 [crit] 16443#0: *14534527 open()
"/usr/local/nginx/fastcgi_temp/4/81/0000004814" failed (13: Permission
denied) while reading upstream, client: *, server: , request:
"POST / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host:
"", referrer: "http://*/server_export.php?token=**"
All I had to do to fix it was to give proper permissions to the /usr/local/nginx/fastcgi_temp folder, as well as client_body_temp.
Fixed!
Thanks a lot samvermette, your Question & Answer put me on the right track.
Looked up my nginx error.log file and found the following:
13870 open() "/var/lib/nginx/tmp/proxy/9/00/0000000009" failed (13: Permission denied) while reading upstream...
It looks like nginx's proxy was trying to save the response content (passed in by thin) to a file. It only does so when the response size exceeds proxy_buffers (64 kB by default on 64-bit platforms). So in the end the bug was related to the size of my response.
I ended up fixing the issue by setting proxy_buffering to off in my nginx config file, instead of upping proxy_buffers or fixing the file permission issue.
I'm still not sure about the purpose of nginx's buffer; I'd appreciate it if anyone could expand on that. Is disabling buffering completely a bad idea?
I had a similar problem with responses from the server being cut off. It happened only when I added a JSON header before returning the response: header('Content-Type: application/json');
In my case gzip caused the issue.
I solved it by specifying gzip_types in nginx.conf and adding application/json to the list before turning gzip on:
gzip_types text/plain text/html text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript application/json;
gzip on;
It's possible you ran out of inodes, which prevents NginX from using the fastcgi_temp directory properly.
Try df -i and if you have 0% inodes free, that's a problem.
Try find /tmp -mtime +10 (older than 10 days) to see what might be filling up your disk.
Or maybe it's another directory with too many files. For example, go to /home/www-data/example.com and count the files:
find . -print | wc -l
Thanks for the question and the great answers; they saved me a lot of time. In the end, the answers from clement and sam helped me solve my issue, so the credit goes to them.
Just wanted to point out that after reading a bit about the topic, it seems disabling proxy_buffering is not recommended, since it can make your upstream server stall when clients (the users of your system) have a slow internet connection, for example.
I found this discussion very useful to understand more.
This example from Francis Daly made it very clear to me:
Perhaps it is easier to think of the full process as a chain of processes.
web browser talks to nginx, over a 1 MB/s link.
nginx talks to upstream server, over a 100 MB/s link.
upstream server returns 100 MB of content to nginx.
nginx returns 100 MB of content to web browser.
With proxy_buffering on, nginx can hold the whole 100 MB, so the nginx-upstream connection can be closed after 1 s, and then nginx can spend 100 s sending the content to the web browser.
With proxy_buffering off, nginx can only take the content from upstream at the same rate that nginx can send it to the web browser.
The web browser doesn't care about the difference -- it still takes 100 s for it to get the whole content.
nginx doesn't care much about the difference -- it still takes 100 s to feed the content to the browser, but it does have to hold the connection to upstream open for an extra 99 s.
Upstream does care about the difference -- what could have taken it 1 s actually takes 100 s; and for the extra 99 s, that upstream server is not serving any other requests.
Usually: the nginx-upstream link is faster than the browser-nginx link; and upstream is more "heavyweight" than nginx; so it is prudent to let upstream finish processing as quickly as possible.
We had a similar problem. It was caused by our REST server (Dropwizard) having SO_LINGER enabled. Under load, Dropwizard was disconnecting from nginx before it had a chance to flush its buffers. The JSON was >8 kB and the front end would receive it truncated.
I've also had this issue – client-side JSON parsing was faulty, the response was being cut off, or worse still, the response was stale and was read from some random memory buffer.
After going through some guides – Serving Static Content Via POST From Nginx, as well as Nginx: Fix to “405 Not Allowed” when using POST serving static – while trying to configure nginx to serve a simple JSON file, I found what worked in my case.
I had to use:
max_ranges 0;
(so that the browser doesn't get any funny ideas when nginx adds Accept-Ranges: bytes to the response headers), as well as
sendfile off;
in my server block for the proxy which serves the static files. Adding it to the location block which would finally serve the found JSON file didn't help.
Another pro tip for serving static JSON files: don't forget the response type:
charset_types application/json;
default_type application/json;
charset utf-8;
Other searches yielded folder permission issues (nginx is cutting the end of dynamic pages and caching it) or proxy buffering issues (Getting a chunked request through nginx), but those were not my case.