Why is writing a closed TCP socket worse than reading one? - language-agnostic

When you read from a closed TCP socket you get a regular error: read() either returns 0, indicating EOF, or -1 with an error code in errno that can be printed with perror.
However, when you write to a closed TCP socket the OS sends SIGPIPE to your app, which will terminate the app if the signal is not caught.
Why is writing to the closed TCP socket worse than reading from it?
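For concreteness, here is a small C sketch of the asymmetry. A local socketpair stands in for the TCP connection so the example is self-contained; with real TCP the SIGPIPE may only arrive on a later write, after the peer's RST has come back.
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int sv[2];

    /* A connected pair of stream sockets stands in for a TCP connection. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        perror("socketpair");
        return 1;
    }
    close(sv[1]);                                    /* the "peer" goes away */

    char buf[16];
    ssize_t n = read(sv[0], buf, sizeof buf);
    printf("read returned %zd (0 means EOF)\n", n);  /* prints 0 */

    /* Writing is the harsh case: the write below raises SIGPIPE, whose
     * default action kills the process, so the last line is never reached
     * unless SIGPIPE is caught or ignored. */
    (void)write(sv[0], "hello", 5);
    printf("write survived\n");
    return 0;
}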

+1 to Greg Hewgill for leading my thought process in the right direction to find the answer.
The real reason for SIGPIPE in both sockets and pipes is the filter idiom/pattern that underlies typical I/O in Unix systems.
Starting with pipes: filter programs like grep typically write to STDOUT and read from STDIN, either of which may be redirected by the shell to a pipe. For example:
cat someVeryBigFile | grep foo | doSomeThingErrorProne
When the shell forks and then execs these programs, it typically uses the dup2 system call to redirect STDIN, STDOUT and STDERR to the appropriate pipes.
Since the filter program grep doesn't know, and has no way of knowing, that its output has been redirected, the only way to tell it to stop writing to a broken pipe if doSomeThingErrorProne crashes is with a signal, because the return values of writes to STDOUT are rarely if ever checked.
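A minimal grep-like filter, roughly sketched in C, makes the point: nothing in the loop ever looks at whether standard output is still connected to anything, so without SIGPIPE it would happily run to the end of its input even after the downstream process had died.
#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[4096];

    while (fgets(line, sizeof line, stdin) != NULL) {
        if (strstr(line, "foo") != NULL)
            fputs(line, stdout);   /* return value deliberately ignored */
    }
    return 0;
}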
The analog with sockets would be the inetd server taking the place of the shell.
As an example, you could turn grep into a network service operating over TCP sockets. With inetd, if you want to have a grep server on TCP port 8000, add this to /etc/services:
grep 8000/tcp # grep server
Then add this to /etc/inetd.conf:
grep stream tcp nowait root /usr/bin/grep grep foo
Send SIGHUP to inetd and connect to port 8000 with telnet. This should cause inetd to fork, dup the socket onto STDIN, STDOUT and STDERR, and then exec grep with foo as an argument. If you start typing lines into telnet, grep will echo back the lines that contain foo.
Now replace grep with a program named ticker that, for instance, writes a stream of real-time stock quotes to STDOUT and takes commands on STDIN. Someone telnets to port 8000 and types "start java" to get quotes for Sun Microsystems. Then they get up and go to lunch, and telnet inexplicably crashes. If there were no SIGPIPE to send, ticker would keep sending quotes forever, never knowing that the process on the other end had crashed, and needlessly wasting system resources.

Usually if you're writing to a socket, you would expect the other end to be listening. This is sort of like a telephone call - if you're speaking, you wouldn't expect the other party to simply hang up the call.
If you're reading from a socket, then you're expecting the other end to either (a) send you something, or (b) close the socket. Situation (b) would happen if you've just sent something like a QUIT command to the other end.

Think of the socket as a big pipeline of data between the sending and the receiving process. Now imagine that the pipeline has a valve that is shut (the socket connection is closed).
If you're reading from the socket (trying to get something out of the pipe), there's no harm in trying to read something that isn't there; you just won't get any data out. In fact, you may, as you said, get an EOF, which is correct, as there's no more data to be read.
However, writing to this closed connection is another matter. Data won't go through, and you may wind up dropping some important communication on the floor. (You can't send water down a pipe with a closed valve; if you try, something will probably burst somewhere, or, at the very least, the back pressure will spray water all over the place.) That's why there's a more powerful tool to alert you to this condition, namely, the SIGPIPE signal.
You can always ignore or block the signal, but you do so at your own risk.
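If you do take on that risk, the usual pattern (a rough C sketch, not tied to any particular application) is to ignore the signal and handle the EPIPE error from write() instead:
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Opt out of the default "kill the process" behaviour... */
void ignore_sigpipe(void)
{
    signal(SIGPIPE, SIG_IGN);
}

/* ...after which every write must be prepared to see EPIPE. */
int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;                 /* interrupted, retry */
            if (errno == EPIPE)
                fprintf(stderr, "peer went away, stopping\n");
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}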

I think a large part of the answer is 'so that a socket behaves rather similarly to a classic Unix (anonymous) pipe'. Those also exhibit the same behaviour - witness the name of the signal.
So it is then reasonable to ask why pipes behave that way. Greg Hewgill's answer gives a summary of the situation.
Another way of looking at it is: what is the alternative? Should a read() on a pipe with no writer give a SIGPIPE signal? The meaning of SIGPIPE would have to change from 'write on a pipe with no one to read it', of course, but that's trivial. There's no particular reason to think it would be better; the EOF indication (zero bytes to read; zero bytes read) is a perfect description of the state of the pipe, and so the behaviour of read is good.
What about write()? Well, an option would be to return the number of bytes written - zero. But that is not a good idea; it implies that the code should try again and maybe more bytes would be sent, which is not going to be the case. Another option would be an error: write() returns -1 and sets an appropriate errno. It isn't clear that there is one. EINVAL or EBADF are both inaccurate: the file descriptor is correct and open at this end (and should be closed after the failing write); there just isn't anything to read what is written. EPIPE means 'broken pipe', so, with a caveat about "this is a socket, not a pipe", it would be the appropriate error; it is in fact the errno you get if you ignore SIGPIPE.
It would be feasible to do this - just return an appropriate error when the pipe is broken (and never send the signal). However, it is an empirical fact that many programs do not pay much attention to where their output is going, and if you pipe a command that will read a multi-gigabyte file into a process that quits after the first 20 KB without checking the status of its writes, it will take a long time to finish and will waste machine effort while doing so, whereas by sending it a signal that it is not ignoring, it stops quickly - which is definitely advantageous. And you can still get the error if you want it. So sending the signal benefits the OS in the context of pipes, and sockets emulate pipes rather closely.
Interesting aside: while checking the documentation for SIGPIPE, I found this socket option:
#define SO_NOSIGPIPE 0x1022 /* APPLE: No SIGPIPE on EPIPE */
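On platforms that have that option you can set it with setsockopt; Linux instead offers a per-call MSG_NOSIGNAL flag for send. A rough sketch (the helper name quiet_send is mine, not from any of the answers above):
#include <sys/socket.h>
#include <sys/types.h>

/* Send without risking SIGPIPE: SO_NOSIGPIPE exists on macOS and some BSDs,
 * MSG_NOSIGNAL is the per-call Linux equivalent.  With either in effect, a
 * dead peer shows up as -1 with errno == EPIPE instead of a fatal signal. */
ssize_t quiet_send(int fd, const void *buf, size_t len)
{
#if defined(SO_NOSIGPIPE)
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof on);
    return send(fd, buf, len, 0);
#elif defined(MSG_NOSIGNAL)
    return send(fd, buf, len, MSG_NOSIGNAL);
#else
    return send(fd, buf, len, 0);   /* fall back: caller must handle SIGPIPE */
#endif
}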

Related

Using write access in Open command in TCL

How can I use write ('w') and read ('r') access while using a command pipeline with the open command in Tcl?
When I do something like:
set f1 [open "| ls -l" w]
it returns a file descriptor to write to, say file1.
Now I am confused about how I can put this file descriptor to use.
PS: My example might be wrong; in that case it'd be ideal if the answer includes a programming example so that it'll be clearer.
Thanks
In general, the key things you can do with a channel are write to it (using puts), read from it (using gets and read), and close it. Obviously, you can only write to it if it is writable, and only read from it if it is readable.
When you write to a channel that is implemented as a pipeline, you send data to the program on the other end of the pipe; that's usually consuming it as its standard input. Not all programs do that; ls is one of the ones that completely ignores its standard input.
But the other thing you can do, as I said above, is close the channel. When you close a pipeline, Tcl waits for all the subprocesses to terminate (if they haven't already) and collects their standard error output, which becomes an error message from close if there is anything. (The errors are just like those you can get from calling exec; the underlying machinery is shared.)
There's no real point in running ls in a pure writable pipeline, at least not unless you redirect its output. Its whole purpose is to produce output (the sorted list of files, together with extra details with the -l option). If you want to get the output, you'll need a readable channel (readable from the perspective of Tcl): open "| ls -l" r. Then you'll be able to use gets $f1 to read a line from the subprocess.
But since ls is entirely non-interactive and almost always has a very quick running time (unless your directories are huge or you pass the options to enable recursion), you might as well just use exec. This does not apply to other programs. Not necessarily anyway; you need to understand what's going on.
If you want to experiment with pipelines, try using sort -u as the subprocess. That takes input and produces output, and exhibits all sorts of annoying behaviour along the way! Understanding how to work with it will teach you a lot about how program automation can be tricky despite it really being very simple.

How to get immediate output in my IDE from a Tcl script?

I have just started to use the Tcl language and I need to create a script with several functions triggered every 2 seconds. I have been searching for an answer on the internet and found several topics about it. For instance, I found this code on Stack Overflow (How do I use "after ms script" in TCL?):
#!/usr/bin/tclsh
proc async {countdown} {
    puts $countdown
    incr countdown -1
    if {$countdown > 0} {
        after 1000 "async $countdown"
    } else {
        after 1000 {puts Blastoff!; exit}
    }
}
async 5
# Don't exit the program and let async requests
# finish.
vwait forever
This code could very easily be adapted to what I want to do, but it doesn't work on my computer. When I copy-paste it into my IDE, the code waits several seconds before giving all the output in one go.
I had the same problem with the other code I found on the internet.
It would be great if someone could help me.
Thanks a lot in advance.
I've just pasted the exact script that you gave into a tclsh (specifically 8.6) running in a terminal on macOS, and it works. I would anticipate that your script will work on any version from about Tcl 7.6 onwards, which is going back nearly 25 years.
It sounds instead like your IDE is somehow causing the output to be buffered. Within your Tcl script, you can probably fix that by either putting flush stdout after each puts call, or by the (much easier) option of putting this at the start of your script:
fconfigure stdout -buffering line
# Or do this if you're using partial line writes:
# fconfigure stdout -buffering none
The issue is that Tcl (in common with many other programs) detects whether its standard output is going to a terminal or some other destination (file or pipe or socket or …). When output is to a terminal, it sets the buffering mode to line; otherwise it is set to full. (By contrast, stderr always has none buffering by default so that whatever errors occur make it out before a crash; there's nothing worse than losing debugging info by default.) When lots of output is being sent, it doesn't matter (the buffer is only a few kB long, and this is a very good performance booster), but it's not what you want when only writing a very small amount at a time. It sounds like the IDE is doing something (probably using a pipe) that's causing the guess to be wrong.
(The tcl_interactive global variable is formally unrelated; that's set when there's no script argument. The buffering rule applies even when you give a script as an argument.)
The truly correct way for the IDE to fix this, at least on POSIX systems, is for it to use a virtual terminal to run scripts instead of a pipeline. But that's a much more complex topic!
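For what it's worth, the terminal-detection heuristic described above is easy to see in plain C as well; this is only an illustration of the general mechanism, not of what Tcl's implementation actually does internally:
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Pick a buffering mode the way many runtimes do: line-buffered when
     * talking to a terminal, fully buffered otherwise. */
    if (isatty(fileno(stdout)))
        setvbuf(stdout, NULL, _IOLBF, BUFSIZ);   /* terminal: flush at each newline */
    else
        setvbuf(stdout, NULL, _IOFBF, BUFSIZ);   /* pipe/file: flush when the buffer fills */

    printf("tick\n");
    /* When output goes through a pipe (as under many IDEs), the line above
     * may sit in the buffer until the program exits, unless we flush. */
    fflush(stdout);
    return 0;
}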

Exception handling best practices inside gen_server module

I just started learning Erlang and this is a module from a test project of mine. I'm doing it so that I can understand a little better how the supervision tree works, to practice fail-fast code and some programming best practices.
The udp_listener process listens for UDP messages. Its role is to listen for communication requests from other hosts on the network and to contact them through TCP using the port number defined in the UDP message.
The handle_info(...) function is called every time a UDP message is received by the socket; it decodes the UDP message and passes it to the tcp_client process.
From what I understood, the only failure point in my code is the decode_udp_message(Data) call made inside handle_info(...).
When this function fails, is the whole udp_listener process restarted? Should I keep this from happening?
Shouldn't just the handle_info(...) call silently die without affecting the udp_listener process?
How should I log an exception in decode_udp_message(Data)? I would like to record somewhere the host and its failed message.
-module(udp_listener).
-behaviour(gen_server).
-export([init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).

%% ====================================================================
%% API functions
%% ====================================================================
-export([start_link/1]).

start_link(Port) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, Port, []).

%% ====================================================================
%% Behavioural functions
%% ====================================================================
%% init/1
%% ====================================================================
-spec init(Port :: non_neg_integer()) -> Result when
      Result :: {ok, Socket :: port()}
              | {stop, Reason :: term()}.
%% ====================================================================
init(Port) ->
    SocketTuple = gen_udp:open(Port, [binary, {active, true}]),
    case SocketTuple of
        {ok, Socket} -> {ok, Socket};
        {error, eaddrinuse} -> {stop, udp_port_in_use};
        {error, Reason} -> {stop, Reason}
    end.

% Handles "!" messages from the socket
handle_info({udp, Socket, Host, _Port, Data}, State) -> Socket = State,
    handle_ping(Host, Data),
    {noreply, Socket}.

terminate(_Reason, State) -> Socket = State,
    gen_udp:close(Socket).

handle_cast(_Request, State) -> {noreply, State}.
handle_call(_Request, _From, State) -> {noreply, State}.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

%% ====================================================================
%% Internal functions
%% ====================================================================
handle_ping(Host, Data) ->
    PortNumber = decode_udp_message(Data),
    contact_host(Host, PortNumber).

decode_udp_message(Data) when is_binary(Data) ->
    % First 16 bits == Port number
    <<PortNumber:16>> = Data,
    PortNumber.

contact_host(Host, PortNumber) ->
    tcp_client:connect(Host, PortNumber).
Result
I've changed my code based on your answers; decode_udp_message is gone because handle_ping does what I need.
handle_ping(Host, <<PortNumber:16>>) ->
    contact_host(Host, PortNumber);
handle_ping(Host, Data) ->
    %% Here I'll log the invalid datagrams but the process won't be restarted
I like the way it is now; by adding the following code I could handle protocol changes in the future without losing backwards compatibility with old servers:
handle_ping(Host, <<PortNumber:16, Foo:8, Bar:32>>) ->
    contact_host(Host, PortNumber, Foo, Bar);
handle_ping(Host, <<PortNumber:16>>) ->
    ...
@Samuel-Rivas
tcp_client is another gen_server with its own supervisor; it will handle its own failures.
-> Socket = State is now only present in the terminate function. gen_udp:close(Socket) is easier on the eyes.
I think that "let it crash" has often been misinterpreted as "do not handle errors" (a much stronger and stranger suggestion). And the answer to your question ("should I handle errors or not") is "it depends".
One concern with error handling is the user experience. You're never going to want to throw a stack trace and supervision tree at your users. Another concern, as Samuel Rivas points out, is that debugging from just a crashed process can be painful (especially for a beginner).
Erlang's design favors servers with non-local clients. In this architecture, the clients must be able to handle the server suddenly becoming unavailable (your wifi connection drops just as you click the "post" button on S.O.), and the servers must be able to handle sudden drop-outs from clients. In this context, I would translate "let it crash" as "since all parties can handle the server vanishing and coming back, why not use that as the error handler? Instead of writing tons of lines of code to recover from all the edge-cases (and then still missing some), just drop all the connections and return to a known-good state."
The "it depends" comes in here. Maybe it's really important to you to know who sent the bad datagram (because you're also writing the clients). Maybe the clients always want a reply (hopefully not with UDP).
Personally, I begin by writing the "success path", which includes both genuine success and the errors that I want to show clients. Everything that I didn't think of, or that clients don't need to know about, is then handled by the process restarting.
Your decode_udp_message is not the only point of failure. contact_host can most likely fail too, but you are either ignoring the error tuple or handling that failure in your tcp_client implementation.
That aside, your approach to error handling would work provided that your udp_listener is started by a supervisor with the correct strategy. If Data is not exactly 16 bits then the match will fail and the process will crash with a badmatch exception. Then the supervisor will start a new one.
Many online style guides advertise just that style. I think they are wrong. Even though failing right away there is just what you want, it doesn't mean you cannot provide a better reason than badmatch. So I would write some better error handling there. Usually I would throw an informative tuple, but for gen_servers that is tricky because they wrap every call within a catch, which will turn throws into valid values. That is unfortunate, but it is a topic for another long explanation, so for practical purposes I will use errors here. A third alternative is just to use error tuples ({ok, Blah} | {error, Reason}); however, that gets complicated fast. Which option to use is also a topic for a long explanation/debate, so for now I'll just continue with my own approach.
Getting back to your code: if you want proper and informative error management, I would do something along these lines with the decode_udp_message function (preserving your current semantics; see the end of this answer, since I think they are not what you wanted):
decode_udp_message(<<PortNumber:16>>) ->
    PortNumber;
decode_udp_message(Other) ->
    %% You could log here if you want, or live with the crash message if that is good enough for you
    erlang:error({invalid_udp_message, {length, byte_size(Other)}}).
As you have said, this will take the entire UDP connection with it. If the process is restarted by a supervisor, then it will reconnect (which will probably cause problems unless you use the reuseaddr sockopt). That will be fine unless you are planning to fail many times per second and opening the connection becomes a burden. If that is the case you have several options.
Assume that you can control all your points of failure and handle errors there without crashing. For example, in this scenario you could just ignore malformed messages. This is probably fine in simple scenarios like this, but is unsafe as it is easy to overlook points of failure.
Separate the concerns that you want to keep fault tolerant. In this case I would have one process to hold the connections and another one to decode the messages. For the latter you could use a "decoding server" or spawn one per message depending on your preferences and the load you are expecting.
Summary:
Failing as soon as your code finds something outside what is the normal behaviour is a good idea, but remember to use supervisors to restore functionality
Just "let it crash" is a bad practice in my experience; you should strive for clear error reasons, which will make your life easier when your systems grow
Processes are your tool to isolate the scope for failure recovery, if you don't want one system to be affected by failures/restarts, just spawn off processes to handle the complexity that you want to isolate
Sometimes performance gets in the way and you'll need to compromise and handle errors in place instead of let processes crash, but as usual, avoid premature optimisation in this sense
Some notes about your code unrelated to error handling:
Your comment in the decode_udp_message seems to imply that you want to parse the first 16 bits, but you are actually forcing Data to be exactly 16 bits.
In some of your callbacks you do something like -> Socket = State; that indentation is probably bad style, and the renaming of the variable is somewhat unnecessary. You can either just change State to Socket in the function head or, if you want to make it clear that your state is a socket, write your function head like ..., Socket = State) ->

Disabling Nagle's Algorithm under Action Script 3

I've been working with AS3 sockets, and I noticed that small packets are 'Nagled' when sent. I tried to find a way to set NoDelay for the socket, but I didn't find a clue, even in the documentation. Is there another way to turn Nagle's algorithm off in AS3 TCP sockets?
You can tell Flash to send out the data through the socket using the flush() method on the Socket object.
Flushes any accumulated data in the socket's output buffer.
That said, Flash does what it thinks is best and may not want to send your data too often. Still, that shouldn't add more than a few milliseconds.
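For comparison, disabling Nagle's algorithm at the BSD-socket level looks like the C sketch below; AS3's Socket class does not expose socket options, so there is no direct equivalent in Flash.
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm on an already-created TCP socket so that small
 * writes are sent immediately instead of being coalesced. */
int disable_nagle(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}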

What does End Of File on a socket mean?

Using Action Script 3 in Flex Builder 3.
When handling a SOCKET_DATA event, I occasionally, seemingly at random, get an "Error #2030: End of file was encountered." when calling socket.readInt(). I'm confused as to what this error means, since I'm not reading a file. I'm a little unfamiliar with sockets. Thanks.
An end-of-file error typically means the other side of the socket has closed their connection, IIRC.
The reason it's end-of-file is that at a very low level within a program, a file on the disk and a socket are both represented with a number -- a file descriptor -- that the OS translates into the object representing a file or socket or pipe or whatever.
Usually, you can avoid this kind of error by checking if you just read in an EOF. If you did read an EOF, and you try reading from the socket/file again, then you will get an EOF error.
Update: According to the ActionScript 3.0 documentation, you do indeed get a close event if the other end closes the socket.
When reading from a socket that is closed, you will get: Error #2002: Operation attempted on invalid socket.
End-of-file errors typically occur on any bytestreams, if you read more bytes than available. This is the case for files, sockets, etc. In the case of flash, it occurs when reading from a Socket or a ByteArray and maybe even in other cases.
TCP/IP is package based, but emulates a stream, thus you can only read the data off the stream, that was already sent to you with TCP packages. Check Socket::bytesAvailable to find out, how many bytes are currently available. Always keep in mind, that the data you write to the socket in one operation, may arrive in multiple packages, each very probably causing flash player to trigger socketData events.