Why is my tcp not reliable? (2009) (netherlabs.nl)
131 points by CarolineW on April 1, 2017 | 14 comments


It's a Berkeleyism. It's not TCP, it's old Berkeley Sockets semantics. The TCP protocol supports half-close. When you're done writing you're supposed to close your write side, then read until you get an EOF. But the Berkeley people didn't support that.

Linux does support it, with the "shutdown(2)" call.[1] When you're done writing, you shut down the write side. The other end sees an EOF, and they close. You read to EOF, and you close. If shutdown and close return normally, all data was delivered. Assuming this was implemented right.

[1] http://man7.org/linux/man-pages/man2/shutdown.2.html
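The sequence described above (write, shut down the write side, read to EOF, close) might look roughly like this in C. This is a minimal sketch, not code from the article: the helper names write_all, send_and_half_close and drain_until_eof are made up for illustration, and only the write(2), read(2) and shutdown(2) calls are real.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Step 1 (hypothetical helper): send everything, then close only the
 * write side. Once the peer has read our data, its next read() returns
 * 0 (EOF). */
int send_and_half_close(int fd, const char *buf, size_t len)
{
    if (write_all(fd, buf, len) < 0)
        return -1;
    return shutdown(fd, SHUT_WR);
}

/* Step 2 (hypothetical helper): read and discard until the peer closes
 * its side. Returns 0 on a clean EOF, -1 on error such as ECONNRESET. */
int drain_until_eof(int fd)
{
    char scratch[4096];
    for (;;) {
        ssize_t n = read(fd, scratch, sizeof scratch);
        if (n == 0)
            return 0;
        if (n < 0 && errno != EINTR)
            return -1;
    }
}
```

A sender would call send_and_half_close(), then drain_until_eof(), and only then close(fd). An error from the drain step is the signal that the peer may not have received everything.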


> It's a Berkeleyism. It's not TCP, it's old Berkeley Sockets semantics. [...] Linux does support it, with the "shutdown(2)" call.

The shutdown(2) system call was added to BSD on January 8th, 1983: https://svnweb.freebsd.org/csrg?view=revision&revision=10208

So... Berkeley Sockets supported this a bit more than 8 years before Linux even existed. This is totally not a Berkeley Sockets problem.


What you describe (the shutdown technique) is exactly what my linked blog post describes, by the way. Whether this is the fault of BSD or of the sockets API definition is indeed an interesting question. Is RFC 1122 a TCP/IP document or a BSD document? Anyhow, use the shutdown technique or, even better, an explicit length or chunked encoding.
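For the "explicit length" option, one possible shape is a length-prefixed frame, so the receiver can tell a complete message from a truncated one without relying on how the connection was closed. The sketch below assumes a 4-byte big-endian length header; that layout and the names read_exact, write_exact, send_frame and recv_frame are illustrative choices, not something the article prescribes.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly len bytes. Returns 0 on success, -1 on error or if the
 * stream ends before len bytes arrived. */
int read_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1;               /* EOF in the middle of a frame */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes. Returns 0 on success, -1 on error. */
int write_exact(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Assumed frame layout: 4-byte big-endian payload length, then payload. */
int send_frame(int fd, const void *payload, uint32_t len)
{
    uint32_t header = htonl(len);
    if (write_exact(fd, &header, sizeof header) < 0)
        return -1;
    return write_exact(fd, payload, len);
}

/* Returns the payload length, or -1 on error, truncation, or a frame
 * larger than the caller's buffer. */
long recv_frame(int fd, void *buf, uint32_t cap)
{
    uint32_t header;
    if (read_exact(fd, &header, sizeof header) < 0)
        return -1;
    uint32_t len = ntohl(header);
    if (len > cap)
        return -1;
    if (read_exact(fd, buf, len) < 0)
        return -1;
    return (long)len;
}
```

With this framing, a connection that dies after the header but before the full payload shows up as a -1 from recv_frame() rather than as a silently short message.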


So if we could go back in time, could we create an interface that is less ridden with pitfalls?


The interface could be better, but depending on what you're doing, you likely want a protocol ack, not a network ack anyway. Ex: If you send a file, you probably want to know that the file write succeeded, not just that the program that's supposed to write the file received it.

In today's world of janky TCP optimizing middleboxes, an ack just means you don't need to resend it, not that it arrived at the kernel of your peer; and it never meant the program at your peer read it.


Do we need to go back in time? Can't we just start now, in 2017, to build something better than what they settled on back in the spring of 1978? Don't we know more now than we did then? What are the forces that hold us back? If those forces are political, can we fight them?


There are already things like SCTP.



The nice thing is that we have applications that give us this. E.g. this is literally how ssh(1) works when used as a transport layer. (You have stdin, stdout, and also stderr). The only issue is some in-band signaling on stderr and error codes when SSH itself has problems.


I've always thought that this is ridiculous. Once a TCP socket reports that data has been sent, it should continue working to send it until every byte has been acked (with the exception of an unrecoverable failure case). Also, I wish there were an API to detect how much data still hasn't been acked.


> I wish there were an API to detect how much data still hasn't been acked.

From example code attached to the original article (Linux only):

  #include <sys/ioctl.h>
  #include <linux/sockios.h>  /* SIOCOUTQ */
  int outstanding;
  ioctl(fd, SIOCOUTQ, &outstanding);


You have to be pretty careful about what conclusions you draw from this info. It means relatively little about what your actual peer has seen. It does tell you something about the channel between you and your peer, though.

If you need detailed feedback, you need to implement a feature that closes the loop.


I guess that select() already had a meaning so it couldn't be used for this?


Very interesting read. Thanks for sharing.



