
socket handling

This page could also cover IPv6, secure sockets and so on; gather documents about those topics here too.

For now, some problems:

bad file descriptor

This happens when you stop a program that is listening on a socket (or it crashes) and start it up again shortly afterwards.

If you don't give it enough time, it will start, but any connect will fail with a bad file descriptor and crash the server.

The cause, and the SO_REUSEADDR socket option that avoids it, are explained here:

https://stackoverflow.com/questions/3229860/what-is-the-meaning-of-so-reuseaddr-setsockopt-option-linux
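
As a rough illustration of the option discussed in that link, here is a minimal sketch of a listener that sets SO_REUSEADDR before binding. The language of the broker is not stated on this page, so Python is used here only for illustration, and the name make_listener and its defaults are hypothetical, not taken from the broker code.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=5):
    """Create a listening TCP socket that can rebind shortly after a restart.

    Hypothetical helper for illustration; port=0 lets the OS pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind(): allows binding while connections from
        # the previous run are still lingering in TIME_WAIT on this port.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(backlog)
    except OSError:
        # Fail loudly here instead of carrying on with an unusable socket.
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    listener = make_listener()
    print("listening on", listener.getsockname())
    listener.close()
```

Whatever the language, the point that probably matters most is checking the result of the bind and listen steps, so that a failed start is reported at startup and not at the first connect.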

Socket connection fail

This error can occur in multiple ways, but it seems to always crash the broker/server (not the clients).

Connection closed
accept(): Bad file descriptor
error:00000000:lib(0)::reason(0)Nope.. no accept

This one is more to the point:

Connection to database failed: connection to server at "127.0.0.1", port 5432 failed: could not create socket: Too many open files

Less frequently:

"Could Not Translate db-postgres to Address"
which is probably also because no socket was available.

So the cause is indeed too many open files. You can check this while the server and worker are running; the worker opens a connection every few seconds.

ps -elf | grep worker # to find the pid
netstat --all --program | grep <pid>
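
As an alternative to reading the netstat output by eye, the descriptors of a process can be counted directly. The following is a small sketch, assuming Linux (it reads /proc/<pid>/fd) and enough permissions to inspect the target process; the function name count_open_fds is made up for this page.

```python
import os
import sys


def count_open_fds(pid="self"):
    """Return (total descriptors, socket descriptors) of a process. Linux only."""
    fd_dir = f"/proc/{pid}/fd"
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        total += 1
        # Socket descriptors show up as links of the form "socket:[inode]".
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets


if __name__ == "__main__":
    # Usage: python3 count_fds.py <pid>   (no argument: inspect this process)
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    total, sockets = count_open_fds(pid)
    print(f"pid {pid}: {total} open descriptors, {sockets} of them sockets")
```

Running this every few seconds against the worker's pid and watching whether the socket count only ever goes up gives the same information as the netstat check.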

If the list of open sockets keeps growing, the program is not closing each socket when it is done with it. You can test whether this is the cause by lowering the per-user open file limit:

ulimit -n 10

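Note that this is very low; with 5 the program won't even start. Also, you cannot raise the limit again in the same shell: without -S or -H, ulimit sets both the soft and the hard limit, and an unprivileged user cannot raise a hard limit. Either lower only the soft limit with ulimit -S -n 10, or just start a new shell and it will be 1024 again.

With this limit the error occurs after about 5 connects. The broker code has since been fixed and now keeps running with this low setting, which also indicates that the other services do not have this problem. Repeat this ulimit test once the full system is running, on all servers and clients as well.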
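
The same experiment can be sketched inside a single process, which makes the effect of closing sockets easy to see. This is only an illustration under assumptions: it runs on a Unix-like system, it lowers the soft limit of the current process only (so it can be restored afterwards), and it creates unconnected sockets where the real worker opens connections. The name sockets_until_failure and the default values are made up.

```python
import errno
import resource
import socket


def sockets_until_failure(close_after_use, attempts=100, soft_limit=64):
    """Create up to `attempts` sockets under a lowered descriptor limit.

    Returns how many sockets were created before "Too many open files"
    (EMFILE) occurred, or `attempts` if it never occurred.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = soft_limit
    if old_soft != resource.RLIM_INFINITY:
        new_soft = min(soft_limit, old_soft)
    # Lower only the soft limit; the hard limit stays, so we can restore it.
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    leaked = []
    created = 0
    try:
        for _ in range(attempts):
            try:
                sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            except OSError as exc:
                if exc.errno == errno.EMFILE:
                    break  # the error reported in the logs above
                raise
            created += 1
            if close_after_use:
                sock.close()         # descriptor is returned immediately
            else:
                leaked.append(sock)  # simulate forgetting to close
    finally:
        for sock in leaked:
            sock.close()
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return created


if __name__ == "__main__":
    print("without close:", sockets_until_failure(close_after_use=False))
    print("with close:   ", sockets_until_failure(close_after_use=True))
```

Without the close the loop stops well short of the requested number of attempts; with it, all attempts succeed under the same limit. In Python, wrapping each connection in a with block closes the socket automatically, even when an error occurs in between.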