A program I'm trying to run (for a small side project) keeps crashing. Well, “crashing” isn't the right term—it technically doesn't crash, but calls exit() when certain errors occur. The error in question happens with the following code:
```
x = fcntl(fd, F_GETFL, &fl);
if (x < 0)
{
    syslog(LOG_ERR, "fcntl F_GETFL: FD %d: %s", fd, strerror(errno));
    exit(1);
}
```
and the error in question is:
```
fcntl F_GETFL: FD -1: Bad file descriptor
```
It's in a function called set_nonblock(), which takes a file descriptor (a reference to an open file) as a parameter and makes two calls to fcntl(); it's failing with an invalid file descriptor on the first call. So I check the code that calls set_nonblock(). There are only two locations where set_nonblock() is called, and in both cases the file descriptor is checked before the call, which would mean the file descriptor is being clobbered between the initial test and the call.
Not good.
So I add more logging, and run again (mind you, this is over the course of several days).
I finally get a location:
```
stp.c:233: failed assertion newsock >= 0
```
Okay, check the code:
```
int wait_for_connection(int s)
{
    int newsock;
    int len;
    struct sockaddr_in peer;
    ddt(s > -1);
    len = sizeof(struct sockaddr_in);
    newsock = accept(s, (struct sockaddr *) &peer, &len);
    /* dump_sockaddr (peer, len); */
    if (newsock < 0) {
        if (errno != EINTR)
            perror("accept");
    }
    get_hinfo_from_sockaddr(peer, len, client_hostname);
    ddt(newsock >= 0);    /* line 233 */
    set_nonblock(newsock);
    return (newsock);
}
```
Line 233 is the ddt(newsock >= 0) check. ddt() (a function I wrote) tests the condition and, if it is false, logs the failure via syslog() and exits the program. And I see the error. It's subtle, but it's there. The fragment:
```
newsock = accept(s, (struct sockaddr *) &peer, &len);
if (newsock < 0) {
    if (errno != EINTR)
        perror("accept");
}
```
is the culprit.
Under Unix, a system call (like accept()) can be interrupted, and if so, the call fails with an error code of EINTR. Why would a system call be interrupted? Well, say a program creates a child process (which this one does), and that child does its job and exits. The parent process (which created the child) is then “interrupted” with a signal (SIGCHLD) that says, in effect, “your child process has finished.” Normally, if a system call is interrupted, you want to try the system call again, only this code doesn't do that (although it looks like the author intended to call accept() again but forgot to write that code). Instead, it falls through with newsock still set to -1, which is how set_nonblock() ended up being handed a bad file descriptor.
Patch the code:
```
int wait_for_connection(int s)
{
    int newsock;
    int len;
    struct sockaddr_in peer;
    ddt(s > -1);
    do
    {
        len = sizeof(struct sockaddr_in);
        newsock = accept(s,(struct sockaddr *) &peer,&len);
        if (newsock < 0)
        {
            if (errno != EINTR)
            {
                perror("accept");
                return(-1);
            }
        }
    } while (newsock < 0);    /* only EINTR gets here: call accept() again */
    get_hinfo_from_sockaddr(peer,sizeof(struct sockaddr_in),client_hostname);
    set_nonblock(newsock);
    return(newsock);
}
```
and try again. Hopefully, this (and some other minor cleanup) will fix the problem.
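The same retry loop can also be pulled out into a standalone helper, which makes it easier to try outside the program. This is only a sketch: accept_retry() is a name invented here, it is not part of the program, and it uses socklen_t for the length, which is what accept() expects on current systems (the code above uses int).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: call accept() again whenever it is interrupted
   by a signal. Any other failure is passed back to the caller as -1
   with errno left as accept() set it. */
int accept_retry(int s, struct sockaddr *addr, socklen_t *addrlen)
{
    socklen_t wanted = (addrlen != NULL) ? *addrlen : 0;
    int newsock;

    do {
        /* Reset the length before every attempt, as the patched
           function does with len. */
        if (addrlen != NULL)
            *addrlen = wanted;
        newsock = accept(s, addr, addrlen);
    } while (newsock < 0 && errno == EINTR);

    return newsock;
}
```

A caller would still have to check for a negative return before passing the result to something like set_nonblock().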