
Posted (edited)

Hi,
 
I noticed that the server dies every 2 days or so.
The trace log shows these lines every 10 seconds:
2016-04-23 22:51:39 Error: NetworkModule [Thread 7fbab4ff9700]: remote_endpoint: Bad file descriptor
2016-04-23 22:51:49 Error: NetworkModule [Thread 7fbab77fe700]: remote_endpoint: Bad file descriptor
2016-04-23 22:51:49 Error: NetworkModule [Thread 7fbab4ff9700]: remote_endpoint: Bad file descriptor
 
The last entries in last-error.html:
 
 

Scope	Time	Text
NetworkModule	2016-Apr-22 17:09:32	remote_endpoint: Bad file descriptor
SchedulerModule	2016-Apr-22 17:09:35	Received message: RegisterSleepEvent
NetworkModule	2016-Apr-22 17:09:42	No descriptors available. Active session count:2
NetworkModule	2016-Apr-22 17:09:42	remote_endpoint: Bad file descriptor
NetworkModule	2016-Apr-22 17:09:52	No descriptors available. Active session count:2
NetworkModule	2016-Apr-22 17:09:52	remote_endpoint: Bad file descriptor
CCleanupModule	2016-Apr-22 17:09:55	Initiating calculation of status snapshots
CCleanupModule	2016-Apr-22 17:09:55	Finished calculation of status snapshots
SchedulerModule	2016-Apr-22 17:09:55	Received message: RegisterSleepEvent
AutomationModule	2016-Apr-22 17:10:00	ReportManager: ProcessWatchdogCheck: Watchdog reports normal behaviour.
ConsoleApiModule	2016-Apr-22 17:10:00	Session data cleanup timeout.
CReplicationModule	2016-Apr-22 17:10:00	CStepProcessor: Server state changed to OK
NetworkModule	2016-Apr-22 17:10:02	No descriptors available. Active session count:2
NetworkModule	2016-Apr-22 17:10:02	remote_endpoint: Bad file descriptor
NetworkModule	2016-Apr-22 17:10:02	Socket accepted. Remote ip address: 178.191.54.94 remote port: 55989
NetworkModule	2016-Apr-22 17:10:02	Resolving ip address: 178.191.54.94
NetworkModule	2016-Apr-22 17:10:02	Receiving ip address: 178.191.54.94 from cache
NetworkModule	2016-Apr-22 17:10:02	Successfully received ip address: 178.191.54.94 from cache
NetworkModule	2016-Apr-22 17:10:02	remote_endpoint: Transport endpoint is not connected
NetworkModule	2016-Apr-22 17:10:02	Socket accepted. Remote ip address: 213.47.170.12 remote port: 54350
NetworkModule	2016-Apr-22 17:10:02	Resolving ip address: 213.47.170.12
NetworkModule	2016-Apr-22 17:10:02	Receiving ip address: 213.47.170.12 from cache
NetworkModule	2016-Apr-22 17:10:02	Successfully received ip address: 213.47.170.12 from cache
NetworkModule	2016-Apr-22 17:10:02	Socket connection (isClientConnection:0) established for id 32342
NetworkModule	2016-Apr-22 17:10:02	Socket connection (isClientConnection:0) established for id 32343
NetworkModule	2016-Apr-22 17:10:02	Connection closed by remote peer for session id 32342
NetworkModule	2016-Apr-22 17:10:02	Connection closed by remote peer for session id 32343
NetworkModule	2016-Apr-22 17:10:02	Forcibly closing sessionId:32342, isClosing:0
NetworkModule	2016-Apr-22 17:10:02	Removing session 32342
NetworkModule	2016-Apr-22 17:10:02	Closing connection , session id:32342
NetworkModule	2016-Apr-22 17:10:02	Forcibly closing sessionId:32343, isClosing:0
NetworkModule	2016-Apr-22 17:10:02	Removing session 32343
NetworkModule	2016-Apr-22 17:10:02	Closing connection , session id:32343
NetworkModule	2016-Apr-22 17:10:03	No descriptors available. Active session count:2
NetworkModule	2016-Apr-22 17:10:03	remote_endpoint: Bad file descriptor
NetworkModule	2016-Apr-22 17:10:03	Socket accepted. Remote ip address: 213.47.170.12 remote port: 55358
NetworkModule	2016-Apr-22 17:10:03	Resolving ip address: 213.47.170.12
NetworkModule	2016-Apr-22 17:10:03	Receiving ip address: 213.47.170.12 from cache
NetworkModule	2016-Apr-22 17:10:03	Successfully received ip address: 213.47.170.12 from cache
NetworkModule	2016-Apr-22 17:10:03	remote_endpoint: Transport endpoint is not connected

 
I also notice a lot of the following warnings and errors in the tracelog:
 
2016-04-21 16:32:04 Warning: NetworkModule [Thread 7fbab7fff700]: The connection will be closed due to timeout. Resolved endpoint is NULL
2016-04-21 16:32:05 Error: NetworkModule [Thread 7fbab6ffd700]: Error reported by JobScheduler[Name:Dns job scheduler for not network operation]. Error message is:resolve: Host not found
 
Installed ERA Server 6.3.148.0
Agent on Server 6.3.148.0
Debian (64-bit), Version 8.4
 
I hope you can help me...
 
Kind regards

Edited by tobiasperschon
  • ESET Staff
Posted (edited)

It seems you have two (possibly related) issues. The second one you mention is a problem with reverse DNS resolution that may lead to client connections being rejected. It is a known issue, and you can use the fixed libraries available from here (the package also contains fixes for another issue, and I would suggest replacing both libraries).
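
To see whether reverse DNS fails for the client addresses that show up in the log, a quick check outside the server can help. The following is only a minimal sketch: the helper name `reverse_lookup` is made up for illustration, the two addresses are the ones from the log excerpt above, and it is an assumption that the server's own resolver behaves the same way as the system resolver used here.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname for an IP address, or None if reverse DNS fails.

    `resolver` is injectable so the function can be exercised without
    network access; by default the system resolver is used.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # Raised when the resolver has no answer (e.g. "Host not found").
        return None


if __name__ == "__main__":
    # Addresses taken from the log excerpt in the first post.
    for ip in ("178.191.54.94", "213.47.170.12"):
        name = reverse_lookup(ip)
        print(ip, "->", name if name is not None else "no reverse DNS entry")
```

An address that prints "no reverse DNS entry" here would be a candidate for the "resolve: Host not found" errors in the trace log.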

 

The other problems are caused by network socket exhaustion on your system. I would suggest using the lsof command to analyze which process and which type of file descriptors/sockets is causing this. Since the server crashes after about 2 days, I would wait at least a day before running the command, so that a possible leak is visible.
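
If lsof is not installed, roughly the same overview can be obtained on Linux by reading /proc directly. This is a minimal sketch, not a replacement for lsof: the function names are made up for illustration, and it assumes a Linux system where /proc/<pid>/fd is readable (inspecting another user's process normally requires root).

```python
import os
import sys
from collections import Counter


def classify_fd_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse descriptor type."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    if target.startswith("/"):
        return "file"
    return "other"


def count_fds_by_type(pid):
    """Count the open descriptors of one process, grouped by type (Linux only)."""
    fd_dir = "/proc/{}/fd".format(pid)
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        counts[classify_fd_target(target)] += 1
    return counts


if __name__ == "__main__":
    # Pass the PID of the process to inspect; defaults to this script itself.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    for kind, number in count_fds_by_type(pid).most_common():
        print("{}\t{}".format(kind, number))
```

Running it against the server's PID a few hours apart and comparing the "socket" count shows whether descriptors are leaking over time.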

Edited by MartinK
Posted (edited)

I applied the fix. I will check whether the second problem is gone, and I will also watch the server to see if the first problem occurs again. Interestingly, the server had been running fine for over a year. I ran apt-get upgrade many times during that year, but the first problem I mentioned only started recently. (I did not add more clients...)

 

Anyway, we'll see, and I will get back to you! Thanks for the advice so far!

 

Update:

So far no more errors... let's see how long the server keeps running.

 

Update2:

The patch seems to have fixed both issues. Server is running fine now.

Edited by tobiasperschon