Dmitry

SMC7: The maximum number of open file descriptors is reached, Agent v7 can't access ESMC

We have ESMC v7.0.471.0 installed as a Virtual Appliance on Hyper-V Server.

Everything worked well while we had fewer than 1200 client agents.

After installing new agents (about 1600 in total), ESMC shows a problem in the Dashboard: "The maximum number of open file descriptors is reached".

About 400 to 500 client agents cannot replicate with the ESMC server.

Restarting ERAServer or rebooting helps for a while and clients can replicate again, but after 30-45 minutes the problem comes back.

The client agents' replication interval is 30 minutes.

Agent version: 7.0.577.0

 

SMC client agent trace.log:

2019-02-11 15:49:39 Error: CUpdatesModule [Thread f1c]: PerformUpdate: Module update failed with error: Could not connect to server. (error code 8449)
2019-02-11 17:49:39 Error: CUpdatesModule [Thread 1008]: PerformUpdate: Module update failed with error: Could not connect to server. (error code 8449)
2019-02-11 19:49:39 Error: CUpdatesModule [Thread 14bc]: PerformUpdate: Module update failed with error: Could not connect to server. (error code 8449)
2019-02-11 21:13:26 Warning: CPushNotificationsModule [Thread 1328]: Failed to configure EPNS resource (retrying in 21600 seconds): Error calling PNS API 'PnsRegisterClient' (return code = 19108)
2019-02-11 21:49:39 Error: CUpdatesModule [Thread 16dc]: PerformUpdate: Module update failed with error: Could not connect to server. (error code 8449)
2019-02-11 23:49:39 Error: CUpdatesModule [Thread 131c]: PerformUpdate: Module update failed with error: Could not connect to server. (error code 8449)
2019-02-12 01:49:39 Error: CUpdatesModule [Thread 1448]: PerformUpdate: Module update failed with error: Could not connect to server. (error code 8449)
2019-02-12 03:13:31 Warning: CPushNotificationsModule [Thread c5c]: Failed to configure EPNS resource (retrying in 21600 seconds): Error calling PNS API 'PnsRegisterClient' (return code = 19108)
2019-02-12 03:15:07 Error: CReplicationModule [Thread 810]: CAgentReplicationManager: Replication finished unsuccessfully with message: SendRequestAndHandleResponse: Rpc message response INTERNAL_ERROR. Error message: Replication details: [Task: CReplicationConsistencyTask, Scenario: Automatic replication (REGULAR), Connection: serverFQDN:2222, Connection established: true, Replication inconsistency detected: false, Server busy state detected: false, Realm change detected: false, Realm uuid: a93b727c-a591-43e1-9690-400475f29f4e, Sent logs: 0, Cached static objects: 63, Cached static object groups: 9, Static objects to save: 0, Static objects to delete: 0, Modified static objects: 0]
2019-02-12 03:45:07 Error: CReplicationModule [Thread 810]: InitializeConnection: Initiating replication connection to 'host: "serverFQDN" port: 2222' failed with: Request: Era.Common.Services.Replication.CheckReplicationConsistencyRequest on connection: host: "serverFQDN" port: 2222 with proxy set as: Proxy: Connection: :3128, Credentials: Name: , Password: ******, Enabled:0, EnabledFallback:1, failed with error code: 14, error message: OS Error, and error details: 
2019-02-12 03:45:07 Warning: CReplicationModule [Thread 810]: InitializeConnection: Not possible to establish any connection (Attempts: 1)
2019-02-12 03:45:07 Error: CReplicationModule [Thread 810]: InitializeFailOverScenario: Skipping fail-over scenario (stored replication link is the same as current)
2019-02-12 03:45:07 Error: CReplicationModule [Thread 810]: CAgentReplicationManager: Replication finished unsuccessfully with message: InitializeConnection: Initiating replication connection to 'host: "serverFQDN" port: 2222' failed with: Request: Era.Common.Services.Replication.CheckReplicationConsistencyRequest on connection: host: "serverFQDN" port: 2222 with proxy set as: Proxy: Connection: :3128, Credentials: Name: , Password: ******, Enabled:0, EnabledFallback:1, failed with error code: 14, error message:  OS Error, and error details: Replication details: [Task: CReplicationConsistencyTask, Scenario: Automatic replication (REGULAR), Connection: serverFQDN:2222, Connection established: false, Replication inconsistency detected: false, Server busy state detected: false, Realm change detected: false, Realm uuid: a93b727c-a591-43e1-9690-400475f29f4e, Sent logs: 0, Cached static objects: 63, Cached static object groups: 9, Static objects to save: 0, Static objects to delete: 0, Modified static objects: 0]
2019-02-12 03:49:39 Error: CUpdatesModule [Thread bac]: PerformUpdate: Module update failed with error: Could not connect to server. (error code 8449)

SMC client agent Status.html

ERROR: InitializeConnection: Initiating replication connection to 'host: "ServerFQDN" port: 2222' failed with: Request: Era.Common.Services.Replication.CheckReplicationConsistencyRequest on connection: host: "ServerFQDN" port: 2222 with proxy set as: Proxy: Connection: :3128, Credentials: Name: , Password: ******, Enabled:0, EnabledFallback:1, failed with error code: 14, error message: OS Error, and error details:

  • Replication details: [Task: CReplicationConsistencyTask, Scenario: Automatic replication (REGULAR), Connection: ServerFQDN:2222, Connection established: false, Replication inconsistency detected: false, Server busy state detected: false, Realm change detected: false, Realm uuid: a93b727c-a591-43e1-9690-400475f29f4e, Sent logs: 0, Cached static objects: 63, Cached static object groups: 9, Static objects to save: 0, Static objects to delete: 0, Modified static objects: 0]
  • All replication attempts: 101

I have tried the following steps with no success:

  • Edit /etc/odbcinst.ini and, in the section [MySQL ODBC 8.0 Unicode Driver], add:
    Threading = 0
  • Restart

My.cnf
# For advice on how to change settings please see
# hxxp://dev.mysql.com/doc/refman/5.6/en/server-configuration-defaults.html
[mysqld]
# General configuration
innodb_buffer_pool_size = 3072M
join_buffer_size = 16M
sort_buffer_size = 2M
symbolic-links=0
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
# Enable big chunks for ESET Remote Administrator
max_allowed_packet=33M
# Enable big statement size for ESET Remote Administrator
innodb_log_files_in_group=50
innodb_log_file_size=128MB
# Enable longer locks timeout for ESET Remote Administrator
innodb_lock_wait_timeout=3600
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

 

> cat /etc/odbcinst.ini
[PostgreSQL]
Description=ODBC for PostgreSQL
Driver=/usr/lib/psqlodbcw.so
Setup=/usr/lib/libodbcpsqlS.so
Driver64=/usr/lib64/psqlodbcw.so
Setup64=/usr/lib64/libodbcpsqlS.so
FileUsage=1
UsageCount=2

[MySQL ODBC 8.0 Unicode Driver]
Driver=/usr/lib64/libmyodbc8w.so
UsageCount=2
Threading=0

[MySQL ODBC 8.0 ANSI Driver]
Driver=/usr/lib64/libmyodbc8a.so
UsageCount=2

It looks like I have found a solution (at least it has worked for a couple of hours).

1) To find the number of open file descriptors for a service, run:

lsof -p `pidof ERAServer | sed -re 's/ /,/g'` 

and count the lines

(lsof has to be installed first with "yum install lsof")

2) To find the file limit for a service, run "cat /proc/13011/limits" (where 13011 is the pid of the running process)

In my case there was a soft limit of 1024 for some services.

To change the limits:

3) Edit /etc/security/limits.conf

and add the following lines (you can change 8192 to whatever value you like):

*    soft    nofile 8192
*    hard    nofile 8192

4) This setting can be overridden for individual services, so check /usr/lib/systemd/system/SOME_SERVICE.service and /etc/systemd/system/SOME_SERVICE.service (where SOME_SERVICE is the service name)

and add LimitNOFILE=8192 to the [Service] section (you can change 8192 to whatever value you like):

[Service]
...
LimitNOFILE=8192

In my case I modified httpd.service and eraserver.service

5) Run:

systemctl daemon-reload

6) Restart the services or reboot the server for the changes to take effect.
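
The checks from the first two steps can also be done without lsof by reading /proc directly, which makes it easy to confirm after a restart that the new limit is really in effect. The following is only a rough sketch: the function names are made up for illustration, it assumes a Linux /proc filesystem, and it has to run as root to see the descriptors of another user's process.

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not part of ESMC.

# Count the open file descriptors of a process: /proc/<pid>/fd has one entry each.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit of a process (4th field of that row).
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Print one summary line per pid given on the command line.
report_fd_usage() {
    for pid in "$@"; do
        printf 'pid=%s open=%s soft=%s\n' \
            "$pid" "$(count_open_fds "$pid")" "$(soft_nofile_limit "$pid")"
    done
}

# Example (as root, process name taken from the lsof command above):
#   report_fd_usage $(pidof ERAServer)
```

If the "open" value keeps climbing toward the "soft" value between replication intervals, the limit is still too low for the number of agents.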

 

 

14 hours ago, Dmitry said:

We have ESMC v7.0.471.0 installed as a Virtual Appliance on Hyper-V Server.

Could you please provide more details about the appliance version you are using? Did you install it from scratch (i.e. downloaded version 7.0.72.0), or did you have an appliance deployed previously and only update the ESMC component installed in it? I am asking because we addressed this issue in the latest appliance (7.0.72.0).

12 hours ago, MartinK said:

Could you please provide more details about the appliance version you are using? Did you install it from scratch (i.e. downloaded version 7.0.72.0), or did you have an appliance deployed previously and only update the ESMC component installed in it? I am asking because we addressed this issue in the latest appliance (7.0.72.0).

The appliance was deployed on MS Windows Server 2016 Hyper-V in October 2018 from a zipped VHD. The VHD file in the zip archive has a modified date of 28/06/2018.

13 hours ago, Dmitry said:

The appliance was deployed on MS Windows Server 2016 Hyper-V in October 2018 from a zipped VHD. The VHD file in the zip archive has a modified date of 28/06/2018.

Thanks. That means your appliance (VHD) comes from release 7.0.66.0, not the latest 7.0.72.0, where we addressed this issue.

We decided to resolve this issue by changing the default limits for all systemd services. This can be done with the following commands:

sed -i "s/.*DefaultLimitNOFILE=.*/DefaultLimitNOFILE=65535 /" /etc/systemd/system.conf
sed -i "s/.*DefaultLimitNOFILE=.*/DefaultLimitNOFILE=65535 /" /etc/systemd/user.conf

Modifying the service file (eraserver.conf) won't "survive" an ESMC upgrade: the file will be replaced with the version bundled in the installer.
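
Before running the sed commands above with -i, it can be worth previewing what they would change. The sketch below applies the same substitution to a file without writing to it; the function name is hypothetical and only there for illustration. Note that the pattern only rewrites lines that already contain DefaultLimitNOFILE= (commented out or not), so if the file has no such line, nothing is changed.

```shell
#!/bin/sh
# Sketch only: preview_default_nofile is a hypothetical helper name.

# Print the given file with the DefaultLimitNOFILE line rewritten,
# leaving the file itself untouched (no -i).
preview_default_nofile() {
    conf_file="$1"
    limit="$2"
    sed "s/.*DefaultLimitNOFILE=.*/DefaultLimitNOFILE=${limit}/" "$conf_file"
}

# Example: show only the line that would be written to system.conf.
#   preview_default_nofile /etc/systemd/system.conf 65535 | grep DefaultLimitNOFILE
```

If the preview prints no DefaultLimitNOFILE line at all, the setting would have to be added to the file by hand instead. A reboot afterwards is the conservative way to make sure the new default is picked up.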
 

 

6 hours ago, MartinK said:

We decided to resolve this issue by changing the default limits for all systemd services. This can be done with the following commands:


sed -i "s/.*DefaultLimitNOFILE=.*/DefaultLimitNOFILE=65535 /" /etc/systemd/system.conf
sed -i "s/.*DefaultLimitNOFILE=.*/DefaultLimitNOFILE=65535 /" /etc/systemd/user.conf

Modifying the service file (eraserver.conf) won't "survive" an ESMC upgrade: the file will be replaced with the version bundled in the installer.

Thanks a lot.

 
