Task timeouts



I have an agent upgrade task that has been failing more frequently lately. The ERA console only indicates that the task is timing out on multiple workstations. I'm not finding much information to troubleshoot this.

 

Which specific logs would best indicate the issue and where are they located?


  • ESET Staff

The most probable cause of a "blocked" upgrade task that we have encountered recently is a problem with AGENT shutdown. This may be caused by other tasks that block the service shutdown, and also by processing a huge amount of logs. Any chance you enabled diagnostic logs in the security product? A task is considered timed out after 60 minutes, if I recall correctly. Have you checked whether the AGENT is upgraded even after the task is marked as failed on timeout?

If the upgrade process reached the upgrade of the AGENT itself, there should be a specific MSIEXEC log located either in the standard trace log folder or in the temporary directory. This log may show when the upgrade started and at what time it was either interrupted or successfully finished. Unfortunately, I am currently not able to check its exact name.


Thanks for the reply.

 

It appears the task is starting on the ERA server. The client task shows Starting and Starting Task, but there is no info in the client's trace log and no MSI log. It appears that the task just isn't received by the clients.

 

I've verified the clients' info: correct name, IP address, previous agent version. They show in the ERA console properly and have good connectivity. status.html shows all good information, and there are no errors in the trace.log files.

 

Anything specific I can check on the server to tell why the commands/tasks are not received?


  • ESET Staff

If you see "Task started" or similar state logs in a specific client's task execution history, the task was properly transferred and started on the client machine, as these state reports are created on the client machine.

I recommend searching for the log I mentioned previously. In the meantime I found out that it should be named ra-upgrade-infrastructure.log and is most probably located in %TEMP% of the system account (the AGENT service credentials), not the current user's. Please search the whole C:\ drive on the client machine for this log to be sure.
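For anyone following along, the whole-drive search could be scripted roughly like this. It is only a minimal sketch: the file name comes from the post above, while the function name `find_upgrade_logs` and the case-insensitive match are assumptions made for illustration.

```python
import os
from pathlib import Path

# File name as given in the post above.
LOG_NAME = "ra-upgrade-infrastructure.log"


def find_upgrade_logs(root, name=LOG_NAME):
    """Walk `root` and return every file whose name equals `name` (case-insensitive)."""
    matches = []
    # os.walk skips directories it cannot read by default (onerror=None),
    # which is convenient when scanning a whole system drive.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower() == name.lower():
                matches.append(Path(dirpath) / filename)
    return matches
```

Calling `find_upgrade_logs("C:\\")` from an elevated prompt on the client should list any copies. An empty list would suggest that the upgrade never reached the installer stage.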


The ra-upgrade* file does not exist on the system(s). Nor does the msiexec log.  Windows Event Viewer shows no attempts to stop the Remote Agent service. Diagnostic logging is not enabled.

 

I also verified that the agent has not been upgraded and is still the previous version.

 

The task is timing out after exactly one hour.

 

I cleared out the Windows\Temp directory entirely and restarted the task. It is not generating any files there at all. I'd expect the agent msi file plus msi logs.

 

It appears the task is not starting at all on the client.


  • ESET Staff

In this case I would recommend enabling full trace logging on this AGENT via a configuration policy or, even better, by creating a dummy traceAll file (see details in the documentation), then restart. Once the upgrade task is restarted, search for "Upgrade" in trace.log; you should see the task execution progress. Basically, this task will attempt to download the AGENT installer from ESET repositories (if it finds a suitable version) and run a silent upgrade. It is possible that the download operation is stuck, or that there is an unhandled error causing a "silent" failure not detected by the timeout handler.

Before you restart the upgrade task, please verify that there is no msiexec.exe running on the client machine. This may also block the installation, as Windows does not allow multiple MSI installations in parallel.
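The search for "Upgrade" in trace.log can be done with any text tool. As a rough Python sketch, the helper below returns the matching lines with their line numbers. The function name `find_lines`, the case-insensitive matching and the UTF-8 decoding are assumptions for illustration, and the log path is left as a parameter because its location depends on the installation.

```python
def find_lines(log_path, keyword="Upgrade"):
    """Return (line_number, text) for every line of the log that contains `keyword`."""
    hits = []
    # errors="replace" keeps the scan going if the log holds bytes that are
    # not valid UTF-8 (the encoding itself is an assumption).
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if keyword.lower() in line.lower():
                hits.append((number, line.rstrip("\n")))
    return hits
```

If this returns nothing after the task has been restarted with full tracing enabled, that would be consistent with the task never starting on the client.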
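The msiexec.exe check can be automated with the built-in Windows `tasklist` command. The following is a sketch only: the function names are made up for this example, and the parser assumes the CSV layout that `tasklist /FO CSV /NH` prints (image name first, PID second).

```python
import csv
import io
import subprocess


def parse_tasklist_csv(output):
    """Extract PIDs from `tasklist /FO CSV /NH` output."""
    pids = []
    for row in csv.reader(io.StringIO(output)):
        # A matching row looks like: "msiexec.exe","1234","Services","0","5,120 K"
        # The "INFO: No tasks are running..." message has a single field and is skipped.
        if len(row) >= 2 and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def running_msiexec_pids():
    """Return PIDs of msiexec.exe processes on this machine (Windows only)."""
    result = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq msiexec.exe", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tasklist_csv(result.stdout)
```

An empty list from `running_msiexec_pids()` means no installer process is running at that moment, so the upgrade task should not be blocked by another MSI installation.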


I've tried a number of agent-based tasks (clone reset, deploy agent, upgrade agent, etc.) and all simply time out. Based on the logs on the clients, the task is not even hitting them.

 

Is there nothing logged on the server-side that would indicate the issue?


  • ESET Staff

Is this AGENT actually connecting to the SERVER? Is the last connection time updated in the main clients view (or in this specific client's details)?

 

I recommend using the "Export configuration" client task, which can be created automatically from the client's configuration view (see the Configuration subsection of the related documentation article). This task can almost never fail and its execution is fast. The tasks you used are quite special, and AGENT deployment is not even a client task. Also, the progress of the reset cloned agent task will almost never be correct, because the AGENT changes its identity during execution. In addition, all the mentioned tasks require the AGENT service to be restarted, so they may all be blocked on this, which is also the most probable reason why they are timing out.

 

You can also try enabling full trace logging (see the documentation), running the problematic client task, and sending me the trace.log from the client privately (PM) for further analysis.


Yes, the agent is definitely connecting, as it's running other tasks (definition updates) and reporting the last connected time. Plus it's pulling proper system info, such as the application versions (so I know the agent needs to be upgraded), hardware manufacturer, etc.

 

The Export config task runs, but nothing else, apparently.


So I've enabled full trace logging, but still nothing relevant is being logged. I can see the start module, applied policies info, but no errors.

 

From what I gather, the RA service is not working fully. These systems get updates, policies, etc. but won't run other tasks and don't even log those attempts. Restarting the service clears the issues and allows tasks to finally be run.  This is impacting a number of workstations, and making things fairly unreliable, since necessary tasks won't run until a system is restarted or we manually intervene.

 

On another note, I really feel there needs to be better logging and it needs to be centralized. Individual workstations may be offline, out of the office, etc. and jumping from one client to another, scrolling through logs (that in my case had no useful info) is tedious and time consuming. The purpose of the ERA server is for central management, so it makes sense that one should be able to do relevant troubleshooting from a central location.


  • ESET Staff

Does this problem return after a service restart? From your description it is clear that a task processed by the same "module" was stuck in execution, and therefore other tasks were not processed. For example, tasks from the "Operating system" category are executed sequentially due to MSI installer limitations. If this was the case, what was the first task that failed with a timeout on the client? We have received reports of a similar issue with the upgrade infrastructure task, but in your case it seems that it has not even started, which is why I assume one of the previously scheduled tasks blocked execution.

 

If this problem happens again, we would appreciate it if you could provide us with a "minidump" of the ERAAgent.exe process from the client machine (see How to create minidump using Process Explorer) so that we can check what type of task is stuck and possibly also the reason why. Otherwise we can only guess, as almost all client tasks from this category could possibly block.


"Does this problem return after a service restart?"

In all cases, clients are connecting to ERA server, updating connection info, reporting correctly, updating virus DBs, etc. Just not running any tasks sent from the ERA console.

 

In all cases, trying to stop the ERA agent results in the service hanging in a 'stopping' state. I then have to use pskill to terminate the service successfully. The service restarts on its own, but I've found I have to manually stop it and restart it again for any tasks to run. After killing the process it stops and starts manually without issues and I can rerun the upgrade task successfully.
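The manual workaround above could be scripted along these lines. This is a sketch under several assumptions: it swaps pskill for the built-in `taskkill` command, the service name is passed in as a parameter because it is not given in this thread, the function names are hypothetical, and the parser expects the English-language output of `sc query`.

```python
import re
import subprocess

# Process name mentioned earlier in the thread.
AGENT_IMAGE = "ERAAgent.exe"


def parse_service_state(sc_output):
    """Return the state name (e.g. RUNNING, STOP_PENDING) from `sc query` output."""
    # The relevant line looks like: "STATE              : 3  STOP_PENDING"
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def kill_if_stuck(service_name, image_name=AGENT_IMAGE):
    """Force-kill the agent process only if its service hangs in STOP_PENDING."""
    query = subprocess.run(
        ["sc", "query", service_name], capture_output=True, text=True
    )
    if parse_service_state(query.stdout) == "STOP_PENDING":
        # taskkill is used here in place of pskill (an assumption of this sketch).
        subprocess.run(["taskkill", "/F", "/IM", image_name], check=True)
        return True
    return False
```

Since the service restarts on its own after the kill, the manual stop and start described above would still be needed before rerunning the task.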

 

I have a dynamic group set up to catch all Windows systems that have agent installed with version < 6.3, then have Remote Agent Upgrade Task run automatically on these. Some run fine, but many fail.

 

I will try to generate some minidump files.
