BroCtl status/top take an excessive amount of time


After running a large Bro cluster for a few days on a FreeBSD system (FreeBSD 10.1, 28 physical nodes, 81 worker processes), broctl actions that interact with all nodes take an excessive amount of time (more than 2 minutes for a broctl status). This was not the case right after starting up the cluster.

If there is any way I can help with more information, please let me know what to do.




Johanna Amann
March 25, 2015, 7:37 PM

I looked into this a bit more, and it seems that two nodes were very slow to reply and potentially ran into a timeout. That is not obvious from the status output at the moment (unless I completely missed it); perhaps we should add an indication of it.

Johanna Amann
March 26, 2015, 2:46 PM

And even more detail: the cause of this was hardware problems on two nodes. The bro instances on these nodes were still kind-of-running, but I don't think they were communicating with the master anymore, and they were unkillable (even with kill -9), probably hanging while waiting for disk I/O (hard drive problems). Since you could still ssh into the nodes, and they worked normally unless you tried certain file system accesses, broctl apparently listed them as online without giving any indication of problems with the nodes, besides the fact that "status" takes a long time.

Daniel Thayer
March 27, 2015, 9:39 PM

I'm not seeing a problem. As a test, I simulated a slow node by adding a "sleep"
command to one of the scripts that broctl runs on the remote host.
If the sleep is long enough to exceed the timeout, then I see "???" in the status
output (in the "Running", "Peers", and "Started" columns).
Otherwise, broctl status simply gathers information reported by Bro.
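
For illustration, the behavior described above (a per-node command that reports "???" once it exceeds the timeout) can be sketched in Python. This is only a sketch and not broctl's actual code: `probe_node` and `fake_node` are hypothetical names, and the slow node is simulated with a local sleep rather than a script run on a remote host.

```python
import subprocess
import sys


def probe_node(cmd, timeout_s):
    """Run cmd and return its stripped stdout, or "???" if it exceeds timeout_s.

    Hypothetical helper: it mimics the "???" placeholder seen in the status
    output, but is not taken from broctl.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return "???"
    return result.stdout.strip()


def fake_node(delay_s):
    """Build a command that stands in for a helper script on a node.

    Assumption: sleeping locally is a good enough stand-in for the "sleep"
    added to the remote script in the test described above.
    """
    code = f"import time; time.sleep({delay_s}); print('running')"
    return [sys.executable, "-c", code]


if __name__ == "__main__":
    # A responsive node answers well inside the timeout.
    print(probe_node(fake_node(0), timeout_s=10))
    # A slow node exceeds the timeout and is reported as unknown.
    print(probe_node(fake_node(5), timeout_s=1))
```

Note that the timeout here only bounds how long the caller waits for the local child process; it says nothing about whether a hung process on the node itself can be cleaned up.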

Robin Sommer
April 3, 2015, 6:31 PM

Set the timeout to 30s and make it configurable; revisit later when Broker is there.

Daniel Thayer
April 16, 2015, 9:29 PM

Branch topic/dnthayer/ticket1353 in the broctl repo contains the fix for this issue.

Robin Sommer
April 21, 2015, 2:25 AM

This has been merged already.






