Trying to monitor fan failure

Posted: Tue Sep 07, 2021 11:12 pm
by jpmoller
Hi team,

We recently received a Netonix switch that had a fan missing when taken out of the box (the RMA went fine). As part of that discussion, we are looking to implement monitoring that alerts us when a fan has failed.

I can see that there are OIDs for monitoring the speed of the fan(s), but I cannot see anything for the state of the fans. If there is an error, does the speed OID give a negative value? Or are there traps for this type of thing?
I can see in the web UI that you can test the fans, so can the results from the diagnostics be extracted at all?



Thanks,
JP

Re: Trying to monitor fan failure

Posted: Wed Sep 08, 2021 4:52 pm
by jpmoller
From what I can find in the forums, the switches do not have traps, so I guess that isn't going to work.

Is there a way for me to run the "Run Fan Test" from the CLI and interpret the results from the file "www/fan_test_results"? I'm not sure what the contents mean: there are 4 entries that say "LEG" in the file, but only 3 fans in the switch. Presumably one is an overall pass/fail?

Re: Trying to monitor fan failure

Posted: Wed Sep 08, 2021 5:17 pm
by Stephen
I think there is a misunderstanding about what the fan test actually does on a Netonix switch.

This link goes into a bit of detail:

To summarize the link though: a while back, the fans we were using became obsolete and we could no longer acquire them.
The manufacturer provided an alleged drop-in replacement fan that did not work with the settings used previously.
So we had to implement a test in the firmware to check which type of fan is installed (named "LEG" for legacy, or "NEW") and change the settings accordingly.

The test in the UI is there for when you install replacement fans. You don't actually have to run it yourself: if the switch tries to run the fans and fails, it will eventually re-run the test automatically to determine whether it is using the right settings.

Also, yes, the fourth (or technically first) value in the file is the overall result (options are "LEG", "NEW", "MIX", or "ERR").
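If you do want to read that file from a script, here is a rough Python sketch of how it could be interpreted. It is not Netonix code. It assumes the file holds the four values as plain tokens with the overall result first, and it assumes the per-fan entries use the same labels; the actual file layout is not described in this thread. The function name parse_fan_test_results is just a placeholder.

```python
import re

# Labels mentioned in this thread for the overall result.
KNOWN_RESULTS = {"LEG", "NEW", "MIX", "ERR"}


def parse_fan_test_results(text):
    """Return (overall, per_fan) from the contents of fan_test_results.

    Assumption: the first recognised token is the overall result and the
    remaining tokens are the per-fan results. The real file format may differ.
    """
    tokens = [t for t in re.findall(r"[A-Za-z]+", text) if t.upper() in KNOWN_RESULTS]
    tokens = [t.upper() for t in tokens]
    if not tokens:
        raise ValueError("no recognised fan test result found")
    return tokens[0], tokens[1:]


def needs_attention(overall):
    """Treat only an overall "ERR" as a problem; "MIX" just means both fan types."""
    return overall == "ERR"
```

For example, a file containing four "LEG" entries would come back as an overall "LEG" plus three per-fan "LEG" values. Keep in mind the test identifies the fan type, so a clean result here is not a general health check.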

To monitor the fan states there are really 2 options. One would be to set up a remote syslog server and configure the switch to forward logs to it. Any time an error occurs with the fans it will be logged in syslog, so you could potentially monitor it that way.
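A minimal sketch of the syslog approach in Python, using only the standard library. The assumptions are mine, not anything from the firmware: that fan errors can be recognised by the word "fan" in the log line (the exact message text is not given here), and that the switch forwards logs over UDP to port 5514 on the monitoring host. The names is_fan_message, SyslogHandler, and serve are placeholders.

```python
import re
import socketserver

# Assumption: fan-related log lines contain the word "fan" or "fans".
# Adjust the pattern once you have seen the real messages from your switch.
FAN_PATTERN = re.compile(r"\bfans?\b", re.IGNORECASE)


def is_fan_message(line):
    """Return True if a syslog line looks fan-related."""
    return bool(FAN_PATTERN.search(line))


class SyslogHandler(socketserver.BaseRequestHandler):
    """Handle one UDP syslog datagram and print an alert for fan messages."""

    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        line = data.decode("utf-8", errors="replace").strip()
        if is_fan_message(line):
            print(f"FAN ALERT from {self.client_address[0]}: {line}")


def serve(host="0.0.0.0", port=5514):
    """Listen for forwarded syslog datagrams until interrupted.

    Port 5514 is an arbitrary unprivileged choice; standard syslog uses 514.
    """
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

Calling serve() starts the listener. In practice you would more likely point the switch at an existing syslog daemon such as rsyslog and alert on matching lines there, replacing the print call with whatever notification your monitoring system uses.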

The other (probably easier) method would be to use the Netonix Manager and monitor your switch with it. Whenever an important error occurs on a Netonix switch, it is "highlighted" in the logs. The Netonix Manager is able to detect this and the switch's indicator will turn yellow. If you click on it, you can then see the specific log (which would include a fan error) for that switch.

I hope that's helpful.

Re: Trying to monitor fan failure

Posted: Wed Oct 06, 2021 3:20 pm
by jpmoller
Thanks Stephen, response and potential solutions are much appreciated!