@RTGLW How often are you pulling data from the switch via SNMP?
1 minute is the shortest interval we recommend; a slightly longer interval would be better.
The little CPU that handles things like SNMP can get overloaded, or slow down the more important services it runs for the switch core, such as LACP, RSTP, and all the other services.
v1.5.22 Bug Reports and Comments
-
sirhc - Employee
- Posts: 7601
- Joined: Tue Apr 08, 2014 3:48 pm
- Location: Lancaster, PA
- Has thanked: 1673 times
- Been thanked: 1357 times
Re: v1.5.22 Bug Reports and Comments
Support is handled on the Forums not in Emails and PMs.
Before you ask a question, use the Search function to see if it has been answered before.
To do an Advanced Search click the magnifying glass in the Search Box.
To upload pictures click the Upload attachment link below the BLUE SUBMIT BUTTON.
Re: v1.5.22 Bug Reports and Comments
We have updated several of our WS units to 1.5.22 and have no issues to report with how they are operating.
One issue we have seen is with the Netonix Manager: once a switch is updated to 1.5.22, we are unable to access its web UI from Netonix Manager. We select the switch and click the globe icon, and we get a "403 Forbidden" message instead of the UI screen.
We were at version 1.5.16 prior to the update to 1.5.22, and the switches we still have at that version can still be accessed from Netonix Manager.
The status of switches at 1.5.22 still updates in Netonix Manager; it is only access to the UI that appears broken.
This issue is minor, as we can go directly to the IP address of the switch and log in, but if it is an easy fix it would be nice to have the convenient button again.
- RTGLW
- Member
- Posts: 26
- Joined: Thu Jun 08, 2023 7:25 pm
- Location: New Zealand
- Has thanked: 25 times
- Been thanked: 14 times
Re: v1.5.22 Bug Reports and Comments
sirhc wrote: @RTGLW How often are you pulling data from switch via SNMP?
1 minute is the shortest time interval we recommend, a little longer time span would be better.
That little CPU that handles these things like SNMP can get over loaded or slow down more important services it does for the switch core like LACP, RSTP, and all other services.
Similar to Dawizman, we poll switches twice every 60s on average, as our primary and redundant Prometheus nodes each poll every 60s independently. Granted, this is twice as frequent as you've recommended, but I can't say we've run into any issues with this setup previously. That said, we run very light in terms of services enabled on our switches, e.g. no LACP/LAG, QoS, or Discovery tab, and otherwise we only enable HTTPS, SSH, Syslog, and NTP. So mileage may vary with other configurations...
Additionally, when I performed a WALK on the switch from my MIB browser, our monitoring node was able to collect SNMP data from the moment I ran the WALK, as if I could manually prompt SNMP collection. An example using board temps hopefully visualizes this better. This is a lab switch, so there is no traffic going across it, but it's configured the same as our production switches.
As mentioned, disabling and re-enabling the SNMP server fixed our issue, as did a reboot. So far I've been unable to replicate the issue once it has been resolved by either of those methods. I can PM syslogs from the above example switch upgrade if needed.
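To put a number on the "twice every 60s" figure, here is a minimal sketch in Python that works out the average interval the switch sees when several pollers run independently. The function name and the 60 s threshold constant are just illustrative names; the only inputs taken from this thread are the two 60 s pollers and the 1 minute recommendation.

```python
from fractions import Fraction

RECOMMENDED_MIN_INTERVAL_S = 60  # shortest interval recommended in this thread


def effective_interval(poller_intervals_s):
    """Average seconds between polls seen by the switch across all pollers."""
    polls_per_second = sum(Fraction(1, i) for i in poller_intervals_s)
    return float(1 / polls_per_second)


# Two Prometheus nodes, each scraping every 60 s.
combined = effective_interval([60, 60])
print(f"effective interval: {combined:.0f} s")
print("within recommendation:", combined >= RECOMMENDED_MIN_INTERVAL_S)
```

This only describes the average rate; it says nothing about whether the two pollers happen to land at the same moment.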
-
sirhc - Employee
- Posts: 7601
- Joined: Tue Apr 08, 2014 3:48 pm
- Location: Lancaster, PA
- Has thanked: 1673 times
- Been thanked: 1357 times
Re: v1.5.22 Bug Reports and Comments
RTGLW wrote: sirhc wrote: @RTGLW How often are you pulling data from switch via SNMP?
1 minute is the shortest time interval we recommend, a little longer time span would be better.
That little CPU that handles these things like SNMP can get over loaded or slow down more important services it does for the switch core like LACP, RSTP, and all other services.
Similar to Dawizman, we poll switches twice every 60s on average as we poll from primary and redundant prometheus nodes every 60s individually. Granted this is twice as frequent as you've recommended, I can't say we've run into any issues with this setup previously. In saying that though, we run very light in terms of services enabled on our switches. E.g: No LACP/LAG, QoS, or discovery tab and otherwise only enable HTTPS, SSH, Syslog, and NTP. So other configuration's mileage may vary...
Additionally; When I performed a WALK on the switch from my MIB browser, it allowed our monitoring node to collect SNMP data from the point in time I ran the WALK. Like I could manually prompt SNMP collection. Example using board temps, hopefully visualizes the above better. This is a lab switch, so no traffic going across it but it's configured the same as our production switches.
As mentioned, disabling and re-enabling the SNMP server fixed our issue, as did a reboot. So far I've been unable to replicate the issue once resolved by either of those methods. Can PM syslogs from the above example switch upgrade if needed.
You know, we never tested or even thought about the switches being polled by two SNMP-type servers. I wonder what happens if they both hit at the same time?
And yes, this is hitting it pretty often, more than we recommend obviously. I'm worried about the small CPU in there, though that could be fine and the cause could be something else.
As a test, could you query it from only one node, starting at 2 minutes and then decreasing to 1 minute, just for shits and giggles?
Not saying we won't investigate and/or come up with a solution, but this would help.
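As a rough way to reason about the "both hit at the same time" question, the sketch below (Python) lists the poll start times of two pollers over a window and counts how many land within a few seconds of each other. The offsets, the 5 s tolerance, and the ten minute window are all hypothetical values chosen for illustration; it does not model how any particular poller actually schedules its requests.

```python
def poll_times(interval_s, offset_s, window_s):
    """Start times (seconds) of each poll within the window."""
    return list(range(offset_s, window_s, interval_s))


def near_collisions(a_times, b_times, tolerance_s):
    """Pairs of polls from the two pollers that start within tolerance_s of each other."""
    return [(a, b) for a in a_times for b in b_times if abs(a - b) <= tolerance_s]


window = 600  # ten minutes
primary = poll_times(60, 0, window)

# Unlucky case: both pollers fire at the same moment.
aligned = near_collisions(primary, poll_times(60, 0, window), tolerance_s=5)

# Staggered case: the second poller is shifted by half the interval.
staggered = near_collisions(primary, poll_times(60, 30, window), tolerance_s=5)

print(len(aligned), len(staggered))
```

With a single poller, as suggested for the test, there is nothing to collide with, which is part of why that test would help isolate the cause.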
-
Stephen - Employee
- Posts: 1073
- Joined: Sun Dec 24, 2017 8:56 pm
- Has thanked: 99 times
- Been thanked: 202 times
Re: v1.5.22 Bug Reports and Comments
sakita wrote: I modified the time on my lab NTP server and then tried it...
/etc/init.d/ntp restart
stopped process in pidfile '/var/run/ntp' (pid 826)
...time was immediately set to match the NTP server. Cool.
After that I changed the NTP server time and tried the disable NTP / save / enable NTP / save sequence, and it did not change the Netonix time.
Again, it does set the time correctly on startup or reboot as long as the NTP server is up and accessible over the network (and it seems to try it enough times to work fine in general). However, when it needs to be done manually, resyncing is currently a command prompt activity (or wait 24 hours).
I checked 5 field switches that have been running 1.5.21 since last week and they all had accurate time and their logs showed "admin: sync time via ntp" log events once each day so that is working as intended as well.
This is what I've confirmed in my testing as well.
I'm working on a solution so that a disable/enable cycle immediately triggers NTP without needing to invoke the script.
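For anyone who wants to confirm the once-a-day sync on their own switches, here is a minimal Python sketch that counts the "admin: sync time via ntp" events per day in an exported log. The message text comes from the post quoted above; the timestamp layout and the sample lines are made up for illustration and may not match the real log format.

```python
from collections import Counter

SYNC_MARKER = "admin: sync time via ntp"  # message text quoted in this thread


def syncs_per_day(log_lines):
    """Count NTP sync events per day, keyed by the first two fields (e.g. 'Jun 10')."""
    counts = Counter()
    for line in log_lines:
        if SYNC_MARKER in line:
            day = " ".join(line.split()[:2])
            counts[day] += 1
    return counts


# Made-up sample lines; the timestamp layout is an assumption.
sample = [
    "Jun 10 03:00:01 admin: sync time via ntp",
    "Jun 10 09:15:42 admin: some other event",
    "Jun 11 03:00:02 admin: sync time via ntp",
]
print(dict(syncs_per_day(sample)))
```

A day missing from the result, or a day with a count other than one, would be worth a closer look.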
- oeyre
- Member
- Posts: 39
- Joined: Mon Feb 05, 2024 1:38 am
- Location: Australia
- Has thanked: 0 time
- Been thanked: 13 times
Re: v1.5.22 Bug Reports and Comments
We just found our first issue with traffic being eaten...
Model: WS-12-250-AC
Port: Port13
Port speed: 1G
SFP: FIBERSTORE SFP-10G-DAC (don't ask)
Normal IP traffic working fine, PPPoE not. Learning MACs in the affected VLAN from PoE/radio ports, not from the SFP port. Switch on the other side of the SFP not learning MACs from the affected VLAN on that port (other VLANs OK).
Tried the following on the Netonix without success:
-Remove and add ports from the VLAN
-Remove and add the entire VLAN
-Change the position of the VLAN in the list
Then I configured an IP address on the VLAN to communicate with my test IP upstream and this suddenly started the MAC learning, and the PPPoE sessions came up.
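The symptom described above (MACs learned on the PoE/radio ports but not on the SFP port for one VLAN) can be checked mechanically if you can export the MAC table. This is a minimal Python sketch under that assumption; the table layout, field names, and sample entries are hypothetical and not taken from the switch.

```python
def vlans_missing_on_port(mac_table, port, expected_vlans):
    """Return the expected VLANs for which no MAC address was learned on the port."""
    learned = {entry["vlan"] for entry in mac_table if entry["port"] == port}
    return sorted(set(expected_vlans) - learned)


# Made-up table: VLAN 100 is learned only on a radio port, VLAN 200 on both.
mac_table = [
    {"mac": "00:11:22:33:44:01", "vlan": 100, "port": 3},
    {"mac": "00:11:22:33:44:02", "vlan": 200, "port": 3},
    {"mac": "00:11:22:33:44:03", "vlan": 200, "port": 13},
]
print(vlans_missing_on_port(mac_table, port=13, expected_vlans=[100, 200]))
```

Running the same check before and after assigning an IP to the VLAN would show whether learning on the SFP port starts at that point.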
-
Stephen - Employee
- Posts: 1073
- Joined: Sun Dec 24, 2017 8:56 pm
- Has thanked: 99 times
- Been thanked: 202 times
Re: v1.5.22 Bug Reports and Comments
oeyre wrote: We just found our first issue with traffic being eaten...
Model: WS-12-250-AC
Port: Port13
Port speed: 1G
SFP: FIBERSTORE SFP-10G-DAC (don't ask)
Normal IP traffic working fine, PPPoE not. Learning MACs in the affected VLAN from PoE/radio ports, not from the SFP port. Switch on the other side of the SFP not learning MACs from the affected VLAN on that port (other VLANs OK).
Tried the following on the Netonix without success:
-Remove and add ports from the VLAN
-Remove and add the entire VLAN
-Change the position of the VLAN in the list
Then I configured an IP address on the VLAN to communicate with my test IP upstream and this suddenly started the MAC learning, and the PPPoE sessions came up.
We've been trying to hunt this one down for a while; you may have just figured out the differentiating factor here.
Can anyone else, whether you are experiencing this issue or not, chime in and let us know if you have an IP address assigned to the affected VLAN that carries the PPPoE traffic?
Also, did any of your other VLANs that weren't exhibiting the problem have IPs assigned to them?
One more question, to help with one of my own working theories: is DHCP Snooping enabled on any of your ports?
-
yahel - Member
- Posts: 99
- Joined: Wed May 27, 2015 12:07 am
- Location: Berkeley, CA
- Has thanked: 22 times
- Been thanked: 22 times
Re: v1.5.22 Bug Reports and Comments
Yes - we have the same problem -- for us it kills OSPF multicasts (probably all multicasts - but that's what is easy for us to notice).
Only on SFP ports with tagged VLANs - trunks (but not in all cases, or not always, can't figure out when).
(I suspect it might also have something to do with LAG - in most cases these are LAG members).
We do have IPs for VLANs (AKA "watchdog IPs").
Also - seen the NTP issue.
/etc/init.d/ntp restart solves the problem (but not via the GUI).
-
Stephen - Employee
- Posts: 1073
- Joined: Sun Dec 24, 2017 8:56 pm
- Has thanked: 99 times
- Been thanked: 202 times
Re: v1.5.22 Bug Reports and Comments
Hi yahel,
By any chance, are the trunked VLANs on the SFP ports using an assigned IP?
If they are, does removing the IP from that VLAN and adding it back make a difference?
-
yahel - Member
- Posts: 99
- Joined: Wed May 27, 2015 12:07 am
- Location: Berkeley, CA
- Has thanked: 22 times
- Been thanked: 22 times
Re: v1.5.22 Bug Reports and Comments
Yes - the trunked VLANs are using an assigned IP (watchdog IP).
We currently have the interface that is giving us a hard time (P14) disabled (it's a LAG member with two other members - one is P13-SFP and the other P12-RJ45).
With it disabled, everything works fine -- when we enable it, the OSPF dies.
I'll ask Vivek from our team in India to temporarily disable the IPs on the VLANs tonight after 2am, and he'll see if that helps (I'll be asleep).
If it does make things work with P14 enabled, he'll try to re-enable the IPs, one by one, to see if that can teach us anything.
Thanks!