v1.5.22 Bug Reports and Comments

Dawizman
Experienced Member
 
Posts: 160
Joined: Fri Jul 03, 2015 4:11 pm
Location: Cold Lake, AB - CANADA
Has thanked: 17 times
Been thanked: 26 times

Re: v1.5.22 Bug Reports and Comments

Tue Dec 10, 2024 7:14 pm

Stephen wrote:
Dawizman wrote:...

Code: Select all
admin@OpenWrt:/www# ps aux |grep snmp
 907 admin 5324 S /usr/sbin/snmpd -Lf /dev/null -p /var/run/snmpd.pid
 1637 admin 2680 S grep snmp


I also did an SNMP walk to confirm that the counters have stopped. These two walks were run about 5 minutes apart:
Code: Select all
C:\Users\Justin>C:\usr\bin\snmpbulkwalk.exe -v 2c -c <hidden> 10.21.2.10 .1.3.6.1.2.1.31.1.1.1.6
IF-MIB::ifHCInOctets.1 = Counter64: 13836532
IF-MIB::ifHCInOctets.2 = Counter64: 3887397789
IF-MIB::ifHCInOctets.3 = Counter64: 0
IF-MIB::ifHCInOctets.4 = Counter64: 0
IF-MIB::ifHCInOctets.5 = Counter64: 209658652
IF-MIB::ifHCInOctets.6 = Counter64: 102408443
IF-MIB::ifHCInOctets.7 = Counter64: 94443216
IF-MIB::ifHCInOctets.8 = Counter64: 102612383
IF-MIB::ifHCInOctets.9 = Counter64: 103104744
IF-MIB::ifHCInOctets.10 = Counter64: 72973599
IF-MIB::ifHCInOctets.11 = Counter64: 0
IF-MIB::ifHCInOctets.12 = Counter64: 0
IF-MIB::ifHCInOctets.13 = Counter64: 0
IF-MIB::ifHCInOctets.14 = Counter64: 0

C:\Users\Justin>C:\usr\bin\snmpbulkwalk.exe -v 2c -c <hidden> 10.21.2.10 .1.3.6.1.2.1.31.1.1.1.6
IF-MIB::ifHCInOctets.1 = Counter64: 13836532
IF-MIB::ifHCInOctets.2 = Counter64: 3887397789
IF-MIB::ifHCInOctets.3 = Counter64: 0
IF-MIB::ifHCInOctets.4 = Counter64: 0
IF-MIB::ifHCInOctets.5 = Counter64: 209658652
IF-MIB::ifHCInOctets.6 = Counter64: 102408443
IF-MIB::ifHCInOctets.7 = Counter64: 94443216
IF-MIB::ifHCInOctets.8 = Counter64: 102612383
IF-MIB::ifHCInOctets.9 = Counter64: 103104744
IF-MIB::ifHCInOctets.10 = Counter64: 72973599
IF-MIB::ifHCInOctets.11 = Counter64: 0
IF-MIB::ifHCInOctets.12 = Counter64: 0
IF-MIB::ifHCInOctets.13 = Counter64: 0


Throughput screenshots run about 20 minutes apart:
First: https://imgur.com/a/lHsFZzs
Second: https://imgur.com/a/FmjYxEt

Current uptime is ~4 days 1 hour. I would ballpark the counters stopped at almost exactly 3 days of uptime.
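
For anyone who wants to check this without eyeballing two walks, the comparison can be scripted. Below is a minimal sketch in Python, assuming the output format shown in the walks above (`IF-MIB::name.index = CounterNN: value`). `parse_walk` and `stalled_counters` are hypothetical helper names, not part of Net-SNMP or the Netonix firmware. Counters sitting at 0 are skipped because an unused port legitimately stays at 0.

```python
import re

# Matches lines such as "IF-MIB::ifHCInOctets.2 = Counter64: 3887397789".
WALK_LINE = re.compile(r"^(\S+)\s*=\s*Counter(?:32|64):\s*(\d+)\s*$")


def parse_walk(text):
    """Return {oid_name: value} for every counter line in a walk's output."""
    counters = {}
    for line in text.splitlines():
        match = WALK_LINE.match(line.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def stalled_counters(first, second):
    """List counters that are non-zero and identical in both walks."""
    before, after = parse_walk(first), parse_walk(second)
    return sorted(
        name
        for name, value in before.items()
        if value != 0 and after.get(name) == value
    )


if __name__ == "__main__":
    walk_a = """IF-MIB::ifHCInOctets.1 = Counter64: 13836532
IF-MIB::ifHCInOctets.2 = Counter64: 3887397789
IF-MIB::ifHCInOctets.3 = Counter64: 0"""
    walk_b = walk_a  # identical second walk, as in the report above
    print(stalled_counters(walk_a, walk_b))
```

An unchanged value on a port that carries traffic is the signal here; on a genuinely idle port an unchanged counter proves nothing, so it helps to compare against a port known to be busy.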



I currently have a test running that is at about 20 hours of uptime. The only difference is that I'm querying 1.3.6.1.2.1.2.2.1.10, aka IF-MIB::ifInOctets.

Would you mind also checking this OID to make sure its counters have stopped as well?

Also, how often are you querying the system?



Can confirm that OID is also stalled.

Code: Select all
C:\Users\Justin>C:\usr\bin\snmpbulkwalk.exe -v 2c -c <hidden> 10.21.2.10 1.3.6.1.2.1.2.2.1.10
IF-MIB::ifInOctets.1 = Counter32: 13836532
IF-MIB::ifInOctets.2 = Counter32: 3887397789
IF-MIB::ifInOctets.3 = Counter32: 0
IF-MIB::ifInOctets.4 = Counter32: 0
IF-MIB::ifInOctets.5 = Counter32: 209658652
IF-MIB::ifInOctets.6 = Counter32: 102408443
IF-MIB::ifInOctets.7 = Counter32: 94443216
IF-MIB::ifInOctets.8 = Counter32: 102612383
IF-MIB::ifInOctets.9 = Counter32: 103104744
IF-MIB::ifInOctets.10 = Counter32: 72973599
IF-MIB::ifInOctets.11 = Counter32: 0
IF-MIB::ifInOctets.12 = Counter32: 0
IF-MIB::ifInOctets.13 = Counter32: 0
IF-MIB::ifInOctets.14 = Counter32: 0
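
One detail worth keeping in mind when comparing the two OIDs: ifInOctets is a Counter32 and wraps at 2^32 (4294967296), while ifHCInOctets is a Counter64. Port 2 above (3887397789) is already close to the 32-bit limit, so once the counters are moving again a poller has to handle wraparound when it computes deltas. A minimal sketch in Python follows; `counter_delta` is a hypothetical helper, and it assumes the counter wrapped at most once between the two samples.

```python
def counter_delta(previous, current, bits=32):
    """Octets counted between two samples of an SNMP counter.

    Assumes the counter wrapped at most once between the samples.
    """
    modulus = 1 << bits
    if not (0 <= previous < modulus and 0 <= current < modulus):
        raise ValueError("sample outside the counter's range")
    # Modular subtraction handles both the normal and the wrapped case.
    return (current - previous) % modulus


if __name__ == "__main__":
    # Stalled counter, as reported above: the delta is 0.
    print(counter_delta(3887397789, 3887397789))
    # Counter32 that wrapped past 2**32 between polls.
    print(counter_delta(4294967000, 704))
```

A delta of 0 on its own does not distinguish a stalled agent from an idle port, which is why the same check across several busy ports is more convincing.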


We use PRTG to monitor our infrastructure. This switch is being polled every 30 seconds (as is every one of the 200+ other Netonix switches on our network). It seems other OIDs have also stopped updating since the same time yesterday (I simply didn't notice). Power and temperature related OIDs are not updating; so far only system uptime still seems to be responsive.

A list of the other OIDs we monitor, all of which have been reporting the same values since the interface counters stopped working:
1.3.6.1.4.1.46242.3.1.3.2
1.3.6.1.4.1.46242.3.1.3.1
1.3.6.1.4.1.46242.3.1.3.3
1.3.6.1.4.1.46242.3.1.3.4
1.3.6.1.4.1.46242.3.1.3.5
1.3.6.1.4.1.46242.3.1.3.6
1.3.6.1.4.1.46242.3.1.3.7
1.3.6.1.4.1.46242.2.1.2.1
1.3.6.1.4.1.46242.4.1.3.4
1.3.6.1.4.1.46242.4.1.3.5
1.3.6.1.4.1.46242.7.0
1.3.6.1.4.1.46242.8
1.3.6.1.4.1.46242.4.1.3.1
1.3.6.1.4.1.46242.4.1.3.2
1.3.6.1.4.1.46242.4.1.3.3
1.3.6.1.4.1.46242.6.0

Stephen
Employee
 
Posts: 1073
Joined: Sun Dec 24, 2017 8:56 pm
Has thanked: 99 times
Been thanked: 202 times

Re: v1.5.22 Bug Reports and Comments

Tue Dec 10, 2024 11:09 pm

Dawizman wrote: We use PRTG to monitor our infrastructure. This switch is being polled every 30 seconds (as is every one of the 200+ other Netonix switches on our network). It seems other OIDs have also stopped updating since the same time yesterday (I simply didn't notice). Power and temperature related OIDs are not updating; so far only system uptime still seems to be responsive.

A list of the other OIDs we monitor, all of which have been reporting the same values since the interface counters stopped working:
1.3.6.1.4.1.46242.3.1.3.2
1.3.6.1.4.1.46242.3.1.3.1
1.3.6.1.4.1.46242.3.1.3.3
1.3.6.1.4.1.46242.3.1.3.4
1.3.6.1.4.1.46242.3.1.3.5
1.3.6.1.4.1.46242.3.1.3.6
1.3.6.1.4.1.46242.3.1.3.7
1.3.6.1.4.1.46242.2.1.2.1
1.3.6.1.4.1.46242.4.1.3.4
1.3.6.1.4.1.46242.4.1.3.5
1.3.6.1.4.1.46242.7.0
1.3.6.1.4.1.46242.8
1.3.6.1.4.1.46242.4.1.3.1
1.3.6.1.4.1.46242.4.1.3.2
1.3.6.1.4.1.46242.4.1.3.3
1.3.6.1.4.1.46242.6.0


For the time being, I would try disabling and re-enabling SNMP on that switch. If it still doesn't start back up after that, you may need to reboot the switch to get it back.

However, we recommend avoiding frequent SNMP queries, as this has historically led to issues. We usually say that one query a minute should be considered the maximum rate.

On our end, I have a test scheduled to run for 5 days that is querying these OIDs. We will see if that can replicate what you're seeing.

In the meantime, if you could reset SNMP on the switch and slow down the queries a bit, just watch and let us know if it happens again.
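
To illustrate the one-query-a-minute guidance from the polling side, here is a small Python sketch of a per-host throttle that a custom poller could use. This is not PRTG code and not anything from the firmware; `PollThrottle` is a hypothetical class, and the 60-second default is simply the figure from the recommendation above.

```python
import time


class PollThrottle:
    """Allow at most one poll per host every `min_interval` seconds."""

    def __init__(self, min_interval=60.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self._last_poll = {}  # host -> time of the last allowed poll

    def allow(self, host):
        """Return True and record the poll if the host is due, else False."""
        now = self.clock()
        last = self._last_poll.get(host)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_poll[host] = now
        return True


if __name__ == "__main__":
    throttle = PollThrottle()
    print(throttle.allow("10.21.2.10"))  # first poll is allowed
    print(throttle.allow("10.21.2.10"))  # immediate repeat is refused
```

The clock is injectable so the behaviour can be checked without waiting a real minute; each host is tracked separately, so throttling one switch does not delay polls to the others.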

coreinput
Member
 
Posts: 17
Joined: Tue Dec 27, 2016 1:59 pm
Has thanked: 2 times
Been thanked: 14 times

Re: v1.5.22 Bug Reports and Comments

Wed Dec 11, 2024 9:56 am

Updated to 1.5.22. No PoE loss, but top did report 100% CPU on the vtss_appl process. I ended up disabling and re-enabling IGMP Snooping, which brought the CPU back down to the normal range. Thanks for all the hard work that went into this release!

o2theo
Member
 
Posts: 3
Joined: Wed Jul 22, 2020 5:12 am
Has thanked: 1 time
Been thanked: 0 time

Re: v1.5.22 Bug Reports and Comments

Wed Dec 11, 2024 10:29 am

Updated one of our WS-24-400-AC switches from v1.5.8 to v1.5.22 (in a test setting). Most of our devices (75 out of ~100) are still running v1.5.8.

After this update, logging in to the web GUI was no longer possible with the known credentials. Logging in via CLI was still possible with the same credentials. I set the same password again via CLI, and now logging in to the web GUI works again. Further tests are pending.

sakita
Experienced Member
 
Posts: 218
Joined: Mon Aug 17, 2015 2:44 pm
Location: Arizona, USA
Has thanked: 105 times
Been thanked: 86 times

Re: v1.5.22 Bug Reports and Comments

Wed Dec 11, 2024 11:12 am

sirhc wrote:
JeffreyS wrote:
KeesH wrote:I noticed two issues.
Some switches - not all - lost the ability to connect to our NTP server. The switch had no problems pinging the NTP server.
Found the issue on both WS-6 mini and WS-12-250-DC: "Time not set".


I had this on a WS-8-150-AC that I deployed last week running on 1.5.21 and initially after upgrading yesterday morning to 1.5.22 it did the same. Now, the time is in sync.

I had another switch take a bit for the time to sync with the NTP server after boot after upgrading to 1.5.22. I think it may be that I had a time delay set on the uplink device to power up. I have since removed this from the uplink devices and the time sync during boot is more reliable. In my experience, this used to not be the case. *shrug*


The old firmware would keep trying indefinitely, which was not a great idea.

We limited it to 10 times. After that it will try every 24 hours.

You can always force it to try to sync by disabling NTP, saving, then enabling NTP and saving again.
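
The retry behaviour described in the quote above (10 attempts, then one attempt every 24 hours) can be pictured with a short Python sketch. The firmware's actual implementation is not shown in this thread, and the spacing between the first 10 attempts is not stated, so `retry_delay` below is a placeholder and `ntp_attempt_delays` is a hypothetical name.

```python
import itertools

DAY_SECONDS = 24 * 60 * 60


def ntp_attempt_delays(max_quick_attempts=10, retry_delay=30):
    """Yield the wait in seconds before each sync attempt, indefinitely.

    retry_delay is a placeholder: the thread does not say how far apart
    the first attempts are.
    """
    for _ in range(max_quick_attempts):
        yield retry_delay
    while True:
        yield DAY_SECONDS  # after the quick attempts, one try per day


if __name__ == "__main__":
    # Show the first 12 waits: ten quick retries, then daily attempts.
    print(list(itertools.islice(ntp_attempt_delays(), 12)))
```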


WS-8-150-AC in lab does NTP sync on bootup. Excellent! However, Disable NTP / Save / Enable NTP / Save does not work (see log screenshot). The NTP server time was about 0730 on Dec/11/2024 when I tested this. Note that I even tried changing to a different IP address to see if that would make a difference (same router with multiple addresses). In both cases the switch logs that it did NTP sync but the time didn't change.

Attachment: netonix ws-8-150-ac ntp 1.5.22.png (caption: ntp not really resyncing)


The lab setup uses a MikroTik hEX lite as an NTP server. It doesn't have an RTC so it boots up with the wrong time which the Netonix dutifully syncs to (as it should). Once I correct the time in the MikroTik I either reboot the switch or issue a manual sync from the switch's command prompt.
Today is an average day: Worse than yesterday, but better than tomorrow.

Stephen
Employee
 
Posts: 1073
Joined: Sun Dec 24, 2017 8:56 pm
Has thanked: 99 times
Been thanked: 202 times

Re: v1.5.22 Bug Reports and Comments

Wed Dec 11, 2024 3:04 pm

sakita wrote:...
WS-8-150-AC in lab does NTP sync on bootup. Excellent! However, Disable NTP / Save / Enable NTP / Save does not work (see log screenshot). The NTP server time was about 0730 on Dec/11/2024 when I tested this. Note that I even tried changing to a different IP address to see if that would make a difference (same router with multiple addresses). In both cases the switch logs that it did NTP sync but the time didn't change.

netonix ws-8-150-ac ntp 1.5.22.png


The lab setup uses a MikroTik hEX lite as an NTP server. It doesn't have an RTC so it boots up with the wrong time which the Netonix dutifully syncs to (as it should). Once I correct the time in the MikroTik I either reboot the switch or issue a manual sync from the switch's command prompt.



Hi sakita, could you try running the following command on the offending switch?

Code: Select all
cmd
/etc/init.d/ntp restart


Let me know if a switch that is not syncing is able to do so after that.

sirhc
Employee
 
Posts: 7601
Joined: Tue Apr 08, 2014 3:48 pm
Location: Lancaster, PA
Has thanked: 1673 times
Been thanked: 1357 times

Re: v1.5.22 Bug Reports and Comments

Wed Dec 11, 2024 4:55 pm

What confuses me about the NTP issue is that all my switches and most people's switches are getting time.

Something has to be different: the NTP server, the switch configuration, or something else?
Support is handled on the Forums not in Emails and PMs.
Before you ask a question use the Search function to see it has been answered before.
To do an Advanced Search click the magnifying glass in the Search Box.
To upload pictures click the Upload attachment link below the BLUE SUBMIT BUTTON.

sakita
Experienced Member
 
Posts: 218
Joined: Mon Aug 17, 2015 2:44 pm
Location: Arizona, USA
Has thanked: 105 times
Been thanked: 86 times

Re: v1.5.22 Bug Reports and Comments

Wed Dec 11, 2024 5:34 pm

I modified the time on my lab NTP server and then tried it...

/etc/init.d/ntp restart
stopped process in pidfile '/var/run/ntp' (pid 826)

...time was immediately set to match the NTP server. Cool.

After that I changed the NTP server time and tried Disable NTP / Save / Enable NTP / Save, and it did not change the Netonix time.

Again, it does set the time correctly on startup or reboot as long as the NTP server is up and accessible over the network (and it seems to try it enough times to work fine in general). However, when it needs to be done manually, resyncing is currently a command prompt activity (or wait 24 hours).

I checked 5 field switches that have been running 1.5.21 since last week and they all had accurate time and their logs showed "admin: sync time via ntp" log events once each day so that is working as intended as well.

oeyre
Member
 
Posts: 39
Joined: Mon Feb 05, 2024 1:38 am
Location: Australia
Has thanked: 0 time
Been thanked: 13 times

Re: v1.5.22 Bug Reports and Comments

Wed Dec 11, 2024 5:45 pm

155 units upgraded to 1.5.22. I'm not aware of any major issues with the upgrade process itself besides one radio not coping with its power being dropped. Happy to report the unit that was showing 100% CPU has recovered. Too early to say for sure whether the memory leak/rebooting has stopped.

Will keep an eye out for any customer complaints; most are PPPoE with some manually configured IP, no DHCP.

Edit: Just realised that we were having one instance (that we know of so far) of the stuck SNMP data problem that others were reporting. Disable/enable SNMP via web seems to have fixed that.
Last edited by oeyre on Wed Dec 11, 2024 6:25 pm, edited 2 times in total.

RTGLW
Member
 
Posts: 26
Joined: Thu Jun 08, 2023 7:25 pm
Location: New Zealand
Has thanked: 25 times
Been thanked: 14 times

Re: v1.5.22 Bug Reports and Comments

Wed Dec 11, 2024 5:50 pm

Stephen wrote: For the time being, I would try disabling and re-enabling SNMP on that switch. If it still doesn't start back up after that, you may need to reboot the switch to get it back.

However, we recommend avoiding frequent SNMP queries, as this has historically led to issues. We usually say that one query a minute should be considered the maximum rate.

On our end, I have a test scheduled to run for 5 days that is querying these OIDs. We will see if that can replicate what you're seeing.

In the meantime, if you could reset SNMP on the switch and slow down the queries a bit, just watch and let us know if it happens again.


We're still in the process of bench testing 1.5.22 on our switches and I'm planning on collating and posting our observations from this later in the week, but wanted to add that I experienced the same exact SNMP issue described by Dawizman on two hosts post upgrade. Disabling and re-enabling the SNMP server on the switch fixed the issue, but a reboot will also resolve it.

Will also report that I've not seen an issue with NTP yet and our FS SFP modules (SFP-GB-GE-T) no longer need reseating when upgrading to 1.5.22.

PSA: A segmentation fault presented itself when attempting to SSH to a default config switch I had downgraded to 1.5.15rc3 from 1.5.22 (not entirely surprised). It was still accessible on GUI. I resolved it with the "hold reset while powering on" factory reset method. So, while not specifically a 1.5.22 bug, definitely a word of caution to those jumping backwards in FW to solve problems they've encountered on the newer versions.
