Netonix Forums

Posted: **Fri Oct 04, 2019 2:06 pm**

mayheart,

Is this result on both the IDC and the AC?

I'm having a high degree of difficulty replicating the problem now with that release.

You mentioned before that you might be able to give me access to the switch. Is it possible I could get access to a computer that has PuTTY and an RS-232 cable plugged into the switch with this firmware loaded?

Or if not, if you're willing, I can tell you what commands I need to run to see how vtss_appl is crashing to get a better idea of what is different between our systems.

Posted: **Fri Oct 04, 2019 8:49 pm**

The switch was taken down to do the voltage level adjustment, and since then the vtss_appl messages have not returned. The switch has been running for about 9 hours without any packet loss. Is it possible that this was causing the problems? Or possibly having the switch powered down for a few minutes? It was only quickly power cycled in the past. It's been running 1.5.5rc3-201910040216.

I'll toss the latest build you have for me onto the IDC switch. I've not tried that one since the original beta you sent me. I'll report back.

I did notice the memory usage is pretty high sitting at 116 megs out of 128, not sure if that's you running additional debugging. I'll keep it running in case you need some information off it.

Code: Select all:
811 root 20 0 89524 81m 988 S 2.3 66.2 16:04.09 status_thread
743 root 20 0 89568 81m 988 S 0.0 66.2 0:00.75 erps
744 root 20 0 89568 81m 988 R 0.0 66.2 0:00.40 mstp_thread
758 root 20 0 89580 81m 988 S 0.0 66.2 0:00.00 vtss_appl
759 root 20 0 89580 81m 988 S 0.0 66.2 0:55.22 vtss_appl
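To tell a leak apart from a one-off high reading like the memory figures above, the resident size of a single process can be sampled over time. The following is only a minimal sketch: the helper names `rss_kb` and `watch_rss` are made up for illustration, and it assumes the switch's shell provides `awk` and a readable `/proc/<pid>/status`, which the thread does not confirm.

```shell
#!/bin/sh
# Sketch: log the resident memory of one process over time to spot a leak.
# Assumptions (not from the thread): /proc/<pid>/status is readable on the
# switch, awk is available, and the helper names are illustrative only.

# Print the VmRSS value (in kB) from a /proc/<pid>/status style file.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# Print one timestamped sample per interval until the process exits.
watch_rss() {
    pid=$1
    interval=${2:-60}   # seconds between samples, default 60
    while [ -r "/proc/$pid/status" ]; do
        printf '%s %s kB\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(rss_kb "/proc/$pid/status")"
        sleep "$interval"
    done
}
```

For example, `watch_rss 759 300` would sample PID 759 (one of the vtss_appl entries in the listing) every five minutes. A value that keeps climbing across samples points to a leak rather than a stable working set.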

syslog:

Code: Select all:
Dec 31 19:00:06 netonix: 1.5.5rc3-201910040216 on WS-12-250-AC
Dec 31 19:00:10 system: Setting MAC address from flash configuration: EC:13:B2:64:09:0E
Dec 31 19:00:12 system: starting ntpclient
Dec 31 19:00:13 root: adding lan (eth0.997) to firewall zone lan
Dec 31 19:00:26 dropbear[764]: Running in background
Dec 31 19:00:28 switch[790]: temp sensor version 3
Dec 31 19:00:28 switch[791]: Detected cold boot
Dec 31 19:00:33 system: starting ntpclient
Oct 4 10:30:02 system: time set by NTP server

Posted: **Fri Oct 04, 2019 9:05 pm**

Update on the IDC switch:

I get vtss_appl restarting over and over with the first beta build; the latest one is fine. The switch boots up without any issues.

Code: Select all:
Jan 1 00:00:09 netonix: 1.5.5rc3-201910040216 on WS-26-400-IDC
Jan 1 00:00:15 system: Setting MAC address from flash configuration: EC:13:B2:11:6F:9E
Jan 1 00:00:18 root: adding lan (eth0) to firewall zone lan
Dec 31 19:00:34 root: removing lan (eth0.997) from firewall zone lan
Dec 31 19:00:39 root: adding lan (eth0.997) to firewall zone lan
Dec 31 19:00:48 root: adding lan (eth0.997) to firewall zone lan
Dec 31 19:00:50 system: starting ntpclient
Oct 4 21:03:25 dropbear[1386]: Running in background
Oct 4 21:03:28 switch[1416]: temp sensor version 3
Oct 4 21:03:28 switch[1417]: Detected warm boot
Oct 4 21:03:29 Port: link state changed to 'up' (1G) on port 25

Posted: **Fri Oct 04, 2019 9:12 pm**

mayheart wrote: The switch was taken down to do the voltage level adjustment, since then the vtss_appl messages have not returned. Switch has been running for about 6 hours without any packet loss. Is it possible that this was causing the problems?

Yes, actually. I've noticed in some of my testing that an unstable power source on one of the lines (most commonly the 3.3V) can cause kernel errors to be thrown, which can destabilize vtss_appl. I've noticed that powering down the unit does help stabilize it as well, though the amount of time it's down is irrelevant from the software perspective. (Unless maybe there is something wonky with the power caps, in which case allowing them to fully discharge might help with certain flagging or something. It might be best to check whether your unit is afflicted with the issue sirhc brought up earlier, to eliminate this as a possibility.)

mayheart wrote: I did notice the memory usage is pretty high sitting at 116 megs out of 128, not sure if that's you running additional debugging.

Actually, I noticed that earlier on one of the test units in my switch farm and I already corrected the issue (after 4 hours nothing has gone past 52MB). I will send you a link to the latest build I have. It's starting to sound like our systems are nearly on the same page.

EDIT (just saw your new post on the IDC switch):

That's great! I thought I was totally off base for a bit there.
Let me know if you want me to send you the next release with the mem leak fix.

Posted: **Fri Oct 04, 2019 9:23 pm**

Sure, send me the latest build, I'll toss it onto both AC and IDC switches and let it sit for the weekend.

Thanks for all your hard work getting this resolved.

Posted: **Fri Oct 04, 2019 9:27 pm**

You're welcome!

I'll send you the latest shortly.

Posted: **Sat Oct 05, 2019 3:43 am**

Ludvik wrote: It is not a 1.5.5 bug; I've noticed that before. After an upgrade, SNMP does not return the right version number.

#snmpget -v 2c -c public curve1.vinarice .1.3.6.1.4.1.46242.1.0
SNMPv2-SMI::enterprises.46242.1.0 = STRING: "Unknown"

I don't know which operation solves it: a reboot, saving the configuration, running snmpwalk (OID .1.3.6.1.4.1.46242), or only time ...

I upgraded 5 switches to 1.5.5rc2; one is OK, four are "Unknown".

Hey Ludvik, I had a chance to test this behavior. I was able to replicate it and I found the cause. What's happening is that net-snmpd has a cache table that it builds for a few of the OIDs that we get from the switch, and it fails to load unless certain values are polled first, which seems to trigger the cache to load for our custom OIDs. For example, running:

Code: Select all: snmpget -v 2c -c public <switch_ip> .1.3.6.1.2.1.105.1.3.1.1.4.8

For me, that triggered the cache to load 100% of the time for these specific cached OIDs in our MIB.

I'm looking into a way to trigger the cache loading mechanism when the net-snmpd daemon is launched. It should be possible, but in the meantime that should be a usable workaround.

Btw if that doesn't work, just running snmpwalk on the community worked for me too:

Code: Select all: snmpwalk -v 2c -c public <switch_ip>

Posted: **Sat Oct 05, 2019 5:59 am**

Probably yes, snmpwalk helps. In my management system I run "snmpwalk SNMPv2-SMI::enterprises.46242" if the version number is "unknown", and that seems to work too. It's been a few days since I tested it.
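The workaround described in the posts above (poll another OID when the version comes back as "Unknown", then ask again) could be scripted roughly as follows. This is a sketch under assumptions rather than a tested fix: the function names, the retry count, and the one-second pause are invented for illustration, while the two OIDs are the ones quoted in the thread.

```shell
#!/bin/sh
# Sketch: if the firmware-version OID reports "Unknown", poll a second OID
# to prompt the SNMP cache to load, then query the version again.
# Assumptions (not from the thread): function names, retry count, and the
# one-second pause between attempts are illustrative only.

VERSION_OID=".1.3.6.1.4.1.46242.1.0"        # version OID from Ludvik's post
WARMUP_OID=".1.3.6.1.2.1.105.1.3.1.1.4.8"   # OID reported to trigger the cache

# Print the version string for host $1 using community $2.
get_version() {
    # -Oqv prints the value only; tr strips the quotes around STRING values.
    snmpget -v 2c -c "$2" -Oqv "$1" "$VERSION_OID" | tr -d '"'
}

# Query the version; while it reads "Unknown", poll the warm-up OID and retry.
get_version_with_warmup() {
    host=$1
    community=${2:-public}
    tries=${3:-3}
    version=$(get_version "$host" "$community")
    while [ "$version" = "Unknown" ] && [ "$tries" -gt 0 ]; do
        snmpget -v 2c -c "$community" "$host" "$WARMUP_OID" >/dev/null 2>&1
        sleep 1
        version=$(get_version "$host" "$community")
        tries=$((tries - 1))
    done
    printf '%s\n' "$version"
}
```

A management system might call `get_version_with_warmup <switch_ip> public` in place of a bare `snmpget`. If the single warm-up OID does not help on a given unit, swapping that line for an `snmpwalk` of `.1.3.6.1.4.1.46242` would match the approach Ludvik describes.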

Posted: **Tue Oct 08, 2019 5:26 pm**

mayheart,

Haven't heard from you today, but I did find in my own testing that there was one more memory leak. I fixed that one as well and will send you another build to upload. If you aren't having any issues then I don't see a problem, but it's an option just in case. Either way, it seems like this issue has been resolved, so I am working on some other things that have come up now.

Posted: **Tue Oct 08, 2019 5:52 pm**

I've installed the latest image you sent me, looks good so far.

All my problems seem resolved.

v1.5.5rcX Bug Reports and Comments
