Dropping ports on new WS, what is wrong with my setup?

sirhc
Employee
Posts: 7347
Joined: Tue Apr 08, 2014 3:48 pm
Location: Lancaster, PA
Has thanked: 1597 times
Been thanked: 1318 times

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 10:39 am

Adair, I am taking it seriously, but how would you suggest we figure it out?

If the switch is running fine now but locks up at some undetermined time, then looking at it right now will only show that all is good. There is no way to determine anything from a properly functioning switch.

I have suggested the following:
First, I would upgrade to v1.4.0rc12 and see if it happens again, as there were changes that could affect large flat segments, plus fixes to UBNT Discovery and MAC tables.

If that does not help, swap out the switch with a different one and see if the problem persists. If the replacement unit does not lock up, that would indicate the original is a bad unit.

If it still locks up after swapping, then we need to look at your config, look at grounding, look at power, and try to narrow it down.

I do not have any magic way to figure it out.

I have also explained that the only time I have ever run across a switch that locks up TIGHT, it was due to a power and grounding issue, which I described in a post from last year that I linked above.

I am sure that a normal user would have blamed the switch for my problem, but in the end it was grounding and a wire running between buildings/electric services.

I simply said that I do not think bandwidth has anything to do with it; it was probably not a factor.

You say you have 40 switches in service. Are they ALL doing this? If not, then it has to be something environmental at these 2 locations, or something different in the network configuration / type of traffic (but I doubt this), OR a bad unit, which can be tested by swapping it out.

When this happens, have you attempted to use your console cable to verify it is a hard lock? If it is a hard lock, the only thing that can cause that is an electrical issue or a bad unit. If you swap it out and it persists, then we know it is not a bad unit.

Sometimes being a network technician is like being a detective.

But let's look at what we do know:
We know the switches can pass many GB of traffic, so it is not a capacity issue.

There are 12K switches out there and a "couple" of people having a "similar" problem. If this was a firmware issue or hardware design flaw, everyone would have an issue, not a couple of people.

You yourself have 40+ switches, yet only have a problem at 2 locations.

We would love to help and/or fix this, but we need to narrow it down.
Support is handled on the Forums not in Emails and PMs.
Before you ask a question use the Search function to see if it has been answered before.
To do an Advanced Search click the magnifying glass in the Search Box.
To upload pictures click the Upload attachment link below the BLUE SUBMIT BUTTON.

sirhc
Employee

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 10:54 am

Another thing I have seen in the past is a corrupt config file that causes a switch to lock up. In fact I just ran into this problem this past weekend.

I have (2) WS-24-400A units in my NOC which I upgrade all the time to test RC code. Apparently during one of the upgrades the 2nd switch got a corrupted config.

Took me all morning Sunday to figure it out. Whatever it was doing also brought the network down in my office/NOC, which I assume had something to do with the fact that the 2 switches have an LACP LAG between them and a STATIC LAG to the router, and a loop was forming???

If I rebooted the one switch with the corrupted config, the network would come up but would eventually lock up. After many hours of investigating I noticed that the switch was not fully configured (the LACP LAG would not go active) and RSTP was preventing the loop. Now, what would cause it to lock up hours later, I have no idea. But ultimately I noticed that if I attempted any change at all to the switch with the corrupted config, even some minor thing like a Port Description change, the network would crash and I could no longer get into the switch UI/CLI.

I did a factory default on the switch in question, where I hold in the default button for 20 seconds while powering it up. You then have to let it sit for several minutes as it re-formats part of the flash chip. Then I manually set the switch back up and the problem went away.

Now, what caused the config to get corrupted, I have no idea. Shit happens sometimes with any piece of electronic equipment, so unless I see a pattern or many reports of this I will chalk it up to bad luck. So that is another possible thing to try: factory default and manually set back up. This is not a tech procedure unique to Netonix equipment, as I have had to do that in the past to all sorts of computers and network equipment. DO NOT EXPORT AND IMPORT, AS THE CONFIG MAY BE CORRUPTED.

sirhc
Employee

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 12:59 pm

Adair, please post a screencap of your Main Status tab.

TheHox
Experienced Member
Posts: 107
Joined: Sat Sep 13, 2014 10:59 am
Location: WI
Has thanked: 11 times
Been thanked: 18 times

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 2:11 pm

My issues started when we went to an MDU where we were upgrading the WiFi and using Netonix switches to power the APs.
We plugged 2 Netonix switches into a Netgear ProSafe that was onsite, and used the Netonix switches to power 18 UniFi APs. Everything was working fine. Then we left that afternoon, and about an hour later got a call that the internet was not working. I was not able to get to the switches, and the UniFi controller showed all my APs offline.

I sent a tech to power cycle the Netonix's and all was good again.

Another 1.5 hours later it happened again. I had a tech reboot again, and at that time had him plug a patch cable directly from one of the switches to the router, into a port I had disabled remotely (to attempt to see what was going on later).

Like clockwork, another hour later, it happened a 3rd time. I was able to log in to the router and enable the port we had just plugged the patch cable into, and I was able to get to the switch and see the uplink port flapping on/off, which is the log file I posted in the original post. I went onsite and I moved the 2 netonix's directly to the router and bypassed the ProSafe and those issues seemed to go away.


After that, we then noticed other issues.
A 2nd issue I had was that the Netonix switches also power some 5 port switches in each of the 18 units in the MDU, and some of them were pushing a constant stream of data, like 16kpps solid. Ports 2, 6 and 8 in the attached image show the flood. A power cycle fixed it. We have since VLAN'd each of the units off.
We have had the NetGear ProSafe running in this MDU for over a year just fine, but the Netonix had some issues at first getting going. There are no loops, as we didn't change any of the wiring, just swapped out switches.

Running 1.3.9, we have about 20 switches across our wisp that usually are fine, but something really weird going on here.
Attachments
switchflood.png

sirhc
Employee

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 2:41 pm

TheHox wrote:I moved the 2 netonix's directly to the router and bypassed the ProSafe and those issues seemed to go away.

Not sure what to say about this? Is the problem the Netonix or the ProSafe, or do they just not like playing together? If you're really curious and you can cause it to happen like clockwork every hour, being on site with a console cable (if needed) and examining the switch logs would be a good start. I would also try this configuration with v1.4.0rc12 for the reasons stated.

TheHox wrote:After that, we then noticed other issues.
A 2nd issue I had was that the Netonix switches also power some 5 port switches in each of the 18 units in the MDU, and some of them were pushing a constant stream of data, like 16kpps solid. Ports 2, 6 and 8 in the attached image show the flood. A power cycle fixed it. We have since VLAN'd each of the units off.
We have had the NetGear ProSafe running in this MDU for over a year just fine, but the Netonix had some issues at first getting going. There are no loops, as we didn't change any of the wiring, just swapped out switches.

When you say you are powering 5 port switches with the WS-24-400A, what switches would they be?
As far as the constant stream of data, this could be the issue with UBNT Discovery that was fixed in v1.4.0rcX, but being on site with Wireshark, using a port mirror to your laptop, you could easily determine what this data stream is.

TheHox wrote:Running 1.3.9, we have about 20 switches across our wisp that usually are fine, but something really weird going on here.

I would suggest trying v1.4.0rc12 as there were some fixes with large flat networks which this is sort of.

Now an offhanded suggestion as you never know about pesky wannabe hackers:
Are these switches UI/CLI accessible from the apartments? If so, you might want to consider using the Access control list in the switch to block access to the switch UI/CLI.

But as I said, I would try v1.4.0rcX.
If that does not help, I would do as I mentioned above: go on site, recreate the issue, and investigate what's going on, especially if you can recreate it in about an hour.

Have diagnostic equipment on hand when on site:
Laptop with Wireshark
Console cable for switches
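
For the capture step, here is a minimal Python sketch of how a saved capture from the mirrored port could be summarized by source MAC and EtherType, to see who is sending the flood. It assumes the capture was saved in the classic libpcap format with Ethernet framing (pcapng is not handled); the function name summarize_pcap and the file name flood.pcap are made up for illustration and are not part of any Netonix or Wireshark tooling.

```python
import struct
from collections import Counter


def summarize_pcap(data: bytes, top: int = 5):
    """Count frames per (source MAC, EtherType) in a classic libpcap capture."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written big-endian
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled here)")

    # The link type is the last 4 bytes of the 24-byte global header.
    linktype = struct.unpack(endian + "I", data[20:24])[0]
    if linktype != 1:
        raise ValueError("expected Ethernet link type (1)")

    counts = Counter()
    offset = 24
    while offset + 16 <= len(data):
        # Per-record header: ts_sec, ts_usec, captured length, original length.
        _, _, incl_len, _ = struct.unpack(endian + "IIII", data[offset:offset + 16])
        offset += 16
        frame = data[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < 14:
            continue  # too short to hold an Ethernet header
        src_mac = frame[6:12].hex(":")
        ethertype = struct.unpack("!H", frame[12:14])[0]
        counts[(src_mac, hex(ethertype))] += 1
    return counts.most_common(top)
```

Usage would be something like summarize_pcap(open("flood.pcap", "rb").read()). If one source MAC and one EtherType dominate the list, that narrows down which device and protocol to look at.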

WisTech
Associate
Posts: 209
Joined: Mon Aug 04, 2014 3:57 pm
Has thanked: 5 times
Been thanked: 63 times

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 2:42 pm

Your setup is super simple too; that's what doesn't make any sense. I was running RSTP on two ports on a local end 6-mini powering a pair of 5Xs on an NxN, as well as a remote 6-mini powering a pair of 5Xs on an NxN. What's even more strange is that suddenly I could not access the remote switch, or the remote IPs linked up, even if I powered down one of the ports here locally. I powered it back on and planned to power cycle the remote 6-mini, which would clear whatever glitch we had. Then, at 11:45PM, it created a loop with another local port on the 6-mini and took down an AP, as well as another 5X link powered from a 24 port Netonix below, which is also what powers the local 6-mini on the roof.

24 port powers local 6-mini on the roof that powers a 500AC AP and twin 5Xs with RSTP enabled on both ends (LAG disabled)
24 port also powers multiple 5X radios that are not connecting various segments. It's a hard thing to reproduce, but why in the hell would it happen suddenly at 11:45 when no one was around, and with no traffic on the remote end of the NxN setup?

TheHox
Experienced Member

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 2:51 pm

We yanked the ProSafe out once we had the migration completed. I just had another UniFi AP go into isolated state. Bouncing the port had no effect. I rebooted the switch, and it came back up. Still on 1.3.9; if it happens again I'll upgrade.
Attachments
switch2.png
switch1.png
switches.png

adairw
Associate
Posts: 465
Joined: Wed Nov 05, 2014 11:47 pm
Location: Amarillo, TX
Has thanked: 98 times
Been thanked: 132 times

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 2:52 pm

I've seen that 16K pps and 8Mb stream of traffic start just before the switch locks up. If I can disable the port fast enough it won't lock the switch up. I'll post screenshots later.

WisTech
Associate

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 3:02 pm

adairw wrote:I've seen that 16K pps and 8Mb stream of traffic start just before the switch locks up. If I can disable the port fast enough it won't lock the switch up. I'll post screenshots later.


Same here. I actually was able to get into my switch up on the roof and saw the one port that was dead as a doornail pegged at 16kpps and ~8Mbps.
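
A quick sanity check on those two numbers, as a rough sketch: dividing the bit rate by the packet rate gives the average frame size. The inputs are the approximate figures quoted above (16 kpps and ~8 Mbps), and the helper name is made up for illustration.

```python
def average_frame_bytes(bits_per_second: float, packets_per_second: float) -> float:
    """Average bytes per packet implied by a bit rate and a packet rate."""
    return bits_per_second / packets_per_second / 8


# Approximate figures reported in this thread: ~8 Mbps at 16 kpps.
print(average_frame_bytes(8_000_000, 16_000))  # 62.5
```

That is roughly 62.5 bytes per packet, close to the 64-byte minimum Ethernet frame size, so the stream looks like a flood of very small frames rather than bulk user data. How the switch counts bytes (with or without the FCS) is not stated in this thread, so treat this as an estimate only.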

sirhc
Employee

Re: Dropping ports on new WS, what is wrong with my setup?

Tue Apr 26, 2016 3:29 pm

I am really curious to see if v1.4.0rc12 fixes this.

Please let me know.

If v1.4.0rc12 does not fix it try disabling every service not needed under the Device/Configuration Tab such as:
IGMP Snooping
Discovery
UBNT Discovery
LLDP
Cisco Discovery

Disable everything; then, if that fixes it, enable one at a time until you find the culprit.

But I am hoping v1.4.0rc12 fixes you up.
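
The disable-everything-then-enable-one-at-a-time procedure above can be sketched in Python as follows. This is only an illustration of the logic, not a script that talks to the switch: problem_occurs is a hypothetical callback standing in for a person enabling the given services in the UI and watching long enough to see whether the lockup or flood returns. The service names are copied from the list in the post.

```python
from typing import Callable, List, Optional, Sequence

# Service names as listed in the post above.
SERVICES = ["IGMP Snooping", "Discovery", "UBNT Discovery", "LLDP", "Cisco Discovery"]


def find_culprit(services: Sequence[str],
                 problem_occurs: Callable[[List[str]], bool]) -> Optional[str]:
    """Enable services one at a time and return the first one that brings the problem back.

    problem_occurs(enabled) is a hypothetical check: it should return True if the
    issue shows up while exactly the services in `enabled` are turned on.
    """
    enabled: List[str] = []
    if problem_occurs(list(enabled)):
        # Still failing with every service off, so the services are not the cause.
        raise RuntimeError("problem persists with all services disabled")
    for service in services:
        enabled.append(service)
        if problem_occurs(list(enabled)):
            return service
    return None  # every service re-enabled without the problem returning
```

For example, find_culprit(SERVICES, lambda enabled: "UBNT Discovery" in enabled) returns "UBNT Discovery". In practice each check could take hours, since the lockups reported here show up on that timescale.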
