Warm boot breaks switch CPU ping on tagged VLANs

yahel

Sat Apr 11, 2026 4:29 pm

Subject: Warm boot breaks switch CPU ping on tagged VLANs (WS-26-500-DC, data plane fine)
We hit a nasty issue after rebooting a WS-26-500-DC that had been up for over a year (memory leak forced the reboot). Posting in case others run into this.
The problem:
After a warm reboot (CLI reboot, not a power cycle), the switch comes back up and "Detected warm boot" appears in syslog. All port-to-port traffic works perfectly: subscribers are fine, VLANs are forwarding, LACP comes back up. But the switch itself cannot ping anything on tagged VLANs. The management VLAN (native/untagged) works fine, and SSH and HTTPS are accessible.
So the data plane is healthy but the switch's own IP stack is disconnected from the ASIC on tagged VLANs.
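For anyone monitoring from outside the switch, the symptom pattern (every tagged-VLAN target failing from the switch CPU while the untagged management side still answers) can be written down as a simple check. This is only a rough sketch in Python: the function name, the shape of the input, and the sample targets are all made up for illustration, and nothing here is a firmware API.

```python
def looks_like_cpu_path_fault(results):
    """Heuristic for the symptom described above.

    results: dict mapping a target name to (is_tagged_vlan, ping_ok),
    as seen from the switch CPU. Hypothetical input format.
    Returns True when every tagged-VLAN target fails while every
    untagged target still answers.
    """
    tagged = [ok for is_tagged, ok in results.values() if is_tagged]
    untagged = [ok for is_tagged, ok in results.values() if not is_tagged]
    return bool(tagged) and not any(tagged) and bool(untagged) and all(untagged)


# Made-up sample: management gateway answers, both tagged targets fail.
sample = {
    "mgmt-gateway": (False, True),
    "radio-a": (True, False),
    "radio-b": (True, False),
}
print(looks_like_cpu_path_fault(sample))  # True
```

If only one tagged target fails, the function returns False, which is the case where a real device outage is more likely than this bug.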
The impact:
Every PingWatchdog targeting a device on a tagged VLAN starts failing, even though the devices are physically up and passing traffic. With "Bounce Power" watchdogs, this means the switch starts power-cycling radios that are perfectly fine. In our case, 11 watchdogs fired every ~16 minutes for 4+ hours before we caught it — 176 unnecessary PoE bounces.
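A quick back-of-the-envelope check of those numbers, in Python. The 16-minute interval is approximate, so treat the duration as a rough figure:

```python
WATCHDOGS = 11        # watchdogs that fired
INTERVAL_MIN = 16     # approximate minutes between firings
TOTAL_BOUNCES = 176   # PoE bounces counted

# Each cycle bounces every affected watchdog once.
cycles = TOTAL_BOUNCES // WATCHDOGS
duration_hours = cycles * INTERVAL_MIN / 60

print(cycles)                    # 16
print(round(duration_hours, 2))  # 4.27
```

So 176 bounces works out to 16 rounds of 11 watchdogs, or a bit over 4 hours, which matches the timeline.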
What we found in the logs:
After the warm boot, the VLANs are added to the firewall zone sequentially (the usual "adding vlanXXXX (eth0.XXXX) to firewall zone lan" messages). But the CPU-to-ASIC path for those interfaces doesn't actually work. The Linux VLAN sub-interfaces exist (you can see them with ip addr show) but can't send or receive through the switch chip.
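If you want to check a saved syslog for the same pattern, a rough Python sketch like this pulls out the VLAN interfaces that were added after the most recent warm boot marker. The two message formats come from the log lines quoted above; the function name and the sample lines are made up for illustration, and real syslog lines will carry timestamps and other prefixes, which is why the code searches within each line instead of matching the whole line.

```python
import re

WARM_BOOT_MARKER = "Detected warm boot"
# Matches e.g. "adding vlan100 (eth0.100) to firewall zone lan"
VLAN_ADDED = re.compile(r"adding vlan(\d+) \((eth0\.\d+)\) to firewall zone lan")


def vlans_added_after_warm_boot(lines):
    """Return [(vlan_id, ifname), ...] logged after the last warm boot marker.

    Returns an empty list if no warm boot marker is present.
    """
    found = []
    seen_marker = False
    for line in lines:
        if WARM_BOOT_MARKER in line:
            seen_marker = True
            found = []  # only keep entries after the most recent warm boot
            continue
        if seen_marker:
            match = VLAN_ADDED.search(line)
            if match:
                found.append((int(match.group(1)), match.group(2)))
    return found


# Made-up sample lines in the quoted format.
sample_log = [
    "switch: Detected warm boot",
    "switch: adding vlan100 (eth0.100) to firewall zone lan",
    "switch: adding vlan200 (eth0.200) to firewall zone lan",
]
print(vlans_added_after_warm_boot(sample_log))
# [(100, 'eth0.100'), (200, 'eth0.200')]
```

The resulting interface list is what you would then test from the switch CPU, since these are the interfaces that exist in Linux but may not be reachable through the switch chip.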
What fixed it:
A config apply through the web UI that triggered a full VLAN teardown-and-rebuild cycle. We changed a VLAN IP setting, applied, and the rebuild (remove all + re-add all VLAN interfaces) restored the CPU-to-ASIC mapping. A cold reboot (full power cycle) would also fix it, but we were able to avoid that.
What didn't fix it:
Just waiting — the issue persisted for 7+ hours with no self-healing until the config apply.
Setup:
WS-26-500-DC, 15 VLANs, LACP trunk (4 ports) to a MikroTik router, ~20 active ports. Switch had >1 year uptime before the reboot. Two UI reboot attempts failed before CLI reboot succeeded (probably related to the memory leak).
Suggestion:
The warm boot path seems to not properly re-program the ASIC's CPU port VLAN membership. Could the firmware force a cold ASIC reinit after a warm boot is detected? Or at minimum, re-apply the CPU port VLAN configuration after the "Detected warm boot" event?
