Palo Alto – stale sessions blocking VPN and NetFlow traffic

I have just had to troubleshoot an interesting issue with a Palo Alto firewall.

Problem description:

A change to the security rule set on the firewall unintentionally blocked incoming site-to-site VPN traffic. The problem went unnoticed for a few hours, but as soon as it was reported the change was immediately rolled back to the last known good configuration (if the word “immediately” can be applied to the PA commit process at all :)). The rollback had no effect and the VPN tunnel remained down. Adding to the confusion, only one of several tunnels went down and failed to recover.


Long story short, it turned out that there was a “stale” session in DISCARD state that, by the looks of it, was constantly being refreshed (and thus never timed out) as the firewall kept trying to re-establish the tunnel. It could not succeed because of the surrounding network architecture (the upstream firewall allowed only incoming VPN connections), but apparently that was enough to make the VPN engine ignore incoming requests from the peer to which it had an outgoing session in DISCARD state.
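
To illustrate why such a session might never age out, here is a toy model in Python. It is only a sketch of the reasoning, not how PAN-OS actually implements session ageing: the function name, the timeout and the retry interval are all made-up assumptions. The idea is simply that if every reconnection attempt hits the session and resets its idle timer, and the attempts come more often than the timeout, the session lives forever.

```python
def session_expiry_time(timeout_s, retry_interval_s, horizon_s):
    """Toy model (hypothetical, not PAN-OS internals).

    A session is created at t=0 and ages out after `timeout_s` seconds
    without a hit. A retry arrives every `retry_interval_s` seconds and,
    if the session is still alive, resets its idle timer.

    Returns the time at which the session ages out, or None if it is
    still alive at `horizon_s`.
    """
    last_hit = 0
    t = retry_interval_s
    while t <= horizon_s:
        if t >= last_hit + timeout_s:
            # The session aged out before this retry arrived.
            return last_hit + timeout_s
        last_hit = t  # the retry refreshes the session
        t += retry_interval_s
    expiry = last_hit + timeout_s
    return expiry if expiry <= horizon_s else None


# Retries every 10 s against a 60 s timeout: still alive after an hour.
print(session_expiry_time(60, 10, 3600))
# Retries every 90 s against a 60 s timeout: the session ages out normally.
print(session_expiry_time(60, 90, 3600))
```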

The issue was resolved by clearing the “faulty” session (clear session id <xxx>) and setting the tunnel to passive mode to avoid (hopefully!) a recurrence of the same glitch in the future.
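
If you ever need to hunt for such sessions across more than one peer, the lookup can be sketched in a few lines of Python. This is a minimal sketch under assumptions: it expects the session table to have already been parsed into dictionaries with hypothetical keys (`id`, `state`, `dst`), it does not talk to the firewall, and the addresses are documentation examples. The only real command in it is the `clear session id` mentioned above; the sketch just prints the commands for a human to review.

```python
def clear_commands_for_peer(sessions, peer_ip):
    """Build 'clear session id' commands for DISCARD sessions to a peer.

    `sessions` is a list of dicts with the hypothetical keys
    'id', 'state' and 'dst' (an already-parsed session table).
    """
    commands = []
    for s in sessions:
        if s["state"].upper() == "DISCARD" and s["dst"] == peer_ip:
            commands.append(f"clear session id {s['id']}")
    return commands


# Made-up sample data, for illustration only.
sample_sessions = [
    {"id": 1001, "state": "ACTIVE", "dst": "203.0.113.10"},
    {"id": 1002, "state": "DISCARD", "dst": "203.0.113.10"},
    {"id": 1003, "state": "DISCARD", "dst": "198.51.100.7"},
]

for cmd in clear_commands_for_peer(sample_sessions, "203.0.113.10"):
    print(cmd)
```

Only the session that is both in DISCARD state and destined for the given peer is selected, so healthy sessions to the same peer are left alone.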


In the context of this particular issue, it looked to me like the root cause was a bug within the VPN daemon. Though, to be fair, we had a similar glitch with NetFlow traffic a while ago: at some point the firewall simply stopped passing NetFlow traffic through, and the cure was the same session reset. That makes me think there might also be some ongoing issues with session handling.

