BGP Network Monitoring and Correlation
BGP (Border Gateway Protocol) monitoring involves correlating adjacency changes, tracking outage duration, and analyzing network stability. LogZilla's pre-processing extracts BGP-specific fields, while SEC handles the complex time-based correlation patterns that triggers alone cannot manage.
Prerequisites: Ensure Event Correlation is enabled and forwarder reloading is available as shown in the Event Correlation Overview.
BGP Adjacency Outage Tracking
Forwarder Configuration (Adjacency)
Required App: cisco__asa app (for SrcIP user tag)
# /etc/logzilla/forwarder.d/bgp-monitoring.yaml
type: sec
sec_name: bgp-correlation
rules:
  - match:
      - field: cisco_mnemonic
        op: "eq"
        value: BGP-5-ADJCHANGE
    rewrite:
      message: "BGP_EVENT host=$HOST neighbor=$USER_TAGS_SRCIP"
SEC Rule: BGP Outage Duration Calculation
Based on a real kwiktrip implementation for DMVPN tunnel monitoring:
# Track BGP neighbor down events and calculate outage duration
type=Pair
ptype=SubStr
pattern=BGP_EVENT
context=($ENV{EVENT_CISCO_MNEMONIC} eq "BGP-5-ADJCHANGE") && \
        ($ENV{EVENT_MESSAGE} =~ /Down/)
desc=BGP Neighbor Down on $ENV{EVENT_HOST}
action=eval %neighbor_ip $ENV{EVENT_USER_TAGS_SRCIP}; \
       eval %router_host $ENV{EVENT_HOST}; \
       eval %down_time (time()); \
       shellcmd (/usr/bin/host %neighbor_ip > /tmp/bgp-lookup-%neighbor_ip 2>/dev/null); \
       eval %hostname (readfile("/tmp/bgp-lookup-%neighbor_ip")); \
       if (%hostname =~ /domain name pointer ([\w\.\-]+)/) { \
         eval %store_name "$1"; \
       } else { \
         eval %store_name "%neighbor_ip"; \
       } \
       if (%store_name =~ /(\d+)lo/) { \
         eval %store_number "$1"; \
       } else { \
         eval %store_number "unknown"; \
       } \
       if (%store_name =~ /\d+lo(\d+)/) { \
         eval %tunnel_id "$1"; \
       } else { \
         eval %tunnel_id "unknown"; \
       }
ptype2=SubStr
pattern2=BGP_EVENT
context2=($ENV{EVENT_MESSAGE} =~ /BGP.*Up/) && \
         ($ENV{EVENT_USER_TAGS_SRCIP} eq "%neighbor_ip") && \
         ($ENV{EVENT_HOST} eq "%router_host")
desc2=BGP Neighbor %neighbor_ip Up on %router_host
action2=eval %up_time (time()); \
        eval %outage_duration (%up_time - %down_time); \
        shellcmd (logger -t SEC-BGP \
          "BGP_OUTAGE_RESOLVED hostname=\"%store_name\" store=\"%store_number\" tunnel=\"%tunnel_id\" downtime=\"%outage_duration\""); \
        shellcmd (/bin/rm -f /tmp/bgp-lookup-%neighbor_ip)
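The hostname parsing in the rule above can be checked outside SEC before deployment. The following is a minimal Bash sketch, assuming reverse-DNS names follow the `<store>lo<tunnel>` pattern that the rule's regular expressions imply. The function names `parse_store_name` and `outage_seconds` and the sample hostname are hypothetical and not part of LogZilla or SEC.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the two regex checks in the SEC rule for offline testing.
# Assumption: PTR names look like "<store>lo<tunnel>...", e.g. "1234lo241.example.net".

# Print "<store_number> <tunnel_id>"; either field falls back to "unknown".
parse_store_name() {
  local name="$1" store="unknown" tunnel="unknown"
  if [[ "$name" =~ ([0-9]+)lo ]]; then
    store="${BASH_REMATCH[1]}"
  fi
  if [[ "$name" =~ [0-9]+lo([0-9]+) ]]; then
    tunnel="${BASH_REMATCH[1]}"
  fi
  echo "$store $tunnel"
}

# Print the outage duration in seconds from two epoch timestamps (down, up).
outage_seconds() {
  local down="$1" up="$2"
  echo $(( up - down ))
}
```

Calling `parse_store_name` with a bare IP address (the rule's fallback when no PTR record exists) yields `unknown unknown`, which matches the rule's else branches.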
LogZilla Trigger: BGP Outage Response
name: "BGP Outage Business Impact"
filter:
  - field: program
    op: eq
    value: SEC-BGP
  - field: message
    op: "=~"
    value: "BGP_OUTAGE_RESOLVED"
actions:
  exec_script: true
  script_path: "/usr/local/bin/bgp-outage-analysis.sh"
  send_webhook: true
  send_webhook_template: |
    {
      "event_type": "bgp_outage_resolved",
      "store_name": "{{event:ut:hostname}}",
      "store_number": "{{event:ut:store}}",
      "tunnel_id": "{{event:ut:tunnel}}",
      "outage_duration": "{{event:ut:downtime}}",
      "severity": "warning"
    }
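The webhook template reads the `hostname`, `store`, `tunnel`, and `downtime` user tags, while the SEC rule writes those values as `key="value"` pairs in the logged message. How LogZilla maps the pairs to user tags is not shown here, so the Bash sketch below only illustrates pulling one value out of such a message for testing. The function name `get_field` and the sample message are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: extract the value of key="value" from a BGP_OUTAGE_RESOLVED message.
# Assumption: values are double-quoted and contain no embedded double quotes.

get_field() {
  local key="$1" msg="$2"
  # Match the key at the start of the message or after a space.
  local re="(^| )${key}=\"([^\"]*)\""
  if [[ "$msg" =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[2]}"
  else
    return 1   # key not present
  fi
}

# Example message in the format the SEC rule logs (values are made up).
sample='BGP_OUTAGE_RESOLVED hostname="1234lo241.example.net" store="1234" tunnel="241" downtime="3700"'
```

For example, `get_field downtime "$sample"` prints the downtime value, and a missing key returns a non-zero status so callers can fall back to a default.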
Business Intelligence Script
#!/bin/bash
# /usr/local/bin/bgp-outage-analysis.sh
STORE_NAME="$EVENT_USER_TAGS_HOSTNAME"
STORE_NUMBER="$EVENT_USER_TAGS_STORE"
TUNNEL_ID="$EVENT_USER_TAGS_TUNNEL"
# Default to 0 so the arithmetic below does not fail when the tag is missing
DOWNTIME="${EVENT_USER_TAGS_DOWNTIME:-0}"

# Convert seconds to human readable
HOURS=$((DOWNTIME / 3600))
MINUTES=$(((DOWNTIME % 3600) / 60))
DURATION_TEXT="${HOURS}h ${MINUTES}m"

# Query business impact based on store type and tunnel
STORE_TYPE=$(curl -s "https://api.company.com/stores/$STORE_NUMBER/type")
BUSINESS_IMPACT="low"

# Tunnel 241 gets special handling (critical business locations)
if [[ "$TUNNEL_ID" == "241" ]]; then
    BUSINESS_IMPACT="high"
    # Alert management for critical tunnel outages > 4 hours
    if [[ "$DOWNTIME" -gt 14400 ]]; then
        curl -s -X POST "https://slack.company.com/api/webhooks/management" \
            -d "text=CRITICAL: Store $STORE_NUMBER (Tunnel 241) was down for $DURATION_TEXT"
    fi
elif [[ "$STORE_TYPE" == "flagship" ]]; then
    BUSINESS_IMPACT="medium"
fi

# Log business metrics
logger -t BGP-BUSINESS "Store: $STORE_NUMBER, Duration: $DURATION_TEXT, Impact: $BUSINESS_IMPACT"

# Update network operations dashboard
curl -s -X POST "https://dashboard.company.com/api/bgp-events" \
    -d "store=$STORE_NUMBER&duration=$DOWNTIME&impact=$BUSINESS_IMPACT&tunnel=$TUNNEL_ID"
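The duration conversion and impact rules in the script above can be exercised without network calls by isolating them into functions. This is a minimal Bash sketch of that idea, not part of the shipped script; the function names `format_duration` and `classify_impact` are hypothetical, and the tunnel 241 and `flagship` rules are copied from the script.

```bash
#!/usr/bin/env bash
# Sketch only: pure functions for the arithmetic and classification steps.

# Convert a number of seconds to "<hours>h <minutes>m"; non-numeric input counts as 0.
format_duration() {
  local secs="${1:-0}"
  [[ "$secs" =~ ^[0-9]+$ ]] || secs=0
  secs=$(( 10#$secs ))   # force base 10 so values like "08" are not read as octal
  echo "$(( secs / 3600 ))h $(( (secs % 3600) / 60 ))m"
}

# Map tunnel ID and store type to an impact level, using the same rules as the script.
classify_impact() {
  local tunnel_id="$1" store_type="$2"
  if [[ "$tunnel_id" == "241" ]]; then
    echo "high"
  elif [[ "$store_type" == "flagship" ]]; then
    echo "medium"
  else
    echo "low"
  fi
}
```

Keeping these steps free of `curl` and `logger` calls makes it possible to verify the thresholds (for example, that 14400 seconds renders as `4h 0m`) before wiring the script to a trigger.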
BGP Flapping Detection
Multi-Event BGP State Correlation
Detect BGP neighbors that change state multiple times within a short period.
SEC Rule: BGP Flapping Detection
# Count BGP state changes per neighbor within time window
# Threshold: 6 state changes within 30 minutes (1800 seconds)
type=SingleWithThreshold
ptype=SubStr
pattern=BGP_EVENT
desc=BGP flapping detected for neighbor $ENV{EVENT_USER_TAGS_SRCIP}
action=eval %neighbor_ip $ENV{EVENT_USER_TAGS_SRCIP}; \
       eval %router_host $ENV{EVENT_HOST}; \
       eval %flap_count $thresh; \
       shellcmd (logger -t SEC-BGP -p local0.warning \
         "BGP_FLAPPING neighbor=\"%neighbor_ip\" router=\"%router_host\" flap_count=\"%flap_count\"")
thresh=6
window=1800
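The threshold-within-window idea behind the rule above can be illustrated offline. The Bash sketch below approximates it for one neighbor: given ascending epoch timestamps of state changes, it reports flapping when any `thresh` consecutive events fall within `window` seconds. It is a simplified model for reasoning about threshold values, not SEC's actual implementation, and the function name `is_flapping` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: sliding-window threshold check over sorted timestamps.
# Usage: is_flapping <thresh> <window_seconds> <ts1> <ts2> ...
# Returns 0 when flapping is detected, 1 otherwise.

is_flapping() {
  local thresh="$1" window="$2"
  shift 2
  local -a ts=("$@")      # epoch seconds, ascending
  local i first
  for (( i = thresh - 1; i < ${#ts[@]}; i++ )); do
    first=$(( i - thresh + 1 ))
    # The span from the oldest to the newest of these thresh events
    if (( ts[i] - ts[first] <= window )); then
      return 0
    fi
  done
  return 1
}
```

With `thresh=6` and `window=1800` as in the rule, six state changes 100 seconds apart count as flapping, while six changes 1000 seconds apart do not.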
BGP Route Advertisement Monitoring
Route Withdrawal/Advertisement Correlation
Monitor BGP route advertisements and detect routing instability.
Forwarder Configuration (Routes)
# /etc/logzilla/forwarder.d/bgp-routes.yaml
type: sec
sec_name: bgp-routes
rules:
  - match:
      - field: cisco_mnemonic
        op: "eq"
        value: ["BGP-5-ADJCHANGE", "BGP-4-MAXPFX", "BGP-3-NOTIFICATION"]
    rewrite:
      message: "BGP_ROUTE_EVENT host=$HOST neighbor=$USER_TAGS_SRCIP event_type=$CISCO_MNEMONIC"
SEC Rule: Route Instability Detection
# Detect excessive route withdrawals/advertisements
# Threshold: 1000 prefix changes within 5 minutes (300 seconds)
type=SingleWithThreshold
ptype=SubStr
pattern=BGP_ROUTE_EVENT
context=($ENV{EVENT_CISCO_MNEMONIC} eq "BGP-4-MAXPFX")
desc=BGP route instability detected
action=eval %neighbor_ip $ENV{EVENT_USER_TAGS_SRCIP}; \
       eval %router_host $ENV{EVENT_HOST}; \
       eval %prefix_count $thresh; \
       shellcmd (logger -t SEC-BGP -p local0.alert \
         "BGP_ROUTE_INSTABILITY neighbor=\"%neighbor_ip\" router=\"%router_host\" prefix_count=\"%prefix_count\"")
thresh=1000
window=300
Multi-Router BGP Correlation
Network-Wide BGP Event Correlation
Detect BGP events that affect multiple routers at the same time, which may indicate an upstream provider issue.
SEC Rule: Network-Wide BGP Correlation
# Detect BGP events affecting multiple routers
# Threshold: 10 BGP down events within 5 minutes (300 seconds)
type=SingleWithThreshold
ptype=SubStr
pattern=BGP_EVENT
context=($ENV{EVENT_CISCO_MNEMONIC} eq "BGP-5-ADJCHANGE") && \
        ($ENV{EVENT_MESSAGE} =~ /Down/)
desc=Network-wide BGP issue detected
action=eval %affected_routers (shellcmd_output("grep 'BGP_EVENT.*Down' /var/log/sec/bgp-events.log | tail -20 | awk '{print $2}' | sort -u | wc -l")); \
       if (%affected_routers >= 5) { \
         shellcmd (logger -t SEC-BGP -p local0.crit \
           "BGP_NETWORK_OUTAGE affected_routers=\"%affected_routers\""); \
       }
thresh=10
window=300
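The rule above treats an outage as network-wide when enough distinct routers report a neighbor going down. The counting step can be sketched in Bash and awk as follows. The three-column input format (`<epoch> <router> <state>`) and the function name `count_affected_routers` are assumptions made for illustration; they do not describe the actual layout of `/var/log/sec/bgp-events.log`.

```bash
#!/usr/bin/env bash
# Sketch only: count distinct routers with a "Down" event inside a time window.
# Usage: count_affected_routers <now_epoch> <window_seconds> < events
# Assumed input, one event per line: "<epoch> <router> <state>"

count_affected_routers() {
  local now="$1" window="$2"
  awk -v now="$now" -v window="$window" '
    # Keep only Down events that are no older than the window
    $3 == "Down" && $1 >= now - window { seen[$2] = 1 }
    END {
      n = 0
      for (r in seen) n++
      print n
    }
  '
}
```

A caller could compare the printed count against the rule's limit of 5 routers, for example `(( $(count_affected_routers "$(date +%s)" 300 < events.txt) >= 5 ))`, where `events.txt` is a placeholder file name.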
LogZilla Trigger: Network Outage Response
name: "Network-Wide BGP Outage"
filter:
  - field: program
    op: eq
    value: SEC-BGP
  - field: message
    op: "=~"
    value: "BGP_NETWORK_OUTAGE"
actions:
  send_email: true
  send_email_template: |
    Subject: CRITICAL: Network-Wide BGP Outage
    Network-wide BGP outage detected affecting multiple routers.
    Host: {{event:host}}
    Message: {{event:message}}
    This indicates a potential upstream provider issue.
    Escalate to network operations immediately.
  issue_notification: true
BGP Security Monitoring
BGP Hijack Detection
Monitor for unexpected BGP route advertisements that could indicate hijacking.
SEC Rule: BGP Hijack Detection
# Monitor for unexpected route origins
type=Single
ptype=SubStr
pattern=BGP_ROUTE_EVENT
context=($ENV{EVENT_MESSAGE} =~ /(10\.|172\.|192\.168\.)/) && \
        ($ENV{EVENT_MESSAGE} !~ /AS65001/)
desc=Potential BGP hijack detected
action=eval %neighbor_ip $ENV{EVENT_USER_TAGS_SRCIP}; \
       eval %host $ENV{EVENT_HOST}; \
       shellcmd (logger -t SEC-BGP -p local0.crit \
         "BGP_HIJACK_SUSPECTED host=\"%host\" neighbor=\"%neighbor_ip\" message=\"$ENV{EVENT_MESSAGE}\"")
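The address pattern in the rule above is unanchored, so it also matches strings such as `210.` and every address beginning with `172.`, not only the private 172.16.0.0/12 range. If stricter matching is wanted, a helper script could check the RFC 1918 ranges octet by octet. The Bash sketch below shows one way to do that; the function name `is_rfc1918` is hypothetical, and calling it from SEC is left as an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: test whether an IPv4 address is in an RFC 1918 private range.
# Returns 0 for private, 1 for not private, 2 for malformed input.

is_rfc1918() {
  local ip="$1" a b c d o
  IFS=. read -r a b c d <<< "$ip"
  # Every octet must be a decimal number between 0 and 255
  for o in "$a" "$b" "$c" "$d"; do
    [[ "$o" =~ ^[0-9]+$ ]] && (( 10#$o <= 255 )) || return 2
  done
  if (( 10#$a == 10 )); then return 0; fi                                   # 10.0.0.0/8
  if (( 10#$a == 172 && 10#$b >= 16 && 10#$b <= 31 )); then return 0; fi    # 172.16.0.0/12
  if (( 10#$a == 192 && 10#$b == 168 )); then return 0; fi                  # 192.168.0.0/16
  return 1
}
```

For example, `is_rfc1918 172.32.0.1` and `is_rfc1918 210.1.1.1` both return 1, whereas the regular expression in the rule would match both.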