Application Performance Monitoring
Application Performance Monitoring Correlation
Application performance correlation detects degradation patterns, error clustering, and service dependency issues that span multiple events over time. LogZilla's pre-processing extracts performance metrics, while SEC handles complex threshold analysis and trend correlation.
Prerequisites: Ensure Event Correlation is enabled and forwarder reloading is available as shown in the Event Correlation Overview.
Response Time Degradation Detection
LogZilla Forwarder Configuration
```yaml
# /etc/logzilla/forwarder.d/app-performance.yaml
type: sec
sec_name: app-performance
rules:
  - match:
      - field: program
        op: "eq"
        value: ["nginx", "apache", "webapp", "api-gateway"]
      - field: message
        op: "=~"
        value: "response_time|duration|elapsed"
    rewrite:
      message: "APP_RESPONSE host=$HOST service=$PROGRAM"
```
SEC Rule: Response Time Degradation with Recovery
```text
# Detect application performance issues using message patterns
type=SingleWithThreshold
ptype=SubStr
pattern=APP_RESPONSE
context=($ENV{EVENT_MESSAGE} =~ /(slow|timeout|degraded|high.*latency)/)
desc=Application performance degradation detected on $ENV{EVENT_HOST}
action=eval %service $ENV{EVENT_PROGRAM}; \
    eval %host $ENV{EVENT_HOST}; \
    shellcmd (logger -t SEC-PERFORMANCE -p local0.warning \
        "PERFORMANCE_DEGRADATION service=%service host=%host"); \
    shellcmd (logger -t PERF-ALERT "Performance degradation: host=%host service=%service")
thresh=10    # 10 performance issues
window=300   # within 5 minutes
```
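SEC's `SingleWithThreshold` rule type counts matching events and fires its action once the count reaches `thresh` within a sliding `window` of seconds. A minimal shell sketch of that sliding-window decision, for building intuition about when the rule above fires (the function and its arguments are illustrative, not part of SEC):

```shell
#!/bin/bash
# Sketch of SEC's SingleWithThreshold logic: alert when at least THRESH
# matching events fall inside the last WINDOW seconds.
THRESH=10
WINDOW=300

# window_hits NOW TIMESTAMP... -> prints "ALERT count=N" or "OK count=N",
# where N is how many timestamps fall within WINDOW seconds of NOW.
window_hits() {
    local now=$1; shift
    local count=0 ts
    for ts in "$@"; do
        if (( now - ts < WINDOW )); then
            count=$((count + 1))
        fi
    done
    if (( count >= THRESH )); then
        echo "ALERT count=$count"
    else
        echo "OK count=$count"
    fi
}
```

For example, ten slow-response events in the last five minutes trip the threshold, while the same ten events spread over an hour do not.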
LogZilla Trigger: Performance Degradation Response
```yaml
name: "Application Performance Degradation"
filter:
  - field: program
    op: eq
    value: SEC-PERFORMANCE
  - field: message
    op: "=~"
    value: "PERFORMANCE_DEGRADATION"
actions:
  exec_script: true
  script_path: "/usr/local/bin/performance-degradation-response.sh"
  send_webhook: true
  send_webhook_template: |
    {
      "alert_type": "performance_degradation",
      "service": "{{event:ut:service}}",
      "endpoint": "{{event:ut:endpoint}}",
      "host": "{{event:ut:host}}",
      "avg_response_time": "{{event:ut:avg_response}}",
      "severity": "warning"
    }
```
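When the trigger fires, LogZilla expands each `{{event:ut:...}}` placeholder from the event's user tags before posting the webhook body. A rough stand-in for that expansion, useful for previewing the payload a receiver will see (the function and tag values are hypothetical; this is not LogZilla's actual template engine):

```shell
#!/bin/bash
# Hypothetical stand-in for LogZilla's webhook template expansion:
# substitute {{event:ut:NAME}} placeholders from the event's user tags.
render_webhook() {
    local service=$1 endpoint=$2 host=$3 avg=$4
    sed -e "s|{{event:ut:service}}|$service|" \
        -e "s|{{event:ut:endpoint}}|$endpoint|" \
        -e "s|{{event:ut:host}}|$host|" \
        -e "s|{{event:ut:avg_response}}|$avg|" <<'EOF'
{
  "alert_type": "performance_degradation",
  "service": "{{event:ut:service}}",
  "endpoint": "{{event:ut:endpoint}}",
  "host": "{{event:ut:host}}",
  "avg_response_time": "{{event:ut:avg_response}}",
  "severity": "warning"
}
EOF
}
```

Running `render_webhook api-gateway /checkout web01 2300` prints the JSON document with the four placeholders filled in.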
Intelligent Performance Response Script
```bash
#!/bin/bash
# /usr/local/bin/performance-degradation-response.sh

SERVICE="$EVENT_USER_TAGS_SERVICE"
ENDPOINT="$EVENT_USER_TAGS_ENDPOINT"
HOST="$EVENT_USER_TAGS_HOST"
AVG_RESPONSE="$EVENT_USER_TAGS_AVG_RESPONSE"

# Query application metrics for additional context
CPU_USAGE=$(curl -s "http://$HOST:9090/metrics" | grep "cpu_usage" | awk '{print $2}')
MEMORY_USAGE=$(curl -s "http://$HOST:9090/metrics" | grep "memory_usage" | awk '{print $2}')
ACTIVE_CONNECTIONS=$(curl -s "http://$HOST:9090/metrics" | grep "active_connections" | awk '{print $2}')

# Default to 0 when a metric is unavailable so the bc comparisons cannot fail
CPU_USAGE=${CPU_USAGE:-0}
MEMORY_USAGE=${MEMORY_USAGE:-0}
ACTIVE_CONNECTIONS=${ACTIVE_CONNECTIONS:-0}

# Determine root cause and appropriate response
if [[ $(echo "$CPU_USAGE > 80" | bc) -eq 1 ]]; then
    # High CPU - scale horizontally if possible
    logger -t PERF-RESPONSE "High CPU detected on $HOST - attempting auto-scale"
    curl -X POST "https://orchestrator.company.com/api/scale-service" \
        -d "service=$SERVICE&action=scale_out&reason=high_cpu"
elif [[ $(echo "$MEMORY_USAGE > 90" | bc) -eq 1 ]]; then
    # Memory pressure - restart service
    logger -t PERF-RESPONSE "Memory pressure on $HOST - scheduling service restart"
    curl -X POST "https://orchestrator.company.com/api/restart-service" \
        -d "service=$SERVICE&host=$HOST&reason=memory_pressure"
elif [[ $(echo "$ACTIVE_CONNECTIONS > 1000" | bc) -eq 1 ]]; then
    # Connection overload - enable rate limiting
    logger -t PERF-RESPONSE "Connection overload on $HOST - enabling rate limiting"
    curl -X POST "https://loadbalancer.company.com/api/rate-limit" \
        -d "service=$SERVICE&limit=100&duration=300"
else
    # Unknown cause - create investigation ticket
    logger -t PERF-RESPONSE "Unknown performance issue on $HOST - creating ticket"
    curl -X POST "https://servicedesk.company.com/api/tickets" \
        -d "priority=medium&subject=Performance Investigation&service=$SERVICE&host=$HOST"
fi

# Update performance dashboard
curl -X POST "https://dashboard.company.com/api/performance-events" \
    -d "service=$SERVICE&endpoint=$ENDPOINT&host=$HOST&response_time=$AVG_RESPONSE&status=degraded"
```
Error Rate Correlation
HTTP Error Clustering Detection
Correlate HTTP error patterns to detect application issues.
SEC Rule: Error Rate Spike Detection
```text
# Detect HTTP error rate spikes with error type classification
type=SingleWithThreshold
ptype=SubStr
pattern=APP_RESPONSE
context=($ENV{EVENT_USER_TAGS_HTTP_STATUS} =~ /^[45]/)
desc=HTTP error rate spike detected
action=eval %service $ENV{EVENT_PROGRAM}; \
    eval %host $ENV{EVENT_HOST}; \
    eval %error_4xx (shellcmd_output("grep 'APP_RESPONSE.*status=4' /var/log/sec/performance.log | tail -100 | wc -l")); \
    eval %error_5xx (shellcmd_output("grep 'APP_RESPONSE.*status=5' /var/log/sec/performance.log | tail -100 | wc -l")); \
    if (%error_5xx > %error_4xx) { \
        eval %error_type "server_error"; \
    } else { \
        eval %error_type "client_error"; \
    } \
    shellcmd (logger -t SEC-PERFORMANCE -p local0.alert \
        "ERROR_RATE_SPIKE service=%service host=%host error_type=%error_type 4xx_count=%error_4xx 5xx_count=%error_5xx")
thresh=50    # 50 errors
window=300   # within 5 minutes
```
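The rule labels a spike `server_error` when recent 5xx responses outnumber 4xx responses, and `client_error` otherwise. The same decision in shell, given the two counts (the helper name is illustrative):

```shell
#!/bin/bash
# Classify an error spike by comparing recent 4xx and 5xx counts,
# mirroring the SEC rule's %error_type logic: more 5xx than 4xx
# points at the server side, otherwise at the clients.
classify_errors() {
    local count_4xx=$1 count_5xx=$2
    if (( count_5xx > count_4xx )); then
        echo "server_error"
    else
        echo "client_error"
    fi
}
```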
Database Performance Correlation
Generic Application Error Correlation
Note: This example uses message pattern matching since no database parsing apps exist in LogZilla by default. To parse specific database fields, you would need to create custom parsing rules.
LogZilla Forwarder Configuration
```yaml
# /etc/logzilla/forwarder.d/app-error-correlation.yaml
type: sec
sec_name: app-errors
rules:
  - match:
      - field: program
        op: "eq"
        value: ["webapp", "api-server", "microservice"]
      - field: message
        op: "=~"
        value: "(ERROR|FATAL|Exception|timeout|slow.*query)"
    rewrite:
      message: "APP_ERROR host=$HOST service=$PROGRAM severity=$SEVERITY"
```
SEC Rule: Application Error Correlation
```text
# Correlate application errors without high-cardinality user tags
type=SingleWithThreshold
ptype=SubStr
pattern=APP_ERROR
context=($ENV{EVENT_SEVERITY} <= 3)    # Error level and above
desc=Application error spike detected on $ENV{EVENT_HOST}
action=eval %host $ENV{EVENT_HOST}; \
    eval %service $ENV{EVENT_PROGRAM}; \
    shellcmd (logger -t SEC-ALERT -p local0.alert \
        "ERROR_SPIKE_DETECTED host=%host service=%service"); \
    shellcmd (/usr/local/bin/app-error-response.sh "%host" "%service")
thresh=10    # 10 errors
window=300   # within 5 minutes
```
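The context `$ENV{EVENT_SEVERITY} <= 3` relies on syslog severity numbering, where lower numbers are more severe: 0 is emergency and 3 is error, so the rule matches error level and worse. The same check as a one-line shell predicate:

```shell
#!/bin/bash
# Succeed when a syslog severity number is error level (3) or worse.
# Syslog severities run 0=emerg, 1=alert, 2=crit, 3=err, 4=warning,
# 5=notice, 6=info, 7=debug -- lower is MORE severe.
is_error_or_worse() {
    (( $1 <= 3 ))
}
```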
Service Dependency Correlation
Cascading Service Failure Detection
Detect when upstream service failures cause downstream performance issues.
SEC Rule: Service Dependency Correlation
```text
# Create context for upstream service failures
type=Single
ptype=SubStr
pattern=PERFORMANCE_DEGRADATION
context=($ENV{EVENT_USER_TAGS_SERVICE} =~ /^(auth|payment|inventory)/)
desc=Upstream service degradation detected
action=eval %upstream_service $ENV{EVENT_USER_TAGS_SERVICE}; \
    create UPSTREAM_DEGRADED_%upstream_service 1800; \
    shellcmd (logger -t SEC-DEPENDENCY "UPSTREAM_ISSUE service=\"%upstream_service\"")

# Detect downstream impact
type=Single
ptype=SubStr
pattern=PERFORMANCE_DEGRADATION
context=(UPSTREAM_DEGRADED_auth || UPSTREAM_DEGRADED_payment || UPSTREAM_DEGRADED_inventory) && \
    ($ENV{EVENT_USER_TAGS_SERVICE} =~ /^(web|api|mobile)/)
desc=Downstream service impact from upstream degradation
action=eval %downstream_service $ENV{EVENT_USER_TAGS_SERVICE}; \
    eval %upstream_cause (if (UPSTREAM_DEGRADED_auth) { "auth"; } elsif (UPSTREAM_DEGRADED_payment) { "payment"; } else { "inventory"; }); \
    shellcmd (logger -t SEC-DEPENDENCY -p local0.warning \
        "CASCADING_FAILURE downstream=%downstream_service upstream_cause=%upstream_cause")
```
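SEC contexts such as `UPSTREAM_DEGRADED_auth` are named flags with a lifetime (1800 seconds above): the first rule sets one, and the second rule only matches while a flag is still alive. A crude shell analogue using timestamped flag files, to make the mechanism concrete (directory and function names are illustrative, not SEC internals):

```shell
#!/bin/bash
# Emulate SEC contexts with flag files holding an expiry timestamp:
# create_context sets the flag, context_active checks it has not expired.
CTX_DIR=${CTX_DIR:-/tmp/sec-contexts}
mkdir -p "$CTX_DIR"

create_context() {    # create_context NAME TTL_SECONDS
    echo "$(( $(date +%s) + $2 ))" > "$CTX_DIR/$1"
}

context_active() {    # context_active NAME -> exit 0 while still alive
    local f="$CTX_DIR/$1"
    [ -f "$f" ] && (( $(date +%s) < $(cat "$f") ))
}
```

A downstream degradation handler would call `context_active UPSTREAM_DEGRADED_auth` (and its siblings) before logging a `CASCADING_FAILURE`, just as the second rule's context expression does.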
Application Resource Correlation
Memory Leak Detection
Correlate memory usage patterns with application performance over time.
SEC Rule: Memory Leak Detection
```text
# Track memory usage trends over time
type=SingleWithScript
ptype=SubStr
pattern=MEMORY_USAGE
script=/usr/local/bin/check-memory-trend.sh $ENV{EVENT_HOST} $ENV{EVENT_USER_TAGS_MEMORY_USAGE}
desc=Memory leak detected on $ENV{EVENT_HOST}
action=eval %host $ENV{EVENT_HOST}; \
    eval %memory_usage $ENV{EVENT_USER_TAGS_MEMORY_USAGE}; \
    eval %service $ENV{EVENT_USER_TAGS_SERVICE}; \
    shellcmd (logger -t SEC-PERFORMANCE -p local0.alert \
        "MEMORY_LEAK_DETECTED host=%host service=%service memory_usage=%memory_usage")
action2=logonly    # Normal memory usage
```
Memory Trend Analysis Script
```bash
#!/bin/bash
# /usr/local/bin/check-memory-trend.sh
HOST="$1"
CURRENT_MEMORY="$2"

# Append the new sample to this host's memory-usage history
MEMORY_HISTORY="/tmp/memory_history_$HOST"
echo "$(date +%s) $CURRENT_MEMORY" >> "$MEMORY_HISTORY"

# Keep only the last hour of data
awk -v cutoff=$(($(date +%s) - 3600)) '$1 >= cutoff' "$MEMORY_HISTORY" > "${MEMORY_HISTORY}.tmp"
mv "${MEMORY_HISTORY}.tmp" "$MEMORY_HISTORY"

# Calculate memory growth rate
MEMORY_SAMPLES=$(wc -l < "$MEMORY_HISTORY")
if [[ $MEMORY_SAMPLES -ge 10 ]]; then
    # Least-squares slope of memory usage against sample number
    SLOPE=$(awk '{
        sum_x += NR; sum_y += $2; sum_xy += NR * $2; sum_x2 += NR * NR; n++
    } END {
        slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x * sum_x)
        print slope
    }' "$MEMORY_HISTORY")
    # Alert if memory grows consistently (more than 1 MiB per sample).
    # awk does the comparison because awk may print the slope in scientific
    # notation, which bc cannot parse.
    if awk -v s="$SLOPE" 'BEGIN { exit !(s > 1048576) }'; then
        exit 0  # Memory leak detected
    fi
fi
exit 1  # Normal memory usage
```
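On steadily growing samples the regression slope comes out positive; on flat usage it is near zero. A self-contained version of the slope computation is handy for experimenting with the detection threshold before deploying the script (the 1048576 cutoff treats sample values as bytes):

```shell
#!/bin/bash
# Least-squares slope of "timestamp value" samples read from stdin,
# using the sample index as x, exactly as the memory-trend script does.
slope_of() {
    awk '{
        sum_x += NR; sum_y += $2; sum_xy += NR * $2; sum_x2 += NR * NR; n++
    } END {
        print (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x * sum_x)
    }'
}
```

Feeding samples that each grow by 2 MiB yields a slope of about 2097152 bytes per sample, well over the 1 MiB cutoff, while constant samples yield a slope of 0.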