# Error Correlation Guide

This guide covers the error correlation analysis features, including release correlation, user correlation, and time-based correlation.

**⚙️ Optional Feature** - Error correlation is disabled by default. Enable it in your initializer:

```ruby
RailsErrorDashboard.configure do |config|
  config.enable_error_correlation = true
end
```

## Table of Contents

- [Overview](#overview)
- [Release Correlation](#release-correlation)
- [User Correlation](#user-correlation)
- [Time-Based Correlation](#time-based-correlation)
- [Period Comparison](#period-comparison)
- [Use Cases](#use-cases)
- [Configuration](#configuration)
- [Best Practices](#best-practices)

## Overview

Error correlation helps you answer critical questions:
- **Release Impact**: Did this version introduce more errors?
- **User Impact**: Are the same users experiencing multiple issues?
- **Temporal Patterns**: Do certain errors occur together?
- **Trend Analysis**: Are errors increasing or decreasing?

### Why Correlation Matters

Understanding correlations enables:
- **Release Quality**: Identify problematic releases quickly
- **User Experience**: Find users most impacted by errors
- **Root Cause Analysis**: Discover hidden relationships between errors
- **Proactive Fixes**: Address issues before they escalate

## Release Correlation

### What is Release Correlation?

Release correlation links errors to specific app versions or git commits, helping you identify which releases introduced errors.

### Errors by App Version

Track errors per version to spot problematic releases:

```ruby
correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)
versions = correlation.errors_by_version

versions.each do |version, data|
  puts "Version: #{version}"
  puts "  Total Errors: #{data[:count]}"
  puts "  Error Types: #{data[:error_types]}"
  puts "  Critical Errors: #{data[:critical_count]}"
  puts "  Platforms: #{data[:platforms].join(', ')}"
  puts "  First Seen: #{data[:first_seen]}"
  puts "  Last Seen: #{data[:last_seen]}"
end
```

**Output**:
```text
Version: 2.1.0
  Total Errors: 450
  Error Types: 23
  Critical Errors: 12
  Platforms: iOS, Android
  First Seen: 2025-12-20 09:00:00
  Last Seen: 2025-12-25 14:30:00

Version: 2.0.5
  Total Errors: 120
  Error Types: 8
  Critical Errors: 2
  Platforms: iOS, Android
  First Seen: 2025-12-15 10:00:00
  Last Seen: 2025-12-24 16:00:00
```

**Metrics Explained**:
- **Total Errors**: Sum of all error occurrences for this version
- **Error Types**: Number of distinct error types (e.g., NoMethodError, ArgumentError)
- **Critical Errors**: Count of high-severity errors
- **Platforms**: Which platforms are running this version
- **First/Last Seen**: Time range for this version's errors

### Errors by Git SHA

For more granular tracking, correlate errors with specific commits:

```ruby
shas = correlation.errors_by_git_sha

shas.each do |sha, data|
  puts "Commit: #{sha[0..7]}"
  puts "  Total Errors: #{data[:count]}"
  puts "  Error Types: #{data[:error_types]}"
  puts "  App Versions: #{data[:app_versions].join(', ')}"
end
```

**Use Case**: Identify the exact commit that introduced a bug.

**Example**:
```text
Commit: a3b4c5d6
  Total Errors: 280
  Error Types: 15
  App Versions: 2.1.0, 2.1.1

Commit: e7f8g9h0
  Total Errors: 45
  Error Types: 5
  App Versions: 2.1.0
```

**Interpretation**: Commit `a3b4c5d6` is associated with significantly more errors → likely contains a bug.

### Problematic Releases

Automatically identify releases with abnormally high error rates:

```ruby
problematic = correlation.problematic_releases

problematic.each do |release|
  puts "⚠️  Version #{release[:version]}"
  puts "   Errors: #{release[:error_count]}"
  puts "   Deviation: +#{release[:deviation_from_avg]}% from average"
  puts "   Critical: #{release[:critical_count]}"
  puts "   Error Types: #{release[:error_types]}"
end
```

**Algorithm**:
1. Calculate average errors per version
2. Flag versions with >2x the average
3. Sort by error count (worst first)

**Output**:
```text
⚠️  Version 2.1.0
   Errors: 450
   Deviation: +275% from average
   Critical: 12
   Error Types: 23
```

**Interpretation**: Version 2.1.0 has 275% more errors than average → **rollback or hotfix urgently**.

### Configuration Requirements

To enable release correlation, errors must include version metadata:

```ruby
# When logging errors
RailsErrorDashboard::Commands::LogError.call(
  error_type: "NoMethodError",
  message: "undefined method 'name'",
  backtrace: [...],
  app_version: "2.1.0",        # Required for version correlation
  git_sha: "a3b4c5d6e7f8",     # Optional, for commit correlation
  # ...
)
```

**Database Columns**:
- `app_version` (string) - Semantic version (e.g., "2.1.0")
- `git_sha` (string) - Full git commit hash

If these columns are missing, correlation queries return empty results.

## User Correlation

### What is User Correlation?

User correlation identifies users experiencing multiple error types, helping you understand user impact and prioritize fixes.

### Multi-Error Users

Find users affected by multiple different error types:

```ruby
correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)
multi_error_users = correlation.multi_error_users(min_error_types: 2)

multi_error_users.each do |user_data|
  puts "User: #{user_data[:user_email]}"
  puts "  Error Types: #{user_data[:error_type_count]}"
  puts "  Total Errors: #{user_data[:total_errors]}"
  puts "  Types: #{user_data[:error_types].join(', ')}"
end
```

**Output**:
```text
User: user@example.com
  Error Types: 5
  Total Errors: 23
  Types: NoMethodError, ArgumentError, NetworkError, TimeoutError, ValidationError

User: another@example.com
  Error Types: 3
  Total Errors: 12
  Types: PaymentError, DatabaseError, CacheError
```

**Interpretation**:
- `user@example.com` experienced 5 different error types → **severely impacted**
- They hit 23 total errors → **frustrating experience**
- Reach out to this user, investigate their usage patterns

**Parameters**:
- `min_error_types` (default: 2) - Minimum distinct error types to include

### Error Type User Overlap

Calculate how many users experience two specific error types:

```ruby
overlap = correlation.error_type_user_overlap(
  "NoMethodError",
  "ArgumentError"
)

puts "Users with NoMethodError: #{overlap[:users_a_count]}"
puts "Users with ArgumentError: #{overlap[:users_b_count]}"
puts "Users with both: #{overlap[:overlap_count]}"
puts "Overlap percentage: #{overlap[:overlap_percentage]}%"
puts "Sample users: #{overlap[:overlapping_user_ids].join(', ')}"
```

**Output**:
```text
Users with NoMethodError: 150
Users with ArgumentError: 120
Users with both: 80
Overlap percentage: 66.7%
Sample users: 45, 67, 89, 102, 134, 156, 178, 190, 203, 245
```

**Interpretation**:
- 66.7% of users with ArgumentError also get NoMethodError
- High overlap suggests **related errors** or **common user path**
- Fixing NoMethodError may also reduce ArgumentError

**Use Cases**:
- Identify cascading errors (one error causes another)
- Find common user workflows triggering multiple errors
- Prioritize fixes with highest user impact

### Configuration Requirements

User correlation requires:
- `user_id` column in error_logs table
- User model with `email` attribute (configurable)

```ruby
# config/initializers/rails_error_dashboard.rb
RailsErrorDashboard.configure do |config|
  config.user_model = "User"  # Your user model class name
end

# When logging errors
RailsErrorDashboard::Commands::LogError.call(
  error_type: "NoMethodError",
  user_id: current_user.id,  # Required for user correlation
  # ...
)
```

## Time-Based Correlation

### What is Time-Based Correlation?

Time-based correlation finds error types that occur at similar times of day, suggesting shared root causes or triggers.

### Analyzing Time Correlation

```ruby
correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)
time_correlated = correlation.time_correlated_errors

time_correlated.each do |pair_name, data|
  puts "#{data[:error_type_a]} ↔ #{data[:error_type_b]}"
  puts "  Correlation: #{(data[:correlation] * 100).round}%"
  puts "  Strength: #{data[:strength]}"
end
```

**Output**:
```text
NoMethodError ↔ DatabaseError
  Correlation: 85%
  Strength: strong

TimeoutError ↔ NetworkError
  Correlation: 92%
  Strength: strong

PaymentError ↔ ValidationError
  Correlation: 62%
  Strength: moderate
```

**Interpretation**:
- NoMethodError and DatabaseError occur at similar hours (85% correlation)
- Likely **shared root cause**: database issues trigger NoMethodError
- TimeoutError and NetworkError also correlate (92%)
- Both probably caused by **network infrastructure issues**

### Correlation Algorithm

Uses **Pearson correlation coefficient** to measure similarity:

```ruby
def calculate_time_correlation(series_a, series_b)
  # series_a and series_b are arrays of 24 elements (hourly counts)

# Calculate means
  mean_a = series_a.sum.to_f / 24
  mean_b = series_b.sum.to_f / 24

# Calculate covariance and standard deviations
  covariance = 0.0
  std_a = 0.0
  std_b = 0.0

24.times do |i|
    diff_a = series_a[i] - mean_a
    diff_b = series_b[i] - mean_b
    covariance += diff_a * diff_b
    std_a += diff_a**2
    std_b += diff_b**2
  end

# Pearson correlation
  correlation = covariance / Math.sqrt(std_a * std_b)
  correlation.round(3)
end
```

**Correlation Values**:
- **1.0** = Perfect positive correlation (always occur together)
- **0.5-1.0** = Strong correlation (often occur together)
- **0.0-0.5** = Weak correlation (sometimes occur together)
- **-1.0-0.0** = Negative correlation (occur at opposite times)

**Strength Classification**:
- **Strong**: correlation >= 0.8
- **Moderate**: correlation >= 0.5
- **Weak**: correlation < 0.5 (filtered out by default)

### Use Cases for Time Correlation

**1. Identify Cascading Failures**:
```text
DatabaseError ↔ CacheError (90% correlation)
→ Database issues cause cache to fail
→ Fix database to resolve both
```

**2. Find Shared Dependencies**:
```text
PaymentError ↔ ExternalAPIError (85% correlation)
→ Both depend on third-party payment API
→ Add retry logic for API timeouts
```

**3. Detect Infrastructure Issues**:
```text
All errors spike at 2 AM (high correlation)
→ Nightly backup job causing resource contention
→ Reschedule backup or scale infrastructure
```

## Period Comparison

### What is Period Comparison?

Period comparison shows how error rates changed between two time periods, helping identify trends.

### Comparing Periods

```ruby
correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)
comparison = correlation.period_comparison

puts "Current Period (last #{days/2} days):"
puts "  Errors: #{comparison[:current_period][:count]}"
puts "  #{comparison[:current_period][:start]} to #{comparison[:current_period][:end]}"

puts "Previous Period (#{days/2} days before):"
puts "  Errors: #{comparison[:previous_period][:count]}"
puts "  #{comparison[:previous_period][:start]} to #{comparison[:previous_period][:end]}"

puts "Change: #{comparison[:change]} (#{comparison[:change_percentage]}%)"
puts "Trend: #{comparison[:trend]}"
```

**Output**:
```text
Current Period (last 15 days):
  Errors: 1850
  2025-12-10 to 2025-12-25

Previous Period (15 days before):
  Errors: 1200
  2025-11-25 to 2025-12-09

Change: +650 (+54.2%)
Trend: increasing_significantly
```

### Trend Classification

Trends are classified based on percentage change:

| Trend | Change | Meaning |
|-------|--------|---------|
| **decreasing_significantly** | < -20% | Major improvement |
| **decreasing** | -20% to -5% | Moderate improvement |
| **stable** | -5% to +5% | No significant change |
| **increasing** | +5% to +20% | Moderate concern |
| **increasing_significantly** | > +20% | Major concern |

**Use Cases**:
- **Post-release monitoring**: Check if errors increased after deployment
- **Improvement tracking**: Verify that fixes are working
- **Capacity planning**: Identify long-term growth trends

## Use Cases

### Scenario 1: Post-Release Validation

**Question**: "Did v2.1.0 introduce new errors?"

**Analysis**:
```ruby
correlation = ErrorCorrelation.new(days: 7)
versions = correlation.errors_by_version

v210 = versions["2.1.0"]
v205 = versions["2.0.5"]

puts "v2.1.0: #{v210[:count]} errors, #{v210[:error_types]} types"
puts "v2.0.5: #{v205[:count]} errors, #{v205[:error_types]} types"
puts "Increase: +#{v210[:count] - v205[:count]} errors"
```

**Interpretation**:
```text
v2.1.0: 450 errors, 23 types
v2.0.5: 120 errors, 8 types
Increase: +330 errors
```

**Conclusion**: v2.1.0 introduced 330 more errors and 15 new error types → **rollback or hotfix immediately**.

### Scenario 2: Identifying Power Users vs Affected Users

**Question**: "Who should we reach out to?"

**Analysis**:
```ruby
multi_error_users = correlation.multi_error_users(min_error_types: 3)

multi_error_users.first(5).each do |user|
  puts "#{user[:user_email]}: #{user[:error_type_count]} types, #{user[:total_errors]} errors"
end
```

**Output**:
```text
power_user@example.com: 7 types, 45 errors
affected_user@example.com: 5 types, 23 errors
casual_user@example.com: 3 types, 8 errors
```

**Action**:
1. Reach out to `power_user@example.com` (most impacted)
2. Offer compensation or support
3. Gather feedback on their experience
4. Investigate their usage patterns

### Scenario 3: Root Cause via Time Correlation

**Question**: "Why do these errors always occur together?"

**Analysis**:
```ruby
time_correlated = correlation.time_correlated_errors

# Find errors correlated with DatabaseError
database_correlated = time_correlated.select do |_, data|
  data[:error_type_a] == "DatabaseError" || data[:error_type_b] == "DatabaseError"
end

database_correlated.each do |_, data|
  other_error = data[:error_type_a] == "DatabaseError" ? data[:error_type_b] : data[:error_type_a]
  puts "#{other_error}: #{(data[:correlation] * 100).round}% correlation"
end
```

**Output**:
```text
CacheError: 90% correlation
NoMethodError: 85% correlation
TimeoutError: 78% correlation
```

**Conclusion**:
- DatabaseError triggers CacheError (cache depends on database)
- DatabaseError causes NoMethodError (missing data = nil.method)
- DatabaseError leads to TimeoutError (retry logic)

**Action**: **Fix DatabaseError** to resolve all correlated errors.

### Scenario 4: Trend Monitoring

**Question**: "Are we improving or getting worse?"

**Analysis**:
```ruby
# Weekly checks
week1 = ErrorCorrelation.new(days: 7).period_comparison
week2 = ErrorCorrelation.new(days: 14).period_comparison
month = ErrorCorrelation.new(days: 30).period_comparison

puts "This week: #{week1[:trend]} (#{week1[:change_percentage]}%)"
puts "Last 2 weeks: #{week2[:trend]} (#{week2[:change_percentage]}%)"
puts "Last month: #{month[:trend]} (#{month[:change_percentage]}%)"
```

**Output**:
```text
This week: decreasing (-12%)
Last 2 weeks: decreasing (-8%)
Last month: stable (+2%)
```

**Interpretation**: Recent fixes are working (decreasing trend), but long-term trend is stable.

### Scenario 5: A/B Test Impact

**Question**: "Did the new feature increase errors?"

**Analysis**:
```ruby
# Assume feature deployed in v2.1.0
correlation = ErrorCorrelation.new(days: 14)

# Users on new version
v210_errors = correlation.errors_by_version["2.1.0"]

# Users still on old version
v205_errors = correlation.errors_by_version["2.0.5"]

puts "New version (v2.1.0): #{v210_errors[:count]} errors"
puts "Old version (v2.0.5): #{v205_errors[:count]} errors"
```

**Interpretation**:
- If v2.1.0 has significantly more errors → feature has bugs
- If similar → feature is stable
- Compare error types to see if new errors introduced

## Configuration

### Required Columns

For full correlation analysis, ensure these columns exist:

```ruby
# db/migrate/XXXXXX_add_correlation_columns.rb
class AddCorrelationColumns < ActiveRecord::Migration[7.0]
  def change
    add_column :rails_error_dashboard_error_logs, :app_version, :string
    add_column :rails_error_dashboard_error_logs, :git_sha, :string
    add_column :rails_error_dashboard_error_logs, :user_id, :integer

add_index :rails_error_dashboard_error_logs, :app_version
    add_index :rails_error_dashboard_error_logs, :git_sha
    add_index :rails_error_dashboard_error_logs, :user_id
  end
end
```

### User Model Configuration

```ruby
# config/initializers/rails_error_dashboard.rb
RailsErrorDashboard.configure do |config|
  config.user_model = "User"  # Your user model class
end
```

### Logging Errors with Correlation Data

```ruby
# In your error handler
RailsErrorDashboard::Commands::LogError.call(
  error_type: error.class.name,
  message: error.message,
  backtrace: error.backtrace,

# Release correlation
  app_version: ENV['APP_VERSION'] || "unknown",
  git_sha: ENV['GIT_SHA'] || `git rev-parse HEAD`.strip,

# User correlation
  user_id: current_user&.id,

# Other metadata
  platform: request.user_agent =~ /iPhone/ ? "iOS" : "Android",
  # ...
)
```

## Best Practices

### 1. Track App Version Consistently

**Use semantic versioning**:
```text
✓ Good: "2.1.0", "2.1.1", "2.2.0"
✗ Bad: "v2.1", "2.1-beta", "latest"
```

**Set in environment**:
```ruby
# config/application.rb
config.app_version = ENV['APP_VERSION'] || '1.0.0'

# Or read from package.json, build number, etc.
```

### 2. Monitor Problematic Releases Daily

**Set up daily alerts**:
```ruby
# In a scheduled job
correlation = ErrorCorrelation.new(days: 1)
problematic = correlation.problematic_releases

if problematic.any?
  AlertService.notify(
    title: "Problematic releases detected",
    releases: problematic.map { |r| "#{r[:version]} (+#{r[:deviation_from_avg]}%)" }
  )
end
```

### 3. Investigate Multi-Error Users

**Weekly review**:
```ruby
# In a scheduled report
multi_error_users = correlation.multi_error_users(min_error_types: 3)
top_10 = multi_error_users.first(10)

CsTeam.send_report(
  title: "Top 10 Most Impacted Users",
  users: top_10
)
```

**Action**:
- Reach out to these users
- Offer support or compensation
- Investigate common patterns

### 4. Use Time Correlation for Root Cause

**When stuck on root cause**:
1. Find all errors correlated with the mystery error
2. Identify the common dependency
3. That's likely your root cause

**Example**:
```ruby
correlated = correlation.time_correlated_errors
mystery_error_corr = correlated.select do |_, data|
  data[:error_type_a] == "MysteryError" || data[:error_type_b] == "MysteryError"
end

# If all correlated errors involve ExternalAPI
# → MysteryError is caused by ExternalAPI issues
```

### 5. Compare Periods After Fixes

**After deploying a fix**:
```ruby
# Wait 24 hours, then check
before_fix = ErrorCorrelation.new(days: 14).period_comparison
after_fix = ErrorCorrelation.new(days: 7).period_comparison

if after_fix[:trend] == :decreasing
  puts "Fix is working! Errors down #{after_fix[:change_percentage]}%"
else
  puts "Fix didn't help. Keep investigating."
end
```

### 6. Document Version History

**Maintain a version log**:
```text
v2.1.0 (2025-12-20):
  - Added new payment flow
  - Errors: 450 total, 12 critical
  - Status: ⚠️  Problematic, hotfix planned

v2.0.5 (2025-12-15):
  - Fixed checkout bug
  - Errors: 120 total, 2 critical
  - Status: ✓ Stable

v2.0.0 (2025-12-01):
  - Major redesign
  - Errors: 800 total, 45 critical
  - Status: ⚠️  Problematic, rolled back
```

## Troubleshooting

### "No version data available"

**Cause**: `app_version` column missing or always null

**Fix**:
```ruby
# Add column
rails g migration add_app_version_to_error_logs app_version:string
rails db:migrate

# Ensure errors include version
RailsErrorDashboard::Commands::LogError.call(
  app_version: ENV['APP_VERSION'],  # Must be set!
  # ...
)
```

### "Multi-error users showing strange results"

**Cause**: User model not found or email missing

**Fix**:
```ruby
# Verify configuration
RailsErrorDashboard.configuration.user_model
# => "User"

# Check if User model has email
User.column_names.include?('email')
# => true

# If using different attribute:
# Update find_user_email method in error_correlation.rb
```

### "Time correlation shows no results"

**Cause**: Not enough error types (need at least 2)

**Fix**:
```ruby
# Check distinct error types
ErrorLog.distinct.pluck(:error_type).count
# If < 2, wait for more data

# Lower correlation threshold to see weaker correlations
# (requires code change to min correlation threshold)
```

### "Period comparison shows unexpected trend"

**Cause**: Time period split doesn't align with deployment

**Fix**: Use custom date ranges instead of automatic split:
```ruby
# Instead of using days parameter (which splits in half)
# Query specific periods:
current = ErrorLog.where("occurred_at >= ?", deployment_date).count
previous = ErrorLog.where("occurred_at < ?", deployment_date).where("occurred_at >= ?", deployment_date - 7.days).count
```

## API Reference

### ErrorCorrelation Query Object

```ruby
correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)

# Release correlation
correlation.errors_by_version
# => { "2.1.0" => { count: 450, error_types: 23, ... }, ... }

correlation.errors_by_git_sha
# => { "a3b4c5d6" => { count: 280, error_types: 15, ... }, ... }

correlation.problematic_releases
# => [{ version: "2.1.0", error_count: 450, deviation_from_avg: 275.0, ... }]

# User correlation
correlation.multi_error_users(min_error_types: 2)
# => [{ user_id: 45, user_email: "...", error_types: [...], total_errors: 23 }]

correlation.error_type_user_overlap("NoMethodError", "ArgumentError")
# => { users_a_count: 150, users_b_count: 120, overlap_count: 80, overlap_percentage: 66.7 }

# Time correlation
correlation.time_correlated_errors
# => { "NoMethodError <-> DatabaseError" => { correlation: 0.85, strength: :strong } }

# Period comparison
correlation.period_comparison
# => { current_period: {...}, previous_period: {...}, change: 650, change_percentage: 54.2, trend: :increasing_significantly }

# Platform-specific errors
correlation.platform_specific_errors
# => { "iOS" => [{ error_type: "...", count: 120, platform_specific: true }] }
```

## Further Reading

- [Advanced Error Grouping Guide](/rails_error_dashboard/docs/features/advanced-error-grouping/) - Finding related errors
- [Baseline Monitoring Guide](/rails_error_dashboard/docs/features/baseline-monitoring/) - Statistical anomaly detection
- [Platform Comparison Guide](/rails_error_dashboard/docs/features/platform-comparison/) - Platform health analysis
- [Occurrence Patterns Guide](/rails_error_dashboard/docs/features/occurrence-patterns/) - Temporal pattern detection

Error Correlation Guide

This guide covers the error correlation analysis features, including release correlation, user correlation, and time-based correlation.

⚙️ Optional Feature - Error correlation is disabled by default. Enable it in your initializer:

RailsErrorDashboard.configure do |config|
  config.enable_error_correlation = true
end

Overview
Release Correlation
User Correlation
Time-Based Correlation
Period Comparison
Use Cases
Configuration
Best Practices

Overview

Error correlation helps you answer critical questions:

Release Impact: Did this version introduce more errors?
User Impact: Are the same users experiencing multiple issues?
Temporal Patterns: Do certain errors occur together?
Trend Analysis: Are errors increasing or decreasing?

Why Correlation Matters

Understanding correlations enables:

Release Quality: Identify problematic releases quickly
User Experience: Find users most impacted by errors
Root Cause Analysis: Discover hidden relationships between errors
Proactive Fixes: Address issues before they escalate

Release Correlation

What is Release Correlation?

Release correlation links errors to specific app versions or git commits, helping you identify which releases introduced errors.

Errors by App Version

Track errors per version to spot problematic releases:

correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)
versions = correlation.errors_by_version

versions.each do |version, data|
  puts "Version: #{version}"
  puts "  Total Errors: #{data[:count]}"
  puts "  Error Types: #{data[:error_types]}"
  puts "  Critical Errors: #{data[:critical_count]}"
  puts "  Platforms: #{data[:platforms].join(', ')}"
  puts "  First Seen: #{data[:first_seen]}"
  puts "  Last Seen: #{data[:last_seen]}"
end

Output:

Version: 2.1.0
  Total Errors: 450
  Error Types: 23
  Critical Errors: 12
  Platforms: iOS, Android
  First Seen: 2025-12-20 09:00:00
  Last Seen: 2025-12-25 14:30:00

Version: 2.0.5
  Total Errors: 120
  Error Types: 8
  Critical Errors: 2
  Platforms: iOS, Android
  First Seen: 2025-12-15 10:00:00
  Last Seen: 2025-12-24 16:00:00

Metrics Explained:

Total Errors: Sum of all error occurrences for this version
Error Types: Number of distinct error types (e.g., NoMethodError, ArgumentError)
Critical Errors: Count of high-severity errors
Platforms: Which platforms are running this version
First/Last Seen: Time range for this version’s errors

Errors by Git SHA

For more granular tracking, correlate errors with specific commits:

shas = correlation.errors_by_git_sha

shas.each do |sha, data|
  puts "Commit: #{sha[0..7]}"
  puts "  Total Errors: #{data[:count]}"
  puts "  Error Types: #{data[:error_types]}"
  puts "  App Versions: #{data[:app_versions].join(', ')}"
end

Use Case: Identify the exact commit that introduced a bug.

Example:

Commit: a3b4c5d6
  Total Errors: 280
  Error Types: 15
  App Versions: 2.1.0, 2.1.1

Commit: e7f8g9h0
  Total Errors: 45
  Error Types: 5
  App Versions: 2.1.0

Interpretation: Commit a3b4c5d6 is associated with significantly more errors → likely contains a bug.

Problematic Releases

Automatically identify releases with abnormally high error rates:

problematic = correlation.problematic_releases

problematic.each do |release|
  puts "⚠️  Version #{release[:version]}"
  puts "   Errors: #{release[:error_count]}"
  puts "   Deviation: +#{release[:deviation_from_avg]}% from average"
  puts "   Critical: #{release[:critical_count]}"
  puts "   Error Types: #{release[:error_types]}"
end

Algorithm:

Calculate average errors per version
Flag versions with >2x the average
Sort by error count (worst first)

Output:

⚠️  Version 2.1.0
   Errors: 450
   Deviation: +275% from average
   Critical: 12
   Error Types: 23

Interpretation: Version 2.1.0 has 275% more errors than average → rollback or hotfix urgently.

Configuration Requirements

To enable release correlation, errors must include version metadata:

# When logging errors
RailsErrorDashboard::Commands::LogError.call(
  error_type: "NoMethodError",
  message: "undefined method 'name'",
  backtrace: [...],
  app_version: "2.1.0",        # Required for version correlation
  git_sha: "a3b4c5d6e7f8",     # Optional, for commit correlation
  # ...
)

Database Columns:

app_version (string) - Semantic version (e.g., “2.1.0”)
git_sha (string) - Full git commit hash

If these columns are missing, correlation queries return empty results.

User Correlation

What is User Correlation?

User correlation identifies users experiencing multiple error types, helping you understand user impact and prioritize fixes.

Multi-Error Users

Find users affected by multiple different error types:

correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)
multi_error_users = correlation.multi_error_users(min_error_types: 2)

multi_error_users.each do |user_data|
  puts "User: #{user_data[:user_email]}"
  puts "  Error Types: #{user_data[:error_type_count]}"
  puts "  Total Errors: #{user_data[:total_errors]}"
  puts "  Types: #{user_data[:error_types].join(', ')}"
end

Output:

User: user@example.com
  Error Types: 5
  Total Errors: 23
  Types: NoMethodError, ArgumentError, NetworkError, TimeoutError, ValidationError

User: another@example.com
  Error Types: 3
  Total Errors: 12
  Types: PaymentError, DatabaseError, CacheError

Interpretation:

user@example.com experienced 5 different error types → severely impacted
They hit 23 total errors → frustrating experience
Reach out to this user, investigate their usage patterns

Parameters:

min_error_types (default: 2) - Minimum distinct error types to include

Error Type User Overlap

Calculate how many users experience two specific error types:

overlap = correlation.error_type_user_overlap(
  "NoMethodError",
  "ArgumentError"
)

puts "Users with NoMethodError: #{overlap[:users_a_count]}"
puts "Users with ArgumentError: #{overlap[:users_b_count]}"
puts "Users with both: #{overlap[:overlap_count]}"
puts "Overlap percentage: #{overlap[:overlap_percentage]}%"
puts "Sample users: #{overlap[:overlapping_user_ids].join(', ')}"

Output:

Users with NoMethodError: 150
Users with ArgumentError: 120
Users with both: 80
Overlap percentage: 66.7%
Sample users: 45, 67, 89, 102, 134, 156, 178, 190, 203, 245

Interpretation:

66.7% of users with ArgumentError also get NoMethodError
High overlap suggests related errors or common user path
Fixing NoMethodError may also reduce ArgumentError

Use Cases:

Identify cascading errors (one error causes another)
Find common user workflows triggering multiple errors
Prioritize fixes with highest user impact

Configuration Requirements

User correlation requires:

user_id column in error_logs table
User model with email attribute (configurable)

# config/initializers/rails_error_dashboard.rb
RailsErrorDashboard.configure do |config|
  config.user_model = "User"  # Your user model class name
end

# When logging errors
RailsErrorDashboard::Commands::LogError.call(
  error_type: "NoMethodError",
  user_id: current_user.id,  # Required for user correlation
  # ...
)

Time-Based Correlation

What is Time-Based Correlation?

Time-based correlation finds error types that occur at similar times of day, suggesting shared root causes or triggers.

Analyzing Time Correlation

correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)
time_correlated = correlation.time_correlated_errors

time_correlated.each do |pair_name, data|
  puts "#{data[:error_type_a]} ↔ #{data[:error_type_b]}"
  puts "  Correlation: #{(data[:correlation] * 100).round}%"
  puts "  Strength: #{data[:strength]}"
end

Output:

NoMethodError ↔ DatabaseError
  Correlation: 85%
  Strength: strong

TimeoutError ↔ NetworkError
  Correlation: 92%
  Strength: strong

PaymentError ↔ ValidationError
  Correlation: 62%
  Strength: moderate

Interpretation:

NoMethodError and DatabaseError occur at similar hours (85% correlation)
Likely shared root cause: database issues trigger NoMethodError
TimeoutError and NetworkError also correlate (92%)
Both probably caused by network infrastructure issues

Correlation Algorithm

Uses Pearson correlation coefficient to measure similarity:

def calculate_time_correlation(series_a, series_b)
  # series_a and series_b are arrays of 24 elements (hourly counts)

  # Calculate means
  mean_a = series_a.sum.to_f / 24
  mean_b = series_b.sum.to_f / 24

  # Calculate covariance and standard deviations
  covariance = 0.0
  std_a = 0.0
  std_b = 0.0

  24.times do |i|
    diff_a = series_a[i] - mean_a
    diff_b = series_b[i] - mean_b
    covariance += diff_a * diff_b
    std_a += diff_a**2
    std_b += diff_b**2
  end

  # Pearson correlation
  correlation = covariance / Math.sqrt(std_a * std_b)
  correlation.round(3)
end

Correlation Values:

1.0 = Perfect positive correlation (always occur together)
0.5-1.0 = Strong correlation (often occur together)
0.0-0.5 = Weak correlation (sometimes occur together)
-1.0-0.0 = Negative correlation (occur at opposite times)

Strength Classification:

Strong: correlation >= 0.8
Moderate: correlation >= 0.5
Weak: correlation < 0.5 (filtered out by default)

Use Cases for Time Correlation

1. Identify Cascading Failures:

DatabaseError ↔ CacheError (90% correlation)
→ Database issues cause cache to fail
→ Fix database to resolve both

2. Find Shared Dependencies:

PaymentError ↔ ExternalAPIError (85% correlation)
→ Both depend on third-party payment API
→ Add retry logic for API timeouts

3. Detect Infrastructure Issues:

All errors spike at 2 AM (high correlation)
→ Nightly backup job causing resource contention
→ Reschedule backup or scale infrastructure

Period Comparison

What is Period Comparison?

Period comparison shows how error rates changed between two time periods, helping identify trends.

Comparing Periods

correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)
comparison = correlation.period_comparison

puts "Current Period (last #{days/2} days):"
puts "  Errors: #{comparison[:current_period][:count]}"
puts "  #{comparison[:current_period][:start]} to #{comparison[:current_period][:end]}"

puts "Previous Period (#{days/2} days before):"
puts "  Errors: #{comparison[:previous_period][:count]}"
puts "  #{comparison[:previous_period][:start]} to #{comparison[:previous_period][:end]}"

puts "Change: #{comparison[:change]} (#{comparison[:change_percentage]}%)"
puts "Trend: #{comparison[:trend]}"

Output:

Current Period (last 15 days):
  Errors: 1850
  2025-12-10 to 2025-12-25

Previous Period (15 days before):
  Errors: 1200
  2025-11-25 to 2025-12-09

Change: +650 (+54.2%)
Trend: increasing_significantly

Trend Classification

Trends are classified based on percentage change:

Trend	Change	Meaning
decreasing_significantly	< -20%	Major improvement
decreasing	-20% to -5%	Moderate improvement
stable	-5% to +5%	No significant change
increasing	+5% to +20%	Moderate concern
increasing_significantly	> +20%	Major concern

Use Cases:

Post-release monitoring: Check if errors increased after deployment
Improvement tracking: Verify that fixes are working
Capacity planning: Identify long-term growth trends

Use Cases

Scenario 1: Post-Release Validation

Question: “Did v2.1.0 introduce new errors?”

Analysis:

correlation = ErrorCorrelation.new(days: 7)
versions = correlation.errors_by_version

v210 = versions["2.1.0"]
v205 = versions["2.0.5"]

puts "v2.1.0: #{v210[:count]} errors, #{v210[:error_types]} types"
puts "v2.0.5: #{v205[:count]} errors, #{v205[:error_types]} types"
puts "Increase: +#{v210[:count] - v205[:count]} errors"

Interpretation:

v2.1.0: 450 errors, 23 types
v2.0.5: 120 errors, 8 types
Increase: +330 errors

Conclusion: v2.1.0 introduced 330 more errors and 15 new error types → rollback or hotfix immediately.

Scenario 2: Identifying Power Users vs Affected Users

Question: “Who should we reach out to?”

Analysis:

multi_error_users = correlation.multi_error_users(min_error_types: 3)

multi_error_users.first(5).each do |user|
  puts "#{user[:user_email]}: #{user[:error_type_count]} types, #{user[:total_errors]} errors"
end

Output:

power_user@example.com: 7 types, 45 errors
affected_user@example.com: 5 types, 23 errors
casual_user@example.com: 3 types, 8 errors

Action:

Reach out to power_user@example.com (most impacted)
Offer compensation or support
Gather feedback on their experience
Investigate their usage patterns

Scenario 3: Root Cause via Time Correlation

Question: “Why do these errors always occur together?”

Analysis:

time_correlated = correlation.time_correlated_errors

# Find errors correlated with DatabaseError
database_correlated = time_correlated.select do |_, data|
  data[:error_type_a] == "DatabaseError" || data[:error_type_b] == "DatabaseError"
end

database_correlated.each do |_, data|
  other_error = data[:error_type_a] == "DatabaseError" ? data[:error_type_b] : data[:error_type_a]
  puts "#{other_error}: #{(data[:correlation] * 100).round}% correlation"
end

Output:

CacheError: 90% correlation
NoMethodError: 85% correlation
TimeoutError: 78% correlation

Conclusion:

DatabaseError triggers CacheError (cache depends on database)
DatabaseError causes NoMethodError (missing data = nil.method)
DatabaseError leads to TimeoutError (retry logic)

Action: Fix DatabaseError to resolve all correlated errors.

Scenario 4: Trend Monitoring

Question: “Are we improving or getting worse?”

Analysis:

# Weekly checks
week1 = ErrorCorrelation.new(days: 7).period_comparison
week2 = ErrorCorrelation.new(days: 14).period_comparison
month = ErrorCorrelation.new(days: 30).period_comparison

puts "This week: #{week1[:trend]} (#{week1[:change_percentage]}%)"
puts "Last 2 weeks: #{week2[:trend]} (#{week2[:change_percentage]}%)"
puts "Last month: #{month[:trend]} (#{month[:change_percentage]}%)"

Output:

This week: decreasing (-12%)
Last 2 weeks: decreasing (-8%)
Last month: stable (+2%)

Interpretation: Recent fixes are working (decreasing trend), but long-term trend is stable.

Scenario 5: A/B Test Impact

Question: “Did the new feature increase errors?”

Analysis:

# Assume feature deployed in v2.1.0
correlation = ErrorCorrelation.new(days: 14)

# Users on new version
v210_errors = correlation.errors_by_version["2.1.0"]

# Users still on old version
v205_errors = correlation.errors_by_version["2.0.5"]

puts "New version (v2.1.0): #{v210_errors[:count]} errors"
puts "Old version (v2.0.5): #{v205_errors[:count]} errors"

Interpretation:

If v2.1.0 has significantly more errors → feature has bugs
If similar → feature is stable
Compare error types to see if new errors introduced

Configuration

Required Columns

For full correlation analysis, ensure these columns exist:

# db/migrate/XXXXXX_add_correlation_columns.rb
class AddCorrelationColumns < ActiveRecord::Migration[7.0]
  def change
    add_column :rails_error_dashboard_error_logs, :app_version, :string
    add_column :rails_error_dashboard_error_logs, :git_sha, :string
    add_column :rails_error_dashboard_error_logs, :user_id, :integer

    add_index :rails_error_dashboard_error_logs, :app_version
    add_index :rails_error_dashboard_error_logs, :git_sha
    add_index :rails_error_dashboard_error_logs, :user_id
  end
end

User Model Configuration

# config/initializers/rails_error_dashboard.rb
RailsErrorDashboard.configure do |config|
  config.user_model = "User"  # Your user model class
end

Logging Errors with Correlation Data

# In your error handler
RailsErrorDashboard::Commands::LogError.call(
  error_type: error.class.name,
  message: error.message,
  backtrace: error.backtrace,

  # Release correlation
  app_version: ENV['APP_VERSION'] || "unknown",
  git_sha: ENV['GIT_SHA'] || `git rev-parse HEAD`.strip,

  # User correlation
  user_id: current_user&.id,

  # Other metadata
  platform: request.user_agent =~ /iPhone/ ? "iOS" : "Android",
  # ...
)

Best Practices

1. Track App Version Consistently

Use semantic versioning:

✓ Good: "2.1.0", "2.1.1", "2.2.0"
✗ Bad: "v2.1", "2.1-beta", "latest"

Set in environment:

# config/application.rb
config.app_version = ENV['APP_VERSION'] || '1.0.0'

# Or read from package.json, build number, etc.

2. Monitor Problematic Releases Daily

Set up daily alerts:

# In a scheduled job
correlation = ErrorCorrelation.new(days: 1)
problematic = correlation.problematic_releases

if problematic.any?
  AlertService.notify(
    title: "Problematic releases detected",
    releases: problematic.map { |r| "#{r[:version]} (+#{r[:deviation_from_avg]}%)" }
  )
end

3. Investigate Multi-Error Users

Weekly review:

# In a scheduled report
multi_error_users = correlation.multi_error_users(min_error_types: 3)
top_10 = multi_error_users.first(10)

CsTeam.send_report(
  title: "Top 10 Most Impacted Users",
  users: top_10
)

Action:

Reach out to these users
Offer support or compensation
Investigate common patterns

4. Use Time Correlation for Root Cause

When stuck on root cause:

Find all errors correlated with the mystery error
Identify the common dependency
That’s likely your root cause

Example:

correlated = correlation.time_correlated_errors
mystery_error_corr = correlated.select do |_, data|
  data[:error_type_a] == "MysteryError" || data[:error_type_b] == "MysteryError"
end

# If all correlated errors involve ExternalAPI
# → MysteryError is caused by ExternalAPI issues

5. Compare Periods After Fixes

After deploying a fix:

# Wait 24 hours, then check
before_fix = ErrorCorrelation.new(days: 14).period_comparison
after_fix = ErrorCorrelation.new(days: 7).period_comparison

if after_fix[:trend] == :decreasing
  puts "Fix is working! Errors down #{after_fix[:change_percentage]}%"
else
  puts "Fix didn't help. Keep investigating."
end

6. Document Version History

Maintain a version log:

v2.1.0 (2025-12-20):
  - Added new payment flow
  - Errors: 450 total, 12 critical
  - Status: ⚠️  Problematic, hotfix planned

v2.0.5 (2025-12-15):
  - Fixed checkout bug
  - Errors: 120 total, 2 critical
  - Status: ✓ Stable

v2.0.0 (2025-12-01):
  - Major redesign
  - Errors: 800 total, 45 critical
  - Status: ⚠️  Problematic, rolled back

Troubleshooting

“No version data available”

Cause: app_version column missing or always null

Fix:

# Add column
rails g migration add_app_version_to_error_logs app_version:string
rails db:migrate

# Ensure errors include version
RailsErrorDashboard::Commands::LogError.call(
  app_version: ENV['APP_VERSION'],  # Must be set!
  # ...
)

“Multi-error users showing strange results”

Cause: User model not found or email missing

Fix:

# Verify configuration
RailsErrorDashboard.configuration.user_model
# => "User"

# Check if User model has email
User.column_names.include?('email')
# => true

# If using different attribute:
# Update find_user_email method in error_correlation.rb

“Time correlation shows no results”

Cause: Not enough error types (need at least 2)

Fix:

# Check distinct error types
ErrorLog.distinct.pluck(:error_type).count
# If < 2, wait for more data

# Lower correlation threshold to see weaker correlations
# (requires code change to min correlation threshold)

“Period comparison shows unexpected trend”

Cause: Time period split doesn’t align with deployment

Fix: Use custom date ranges instead of automatic split:

# Instead of using days parameter (which splits in half)
# Query specific periods:
current = ErrorLog.where("occurred_at >= ?", deployment_date).count
previous = ErrorLog.where("occurred_at < ?", deployment_date).where("occurred_at >= ?", deployment_date - 7.days).count

API Reference

ErrorCorrelation Query Object

correlation = RailsErrorDashboard::Queries::ErrorCorrelation.new(days: 30)

# Release correlation
correlation.errors_by_version
# => { "2.1.0" => { count: 450, error_types: 23, ... }, ... }

correlation.errors_by_git_sha
# => { "a3b4c5d6" => { count: 280, error_types: 15, ... }, ... }

correlation.problematic_releases
# => [{ version: "2.1.0", error_count: 450, deviation_from_avg: 275.0, ... }]

# User correlation
correlation.multi_error_users(min_error_types: 2)
# => [{ user_id: 45, user_email: "...", error_types: [...], total_errors: 23 }]

correlation.error_type_user_overlap("NoMethodError", "ArgumentError")
# => { users_a_count: 150, users_b_count: 120, overlap_count: 80, overlap_percentage: 66.7 }

# Time correlation
correlation.time_correlated_errors
# => { "NoMethodError <-> DatabaseError" => { correlation: 0.85, strength: :strong } }

# Period comparison
correlation.period_comparison
# => { current_period: {...}, previous_period: {...}, change: 650, change_percentage: 54.2, trend: :increasing_significantly }

# Platform-specific errors
correlation.platform_specific_errors
# => { "iOS" => [{ error_type: "...", count: 120, platform_specific: true }] }