discourse/app/jobs
Alan Guo Xiang Tan 2cc3ac34e7
PERF: Daily rollups for top countries and top referrers reports (#40285)
Pre-aggregates the top countries and top referrers admin-dashboard
reports into two daily rollup tables. Both cards currently scan
`browser_pageview_events` and `GROUP BY` the relevant dimension on every
cold cache miss. At a few million events across 90 days that takes
hundreds of milliseconds, and on larger instances it crosses into
multi-second territory and can time out.

## New tables

### `browser_pageview_country_daily_rollups`

| Column | Type | Constraint |
|---|---|---|
| `id` | `bigint` | primary key |
| `date` | `date` | not null |
| `country_code` | `varchar(2)` | nullable (preserves NULL-country rows
for the denominator) |
| `count` | `bigint` | not null, total pageviews |
| `logged_in_count` | `bigint` | not null, pageviews where `user_id IS
NOT NULL` |

Unique index on `(date, country_code) NULLS NOT DISTINCT` for `ON
CONFLICT` idempotency including the null-country row.

### `browser_pageview_referrer_daily_rollups`

| Column | Type | Constraint |
|---|---|---|
| `id` | `bigint` | primary key |
| `date` | `date` | not null |
| `normalized_referrer` | `varchar(2000)` | nullable (direct visits) |
| `count` | `bigint` | not null |
| `logged_in_count` | `bigint` | not null |

Unique index on `(date, normalized_referrer) NULLS NOT DISTINCT`. Stores
every `normalized_referrer` value, including internal-host rows and
direct visits.

## Key technical decisions

- **Two count columns per row: `count` (total) and `logged_in_count`
(`user_id IS NOT NULL`).** The `login_required` site setting can be
toggled at any time. Selecting one or the other at read time produces
equivalent numbers without re-aggregating, matching the previous `AND
user_id IS NOT NULL` filter the reports applied at query time.

- **All referrers are stored in the rollup; internal-host exclusion is
applied at read time.** Aggregating away internal-navigation rows would
make the rollup lossy: a hostname change, misconfigured host, or future
change to the matching rule would silently corrupt historical reports
with no way to rebuild. Storing them keeps the rollup reversible at the
cost of a few extra rows per day. The read query applies the same-host
exclusion against the current `Discourse.current_hostname` so report
semantics match the previous event-scan behavior.

- **Scheduled job re-aggregates yesterday and today every 30 minutes.**
Cadence sits under the existing 35-minute report cache TTL. After each
run the job calls `Report.clear_cache` for both report types so any
previously-cached payload (including the empty payload an admin may have
loaded before the first run) is evicted, and the next dashboard load
picks up the fresh aggregate.

- **First scheduled run backfills automatically when rollups are
empty.** The job aggregates from the earliest available browser pageview
event through today, then switches to the incremental
yesterday-and-today window on later runs. No manual action required on
deploy.

## Benchmark

Local dataset: 3,000,000 `browser_pageview_events` rows across 90 days.
Default 30-day window, `LIMIT 5`.

### top_countries

**OLD query (event scan), `Execution Time: 391.752 ms`**

```sql
WITH ranked AS (
  SELECT
    country_code,
    COUNT(*) AS count,
    SUM(COUNT(*)) OVER () AS total
  FROM browser_pageview_events
  WHERE created_at >= NOW() - INTERVAL '90 days'
    AND created_at < NOW() + INTERVAL '1 day'
  GROUP BY country_code
)
SELECT country_code, count,
       CASE WHEN total = 0 THEN 0
            ELSE ROUND((count::numeric / total) * 100)::integer END AS percent
FROM ranked
WHERE country_code IS NOT NULL
  AND country_code NOT IN ('ZZ','T1','A1','A2','O1','XX','EU','AP')
ORDER BY count DESC, country_code ASC
LIMIT 5;
```

```
Limit  (cost=102139.99..102139.99 rows=2 width=15) (actual time=378.507..382.429 rows=5 loops=1)
  ->  Sort  (cost=102139.99..102139.99 rows=2 width=15) (actual time=373.538..377.460 rows=5 loops=1)
        Sort Key: ranked.count DESC, ranked.country_code
        Sort Method: quicksort  Memory: 25kB
        ->  Subquery Scan on ranked  (cost=102137.09..102139.98 rows=2 width=15) (actual time=373.521..377.447 rows=10 loops=1)
              Filter: ((ranked.country_code IS NOT NULL) AND ((ranked.country_code)::text <> ALL ('{ZZ,T1,A1,A2,O1,XX,EU,AP}'::text[])))
              Rows Removed by Filter: 1
              ->  WindowAgg  (cost=102137.09..102139.75 rows=10 width=43) (actual time=373.514..377.437 rows=11 loops=1)
                    ->  Finalize GroupAggregate  (cost=102137.09..102139.62 rows=10 width=11) (actual time=373.501..377.426 rows=11 loops=1)
                          Group Key: browser_pageview_events.country_code
                          ->  Gather Merge  (cost=102137.09..102139.42 rows=20 width=11) (actual time=373.493..377.417 rows=33 loops=1)
                                Workers Planned: 2
                                Workers Launched: 2
                                ->  Sort  (cost=101137.07..101137.09 rows=10 width=11) (actual time=366.783..366.784 rows=11 loops=3)
                                      Sort Key: browser_pageview_events.country_code
                                      Sort Method: quicksort  Memory: 25kB
                                      ->  Partial HashAggregate  (cost=101136.80..101136.90 rows=10 width=11) (actual time=366.768..366.768 rows=11 loops=3)
                                            Group Key: browser_pageview_events.country_code
                                            ->  Parallel Seq Scan on browser_pageview_events  (cost=0.00..94886.64 rows=1250033 width=3) (actual time=2.349..294.932 rows=999978 loops=3)
                                                  Filter: ((created_at >= (now() - '90 days'::interval)) AND (created_at < (now() + '1 day'::interval)))
                                                  Rows Removed by Filter: 22
 Planning Time: 0.446 ms
 JIT:
   Functions: 31
   Timing: Generation 0.887 ms, Inlining 0.000 ms, Optimization 0.572 ms, Emission 11.441 ms, Total 12.900 ms
 Execution Time: 391.752 ms
```

**NEW query (rollup), `Execution Time: 0.408 ms`**

```sql
WITH ranked AS (
  SELECT
    country_code,
    SUM(count) AS count,
    SUM(SUM(count)) OVER () AS total
  FROM browser_pageview_country_daily_rollups
  WHERE date >= (NOW() - INTERVAL '90 days')::date
    AND date < (NOW() + INTERVAL '1 day')::date
  GROUP BY country_code
  HAVING SUM(count) > 0
)
SELECT country_code, count,
       CASE WHEN total = 0 THEN 0
            ELSE ROUND((count::numeric / total) * 100)::integer END AS percent
FROM ranked
WHERE country_code IS NOT NULL
  AND country_code NOT IN ('ZZ','T1','A1','A2','O1','XX','EU','AP')
ORDER BY count DESC, country_code ASC
LIMIT 5;
```

```
Limit  (cost=43.25..43.26 rows=1 width=39) (actual time=0.352..0.353 rows=5 loops=1)
  ->  Sort  (cost=43.25..43.26 rows=1 width=39) (actual time=0.351..0.352 rows=5 loops=1)
        Sort Key: ranked.count DESC, ranked.country_code
        Sort Method: quicksort  Memory: 25kB
        ->  Subquery Scan on ranked  (cost=42.98..43.24 rows=1 width=39) (actual time=0.330..0.334 rows=10 loops=1)
              Filter: ((ranked.country_code IS NOT NULL) AND ((ranked.country_code)::text <> ALL ('{ZZ,T1,A1,A2,O1,XX,EU,AP}'::text[])))
              Rows Removed by Filter: 1
              ->  WindowAgg  (cost=42.98..43.17 rows=3 width=67) (actual time=0.326..0.327 rows=11 loops=1)
                    ->  HashAggregate  (cost=42.98..43.13 rows=3 width=35) (actual time=0.311..0.313 rows=11 loops=1)
                          Group Key: browser_pageview_country_daily_rollups.country_code
                          Filter: (sum(browser_pageview_country_daily_rollups.count) > '0'::numeric)
                          Batches: 1  Memory Usage: 24kB
                          ->  Seq Scan on browser_pageview_country_daily_rollups  (cost=0.00..38.03 rows=990 width=11) (actual time=0.007..0.237 rows=990 loops=1)
                                Filter: ((date >= ((now() - '90 days'::interval))::date) AND (date < ((now() + '1 day'::interval))::date))
                                Rows Removed by Filter: 11
 Planning Time: 0.446 ms
 Execution Time: 0.408 ms
```

### top_referrers

**OLD query (event scan), `Execution Time: 361.565 ms`**

```sql
WITH ranked AS (
  SELECT
    normalized_referrer,
    COUNT(*) AS count,
    SUM(COUNT(*)) OVER () AS total
  FROM browser_pageview_events
  WHERE created_at >= NOW() - INTERVAL '90 days'
    AND created_at < NOW() + INTERVAL '1 day'
    AND (
      normalized_referrer IS NULL
      OR (
        normalized_referrer <> 'forum.example.com'
        AND normalized_referrer NOT LIKE 'forum.example.com/%' ESCAPE '\'
        AND normalized_referrer NOT LIKE 'forum.example.com?%' ESCAPE '\'
      )
    )
  GROUP BY normalized_referrer
)
SELECT normalized_referrer, count,
       CASE WHEN total = 0 THEN 0
            ELSE ROUND((count::numeric / total) * 100)::integer END AS percent
FROM ranked
WHERE normalized_referrer IS NOT NULL
ORDER BY count DESC, normalized_referrer ASC
LIMIT 5;
```

```
Limit  (cost=108521.27..108521.28 rows=5 width=32) (actual time=356.291..361.113 rows=5 loops=1)
  ->  Sort  (cost=108521.27..108521.53 rows=106 width=32) (actual time=351.128..355.949 rows=5 loops=1)
        Sort Key: ranked.count DESC, ranked.normalized_referrer
        Sort Method: quicksort  Memory: 25kB
        ->  Subquery Scan on ranked  (cost=108488.40..108519.51 rows=106 width=32) (actual time=351.121..355.944 rows=6 loops=1)
              Filter: (ranked.normalized_referrer IS NOT NULL)
              Rows Removed by Filter: 1
              ->  WindowAgg  (cost=108488.40..108516.85 rows=107 width=60) (actual time=351.117..355.939 rows=7 loops=1)
                    ->  Finalize GroupAggregate  (cost=108488.40..108515.51 rows=107 width=28) (actual time=351.107..355.931 rows=7 loops=1)
                          Group Key: browser_pageview_events.normalized_referrer
                          ->  Gather Merge  (cost=108488.40..108513.37 rows=214 width=28) (actual time=351.102..355.925 rows=21 loops=1)
                                Workers Planned: 2
                                Workers Launched: 2
                                ->  Sort  (cost=107488.38..107488.65 rows=107 width=28) (actual time=341.058..341.059 rows=7 loops=3)
                                      Sort Key: browser_pageview_events.normalized_referrer
                                      ->  Partial HashAggregate  (cost=107483.70..107484.77 rows=107 width=28) (actual time=341.044..341.045 rows=7 loops=3)
                                            Group Key: browser_pageview_events.normalized_referrer
                                            ->  Parallel Seq Scan on browser_pageview_events  (cost=0.00..104262.13 rows=644315 width=20) (actual time=2.982..281.076 rows=846138 loops=3)
                                                  Filter: ((created_at >= (now() - '90 days'::interval)) AND (created_at < (now() + '1 day'::interval)) AND ((normalized_referrer IS NULL) OR (((normalized_referrer)::text <> 'forum.example.com'::text) AND ((normalized_referrer)::text !~~ 'forum.example.com/%'::text) AND ((normalized_referrer)::text !~~ 'forum.example.com?%'::text))))
                                                  Rows Removed by Filter: 153862
 Planning Time: 0.134 ms
 JIT:
   Functions: 31
   Timing: Generation 1.277 ms, Inlining 0.000 ms, Optimization 0.674 ms, Emission 13.385 ms, Total 15.336 ms
 Execution Time: 361.565 ms
```

**NEW query (rollup), `Execution Time: 1.131 ms`**

```sql
WITH ranked AS (
  SELECT
    normalized_referrer,
    SUM(count) AS count,
    SUM(SUM(count)) OVER () AS total
  FROM browser_pageview_referrer_daily_rollups
  WHERE date >= (NOW() - INTERVAL '90 days')::date
    AND date < (NOW() + INTERVAL '1 day')::date
    AND (
      normalized_referrer IS NULL
      OR (
        normalized_referrer <> 'forum.example.com'
        AND normalized_referrer NOT LIKE 'forum.example.com/%' ESCAPE '\'
        AND normalized_referrer NOT LIKE 'forum.example.com?%' ESCAPE '\'
      )
    )
  GROUP BY normalized_referrer
  HAVING SUM(count) > 0
)
SELECT normalized_referrer, count,
       CASE WHEN total = 0 THEN 0
            ELSE ROUND((count::numeric / total) * 100)::integer END AS percent
FROM ranked
WHERE normalized_referrer IS NOT NULL
ORDER BY count DESC, normalized_referrer ASC
LIMIT 5;
```

```
Limit  (cost=486.29..486.31 rows=5 width=64) (actual time=0.984..0.985 rows=5 loops=1)
  ->  Sort  (cost=486.29..486.38 rows=36 width=64) (actual time=0.984..0.984 rows=5 loops=1)
        Sort Key: ranked.count DESC, ranked.normalized_referrer
        Sort Method: quicksort  Memory: 25kB
        ->  Subquery Scan on ranked  (cost=482.83..485.69 rows=36 width=64) (actual time=0.928..0.930 rows=6 loops=1)
              Filter: (ranked.normalized_referrer IS NOT NULL)
              Rows Removed by Filter: 1
              ->  WindowAgg  (cost=482.83..484.88 rows=36 width=92) (actual time=0.916..0.917 rows=7 loops=1)
                    ->  HashAggregate  (cost=482.83..484.43 rows=36 width=60) (actual time=0.887..0.888 rows=7 loops=1)
                          Group Key: browser_pageview_referrer_daily_rollups.normalized_referrer
                          Filter: (sum(browser_pageview_referrer_daily_rollups.count) > '0'::numeric)
                          Batches: 1  Memory Usage: 24kB
                          ->  Seq Scan on browser_pageview_referrer_daily_rollups  (cost=0.00..478.55 rows=856 width=36) (actual time=0.005..0.835 rows=630 loops=1)
                                Filter: (((normalized_referrer IS NULL) OR (((normalized_referrer)::text <> 'forum.example.com'::text) AND ((normalized_referrer)::text !~~ 'forum.example.com/%'::text) AND ((normalized_referrer)::text !~~ 'forum.example.com?%'::text))) AND (date >= ((now() - '90 days'::interval))::date) AND (date < ((now() + '1 day'::interval))::date))
                                Rows Removed by Filter: 9198
 Planning Time: 0.544 ms
 Execution Time: 1.131 ms
```

~960x speedup on top_countries (391.752 ms → 0.408 ms) and ~320x on
top_referrers (361.565 ms → 1.131 ms). The new path's query plan does
not touch `browser_pageview_events` at all, so its latency is
structurally independent of event volume.
2026-05-26 14:35:20 +08:00
..
concerns DEV: Enable Style/RedundantParentheses rubocop rule (#40095) 2026-05-19 15:48:09 +02:00
onceoff DEV: Enable Style/RedundantSelf rubocop rule (#40098) 2026-05-19 19:27:45 +02:00
regular FIX: Move quoted post rebaking on avatar upload to a background job (#40231) 2026-05-22 13:27:52 -03:00
scheduled PERF: Daily rollups for top countries and top referrers reports (#40285) 2026-05-26 14:35:20 +08:00
base.rb DEV: Enable Style/RedundantSelf rubocop rule (#40098) 2026-05-19 19:27:45 +02:00