mirror of
https://gh.wpcy.net/https://github.com/discourse/discourse.git
synced 2026-05-27 06:56:35 +08:00
Pre-aggregates the top countries and top referrers admin-dashboard
reports into two daily rollup tables. Both cards currently scan
`browser_pageview_events` and `GROUP BY` the relevant dimension on every
cold cache miss. At a few million events across 90 days that takes
hundreds of milliseconds, and on larger instances it crosses into
multi-second territory and can time out.
## New tables
### `browser_pageview_country_daily_rollups`
| Column | Type | Constraint |
|---|---|---|
| `id` | `bigint` | primary key |
| `date` | `date` | not null |
| `country_code` | `varchar(2)` | nullable (preserves NULL-country rows
for the denominator) |
| `count` | `bigint` | not null, total pageviews |
| `logged_in_count` | `bigint` | not null, pageviews where `user_id IS
NOT NULL` |
Unique index on `(date, country_code) NULLS NOT DISTINCT` for `ON
CONFLICT` idempotency including the null-country row.
### `browser_pageview_referrer_daily_rollups`
| Column | Type | Constraint |
|---|---|---|
| `id` | `bigint` | primary key |
| `date` | `date` | not null |
| `normalized_referrer` | `varchar(2000)` | nullable (direct visits) |
| `count` | `bigint` | not null |
| `logged_in_count` | `bigint` | not null |
Unique index on `(date, normalized_referrer) NULLS NOT DISTINCT`. Stores
every `normalized_referrer` value, including internal-host rows and
direct visits.
## Key technical decisions
- **Two count columns per row: `count` (total) and `logged_in_count`
(`user_id IS NOT NULL`).** The `login_required` site setting can be
toggled at any time. Selecting one or the other at read time produces
equivalent numbers without re-aggregating, matching the previous `AND
user_id IS NOT NULL` filter the reports applied at query time.
- **All referrers are stored in the rollup; internal-host exclusion is
applied at read time.** Aggregating away internal-navigation rows would
make the rollup lossy: a hostname change, misconfigured host, or future
change to the matching rule would silently corrupt historical reports
with no way to rebuild. Storing them keeps the rollup reversible at the
cost of a few extra rows per day. The read query applies the same-host
exclusion against the current `Discourse.current_hostname` so report
semantics match the previous event-scan behavior.
- **Scheduled job re-aggregates yesterday and today every 30 minutes.**
Cadence sits under the existing 35-minute report cache TTL. After each
run the job calls `Report.clear_cache` for both report types so any
previously-cached payload (including the empty payload an admin may have
loaded before the first run) is evicted, and the next dashboard load
picks up the fresh aggregate.
- **First scheduled run backfills automatically when rollups are
empty.** The job aggregates from the earliest available browser pageview
event through today, then switches to the incremental
yesterday-and-today window on later runs. No manual action required on
deploy.
## Benchmark
Local dataset: 3,000,000 `browser_pageview_events` rows across 90 days.
Default 30-day window, `LIMIT 5`.
### top_countries
**OLD query (event scan), `Execution Time: 391.752 ms`**
```sql
WITH ranked AS (
SELECT
country_code,
COUNT(*) AS count,
SUM(COUNT(*)) OVER () AS total
FROM browser_pageview_events
WHERE created_at >= NOW() - INTERVAL '90 days'
AND created_at < NOW() + INTERVAL '1 day'
GROUP BY country_code
)
SELECT country_code, count,
CASE WHEN total = 0 THEN 0
ELSE ROUND((count::numeric / total) * 100)::integer END AS percent
FROM ranked
WHERE country_code IS NOT NULL
AND country_code NOT IN ('ZZ','T1','A1','A2','O1','XX','EU','AP')
ORDER BY count DESC, country_code ASC
LIMIT 5;
```
```
Limit (cost=102139.99..102139.99 rows=2 width=15) (actual time=378.507..382.429 rows=5 loops=1)
-> Sort (cost=102139.99..102139.99 rows=2 width=15) (actual time=373.538..377.460 rows=5 loops=1)
Sort Key: ranked.count DESC, ranked.country_code
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on ranked (cost=102137.09..102139.98 rows=2 width=15) (actual time=373.521..377.447 rows=10 loops=1)
Filter: ((ranked.country_code IS NOT NULL) AND ((ranked.country_code)::text <> ALL ('{ZZ,T1,A1,A2,O1,XX,EU,AP}'::text[])))
Rows Removed by Filter: 1
-> WindowAgg (cost=102137.09..102139.75 rows=10 width=43) (actual time=373.514..377.437 rows=11 loops=1)
-> Finalize GroupAggregate (cost=102137.09..102139.62 rows=10 width=11) (actual time=373.501..377.426 rows=11 loops=1)
Group Key: browser_pageview_events.country_code
-> Gather Merge (cost=102137.09..102139.42 rows=20 width=11) (actual time=373.493..377.417 rows=33 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=101137.07..101137.09 rows=10 width=11) (actual time=366.783..366.784 rows=11 loops=3)
Sort Key: browser_pageview_events.country_code
Sort Method: quicksort Memory: 25kB
-> Partial HashAggregate (cost=101136.80..101136.90 rows=10 width=11) (actual time=366.768..366.768 rows=11 loops=3)
Group Key: browser_pageview_events.country_code
-> Parallel Seq Scan on browser_pageview_events (cost=0.00..94886.64 rows=1250033 width=3) (actual time=2.349..294.932 rows=999978 loops=3)
Filter: ((created_at >= (now() - '90 days'::interval)) AND (created_at < (now() + '1 day'::interval)))
Rows Removed by Filter: 22
Planning Time: 0.446 ms
JIT:
Functions: 31
Timing: Generation 0.887 ms, Inlining 0.000 ms, Optimization 0.572 ms, Emission 11.441 ms, Total 12.900 ms
Execution Time: 391.752 ms
```
**NEW query (rollup), `Execution Time: 0.408 ms`**
```sql
WITH ranked AS (
SELECT
country_code,
SUM(count) AS count,
SUM(SUM(count)) OVER () AS total
FROM browser_pageview_country_daily_rollups
WHERE date >= (NOW() - INTERVAL '90 days')::date
AND date < (NOW() + INTERVAL '1 day')::date
GROUP BY country_code
HAVING SUM(count) > 0
)
SELECT country_code, count,
CASE WHEN total = 0 THEN 0
ELSE ROUND((count::numeric / total) * 100)::integer END AS percent
FROM ranked
WHERE country_code IS NOT NULL
AND country_code NOT IN ('ZZ','T1','A1','A2','O1','XX','EU','AP')
ORDER BY count DESC, country_code ASC
LIMIT 5;
```
```
Limit (cost=43.25..43.26 rows=1 width=39) (actual time=0.352..0.353 rows=5 loops=1)
-> Sort (cost=43.25..43.26 rows=1 width=39) (actual time=0.351..0.352 rows=5 loops=1)
Sort Key: ranked.count DESC, ranked.country_code
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on ranked (cost=42.98..43.24 rows=1 width=39) (actual time=0.330..0.334 rows=10 loops=1)
Filter: ((ranked.country_code IS NOT NULL) AND ((ranked.country_code)::text <> ALL ('{ZZ,T1,A1,A2,O1,XX,EU,AP}'::text[])))
Rows Removed by Filter: 1
-> WindowAgg (cost=42.98..43.17 rows=3 width=67) (actual time=0.326..0.327 rows=11 loops=1)
-> HashAggregate (cost=42.98..43.13 rows=3 width=35) (actual time=0.311..0.313 rows=11 loops=1)
Group Key: browser_pageview_country_daily_rollups.country_code
Filter: (sum(browser_pageview_country_daily_rollups.count) > '0'::numeric)
Batches: 1 Memory Usage: 24kB
-> Seq Scan on browser_pageview_country_daily_rollups (cost=0.00..38.03 rows=990 width=11) (actual time=0.007..0.237 rows=990 loops=1)
Filter: ((date >= ((now() - '90 days'::interval))::date) AND (date < ((now() + '1 day'::interval))::date))
Rows Removed by Filter: 11
Planning Time: 0.446 ms
Execution Time: 0.408 ms
```
### top_referrers
**OLD query (event scan), `Execution Time: 361.565 ms`**
```sql
WITH ranked AS (
SELECT
normalized_referrer,
COUNT(*) AS count,
SUM(COUNT(*)) OVER () AS total
FROM browser_pageview_events
WHERE created_at >= NOW() - INTERVAL '90 days'
AND created_at < NOW() + INTERVAL '1 day'
AND (
normalized_referrer IS NULL
OR (
normalized_referrer <> 'forum.example.com'
AND normalized_referrer NOT LIKE 'forum.example.com/%' ESCAPE '\'
AND normalized_referrer NOT LIKE 'forum.example.com?%' ESCAPE '\'
)
)
GROUP BY normalized_referrer
)
SELECT normalized_referrer, count,
CASE WHEN total = 0 THEN 0
ELSE ROUND((count::numeric / total) * 100)::integer END AS percent
FROM ranked
WHERE normalized_referrer IS NOT NULL
ORDER BY count DESC, normalized_referrer ASC
LIMIT 5;
```
```
Limit (cost=108521.27..108521.28 rows=5 width=32) (actual time=356.291..361.113 rows=5 loops=1)
-> Sort (cost=108521.27..108521.53 rows=106 width=32) (actual time=351.128..355.949 rows=5 loops=1)
Sort Key: ranked.count DESC, ranked.normalized_referrer
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on ranked (cost=108488.40..108519.51 rows=106 width=32) (actual time=351.121..355.944 rows=6 loops=1)
Filter: (ranked.normalized_referrer IS NOT NULL)
Rows Removed by Filter: 1
-> WindowAgg (cost=108488.40..108516.85 rows=107 width=60) (actual time=351.117..355.939 rows=7 loops=1)
-> Finalize GroupAggregate (cost=108488.40..108515.51 rows=107 width=28) (actual time=351.107..355.931 rows=7 loops=1)
Group Key: browser_pageview_events.normalized_referrer
-> Gather Merge (cost=108488.40..108513.37 rows=214 width=28) (actual time=351.102..355.925 rows=21 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=107488.38..107488.65 rows=107 width=28) (actual time=341.058..341.059 rows=7 loops=3)
Sort Key: browser_pageview_events.normalized_referrer
-> Partial HashAggregate (cost=107483.70..107484.77 rows=107 width=28) (actual time=341.044..341.045 rows=7 loops=3)
Group Key: browser_pageview_events.normalized_referrer
-> Parallel Seq Scan on browser_pageview_events (cost=0.00..104262.13 rows=644315 width=20) (actual time=2.982..281.076 rows=846138 loops=3)
Filter: ((created_at >= (now() - '90 days'::interval)) AND (created_at < (now() + '1 day'::interval)) AND ((normalized_referrer IS NULL) OR (((normalized_referrer)::text <> 'forum.example.com'::text) AND ((normalized_referrer)::text !~~ 'forum.example.com/%'::text) AND ((normalized_referrer)::text !~~ 'forum.example.com?%'::text))))
Rows Removed by Filter: 153862
Planning Time: 0.134 ms
JIT:
Functions: 31
Timing: Generation 1.277 ms, Inlining 0.000 ms, Optimization 0.674 ms, Emission 13.385 ms, Total 15.336 ms
Execution Time: 361.565 ms
```
**NEW query (rollup), `Execution Time: 1.131 ms`**
```sql
WITH ranked AS (
SELECT
normalized_referrer,
SUM(count) AS count,
SUM(SUM(count)) OVER () AS total
FROM browser_pageview_referrer_daily_rollups
WHERE date >= (NOW() - INTERVAL '90 days')::date
AND date < (NOW() + INTERVAL '1 day')::date
AND (
normalized_referrer IS NULL
OR (
normalized_referrer <> 'forum.example.com'
AND normalized_referrer NOT LIKE 'forum.example.com/%' ESCAPE '\'
AND normalized_referrer NOT LIKE 'forum.example.com?%' ESCAPE '\'
)
)
GROUP BY normalized_referrer
HAVING SUM(count) > 0
)
SELECT normalized_referrer, count,
CASE WHEN total = 0 THEN 0
ELSE ROUND((count::numeric / total) * 100)::integer END AS percent
FROM ranked
WHERE normalized_referrer IS NOT NULL
ORDER BY count DESC, normalized_referrer ASC
LIMIT 5;
```
```
Limit (cost=486.29..486.31 rows=5 width=64) (actual time=0.984..0.985 rows=5 loops=1)
-> Sort (cost=486.29..486.38 rows=36 width=64) (actual time=0.984..0.984 rows=5 loops=1)
Sort Key: ranked.count DESC, ranked.normalized_referrer
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on ranked (cost=482.83..485.69 rows=36 width=64) (actual time=0.928..0.930 rows=6 loops=1)
Filter: (ranked.normalized_referrer IS NOT NULL)
Rows Removed by Filter: 1
-> WindowAgg (cost=482.83..484.88 rows=36 width=92) (actual time=0.916..0.917 rows=7 loops=1)
-> HashAggregate (cost=482.83..484.43 rows=36 width=60) (actual time=0.887..0.888 rows=7 loops=1)
Group Key: browser_pageview_referrer_daily_rollups.normalized_referrer
Filter: (sum(browser_pageview_referrer_daily_rollups.count) > '0'::numeric)
Batches: 1 Memory Usage: 24kB
-> Seq Scan on browser_pageview_referrer_daily_rollups (cost=0.00..478.55 rows=856 width=36) (actual time=0.005..0.835 rows=630 loops=1)
Filter: (((normalized_referrer IS NULL) OR (((normalized_referrer)::text <> 'forum.example.com'::text) AND ((normalized_referrer)::text !~~ 'forum.example.com/%'::text) AND ((normalized_referrer)::text !~~ 'forum.example.com?%'::text))) AND (date >= ((now() - '90 days'::interval))::date) AND (date < ((now() + '1 day'::interval))::date))
Rows Removed by Filter: 9198
Planning Time: 0.544 ms
Execution Time: 1.131 ms
```
~960x speedup on top_countries (391.752 ms → 0.408 ms) and ~320x on
top_referrers (361.565 ms → 1.131 ms). The new path's query plan does
not touch `browser_pageview_events` at all, so its latency is
structurally independent of event volume.
|
||
|---|---|---|
| .. | ||
| concerns | ||
| onceoff | ||
| regular | ||
| scheduled | ||
| base.rb | ||