mirror of
https://gh.wpcy.net/https://github.com/discourse/discourse.git
synced 2026-04-29 14:46:43 +08:00
The `emojiReplacementRegex` in `pretty-text/emoji.js` was a manually maintained regex string copied from an external source (mathiasbynens/emoji-test-regex-pattern). This created a maintenance gap: when the `discourse-emojis` gem was updated with new Unicode emoji (e.g. Unicode 17.0), the replacements map would include them but the regex would not match their raw Unicode characters. As a result, pasting a newer emoji like (distorted face) would pass through unreplaced.

This commit eliminates the manual step by generating the regex automatically from `Emoji.unicode_replacements` during `rake javascript:update_constants`, the same task that already generates the emoji names, aliases, and replacements map.

A new `Emoji::RegexGenerator` module builds a trie from all emoji Unicode sequences (converted to UTF-16 code units for JS compatibility), then emits an optimized regex pattern with character class ranges and shared-prefix grouping. The generated regex is exported from `pretty-text/emoji/data.js` alongside the other emoji constants, and `emoji.js` now imports it instead of hardcoding it.

The generated regex matches all 3,418 emoji keys (including the 43 Unicode 17.0 emoji the old regex missed), is ~20% faster in benchmarks, and can never drift from the emoji database again.

Closes #38416
| Name |
|---|
| .. |
| regex_generator.rb |
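
The trie-based approach described in the commit message can be illustrated with a minimal sketch. This is not the actual `Emoji::RegexGenerator` from `regex_generator.rb`: the module name `EmojiRegexSketch` and all of its method names are hypothetical, and the real implementation may group, order, or escape differently. The sketch only shows the general idea of converting each sequence to UTF-16 code units, inserting them into a trie, and emitting a pattern with shared prefixes and character class ranges.

```ruby
# Hypothetical sketch of trie-based regex generation; NOT the real Emoji::RegexGenerator.
module EmojiRegexSketch
  module_function

  # Convert a string to UTF-16 code units (astral characters become surrogate pairs).
  def utf16_units(str)
    str.encode("UTF-16BE").unpack("n*")
  end

  # Build a trie of nested hashes keyed by code unit; :end marks a complete sequence.
  def build_trie(sequences)
    root = {}
    sequences.each do |seq|
      node = root
      utf16_units(seq).each { |unit| node = (node[unit] ||= {}) }
      node[:end] = true
    end
    root
  end

  # Escape one code unit in the \uXXXX form understood by JS regexes.
  def escape(unit)
    format('\u%04X', unit)
  end

  # Collapse code units into a character class, merging runs of 3+ into ranges.
  def char_class(units)
    return escape(units.first) if units.size == 1

    body = units.sort.slice_when { |a, b| b != a + 1 }.map do |run|
      run.size > 2 ? "#{escape(run.first)}-#{escape(run.last)}" : run.map { |u| escape(u) }.join
    end.join
    "[#{body}]"
  end

  # Emit the pattern for everything that can follow this trie node.
  def emit(node)
    children = node.reject { |key, _| key == :end }.sort_by { |unit, _| unit }
    return "" if children.empty?

    # Siblings whose subtrees produce the same suffix share one character class.
    groups = children.group_by { |_, child| emit(child) }
    alternatives = groups.map { |suffix, pairs| char_class(pairs.map(&:first)) + suffix }

    if node[:end]
      # A sequence may end here, so the continuation is optional (greedy: longest match first).
      "(?:#{alternatives.join('|')})?"
    elsif alternatives.size == 1
      alternatives.first
    else
      "(?:#{alternatives.join('|')})"
    end
  end

  def generate(sequences)
    emit(build_trie(sequences))
  end
end

# Three consecutive astral emoji plus a heart with and without a variation selector.
sample = ["\u{1F600}", "\u{1F601}", "\u{1F602}", "\u{2764}", "\u{2764}\u{FE0F}"]
puts EmojiRegexSketch.generate(sample)
```

Because the output addresses surrogate halves as individual code units, the sketch assumes the pattern would be compiled in JavaScript without the `u` flag. Alternatives at any trie node begin with distinct code units, so their order does not affect which one matches.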