discourse/app/services
Joffrey JAFFEUX 7ecb945ec4
FEATURE: Add full-text search for chat messages (#34704)
## Overview

This PR introduces comprehensive search functionality for chat messages,
enabling users to search through their chat history both globally across
all accessible channels and within specific channels.

### Search Capabilities

**All-Channel Search**: When no channel is specified, users can search
across all channels they have access to. The search respects channel
permissions through `ChannelFetcher.all_secured_channel_ids`, ensuring
users only see results from channels they can view.

**Per-Channel Search**: Users can scope their search to a specific
channel by providing a `channel_id` parameter, useful for finding
messages within a particular conversation context.
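
The two scopes can be pictured with a minimal sketch. `searchable_channel_ids` is a hypothetical helper, not code from this PR, and `secured_channel_ids` stands in for whatever `ChannelFetcher.all_secured_channel_ids` returns for the current user. Returning an empty list for a channel the user cannot view is an assumption; the PR may handle that case differently.

```ruby
# Hypothetical helper: resolve which channel ids a search may touch.
# `secured_channel_ids` is a stand-in for the ids the user is allowed to view.
def searchable_channel_ids(secured_channel_ids, channel_id: nil)
  # All-channel search: every channel the user can view.
  return secured_channel_ids if channel_id.nil?

  # Per-channel search: only the requested channel, and only if it is visible.
  secured_channel_ids.include?(channel_id) ? [channel_id] : []
end

secured = [1, 4, 9]
p searchable_channel_ids(secured)                 # all-channel search
p searchable_channel_ids(secured, channel_id: 4)  # scoped to channel 4
p searchable_channel_ids(secured, channel_id: 7)  # channel 7 is not visible
```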

**Search Features**:
- Full-text search using PostgreSQL's tsvector/tsquery
- Advanced filters: `@username` to filter by author, `#channel` to
filter by channel slug
- Sort options: relevance (default) or latest
- Pagination support
- Search data weighted by relevance

## Site Setting: `chat_search_enabled`

This feature is gated behind the `chat_search_enabled` site setting,
which is currently:
- **Default**: `false`
- **Hidden**: `true`
- **Client-accessible**: `true`

### Deployment Strategy

Due to the need for chat messages to be indexed before search becomes
useful, we're implementing a two-phase deployment:

**Phase 1 (Initial Merge)**:
- `chat_search_enabled` remains `false` and hidden
- The `register_search_index` `enabled` proc uses the default (`true`) instead of the `chat_search_enabled` value
- This allows the reindexing infrastructure to begin indexing existing
chat messages even if we don't show the UI yet

**Wait Period**:
- Wait at least one week after Phase 1 deployment
- `Jobs::ReindexSearch` runs every 2 hours and will progressively index
all chat messages
- This ensures most sites have a significant part of their chat history indexed

**Phase 2 (Follow-up Merge)**:
- Set `chat_search_enabled` default to `true` and unhide it
- Update the `register_search_index` `enabled` proc to use the
`chat_search_enabled` setting instead of the default (true)
- Users can now access search with pre-indexed data

**Rationale**: Without this phased approach, users would see the search
UI immediately but receive no results until the reindexing job runs,
creating a confusing experience. By pre-indexing while the UI is hidden,
we ensure search works immediately when enabled.

## New Plugin API: `register_search_index`

This PR introduces a new plugin API that allows plugins to register
custom search indexes that integrate seamlessly with Discourse's search
infrastructure.

### API Signature

```ruby
register_search_index(
  model_class:,              # The ActiveRecord model to index
  search_data_class:,        # The model for storing search data
  index_version:,            # Version number for re-indexing
  search_data:,              # Proc that returns weighted search data
  load_unindexed_record_ids:,# Proc that finds records needing indexing
  enabled:                   # Optional proc to enable/disable (default: -> { true })
)
```

### How It Works

**Integration with SearchIndexer**: When `SearchIndexer.index(obj)` is
called, it checks registered search handlers for the object's type. If a
handler matches, it:
1. Calls the `search_data` proc with the object and an `IndexerHelper`
instance
2. Receives weighted search data (`:a_weight`, `:b_weight`, `:c_weight`,
`:d_weight`)
3. Updates the corresponding search data table with PostgreSQL's
tsvector

**Integration with Jobs::ReindexSearch**: The scheduled job (runs every
2 hours) calls `rebuild_registered_search_handlers`, which:
1. Iterates through all registered search handlers
2. Skips handlers where `enabled` proc returns `false`
3. Calls `load_unindexed_record_ids` to find records needing indexing
4. Indexes up to `limit` records per handler (default: 10,000)
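
A minimal sketch of that loop, under stated assumptions: `ReindexHandler` and `rebuild_handlers` are hypothetical names, and "indexing" is reduced to collecting ids so the control flow is visible. The real `rebuild_registered_search_handlers` is not reproduced here.

```ruby
# Simplified stand-in for a registered handler; not Discourse's real class.
ReindexHandler = Struct.new(
  :name, :index_version, :enabled, :load_unindexed_record_ids,
  keyword_init: true
)

def rebuild_handlers(handlers, limit: 10_000)
  indexed = Hash.new { |hash, key| hash[key] = [] }

  handlers.each do |handler|
    # Step 2: skip handlers whose `enabled` proc returns false.
    next unless handler.enabled.call

    # Step 3: ask the handler which records still need indexing.
    ids = handler.load_unindexed_record_ids.call(
      limit: limit,
      index_version: handler.index_version
    )

    # Step 4: index at most `limit` records (real code would index each record here).
    ids.first(limit).each { |id| indexed[handler.name] << id }
  end

  indexed
end

pending_ids = (1..25).to_a

handlers = [
  ReindexHandler.new(
    name: :chat_messages,
    index_version: 1,
    enabled: -> { true },
    load_unindexed_record_ids: proc { |limit:, index_version:| pending_ids.first(limit) }
  ),
  ReindexHandler.new(
    name: :disabled_example,
    index_version: 1,
    enabled: -> { false },
    load_unindexed_record_ids: proc { |limit:, index_version:| [99] }
  )
]

puts rebuild_handlers(handlers, limit: 10).inspect
```

Because each run handles a bounded batch per handler, a large backlog is worked off progressively over successive runs of the job.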

### Chat Implementation Example

```ruby
register_search_index(
  model_class: Chat::Message,
  search_data_class: Chat::MessageSearchData,
  index_version: 1,
  search_data: proc { |message, indexer_helper|
    {
      a_weight: message.message,
      d_weight: indexer_helper.scrub_html(message.cooked)[0..600_000]
    }
  },
  load_unindexed_record_ids: proc { |limit:, index_version:|
    Chat::Message
      .joins("LEFT JOIN chat_message_search_data ON chat_message_id = chat_messages.id")
      .where(
        "chat_message_search_data.locale IS NULL OR 
         chat_message_search_data.locale != ? OR 
         chat_message_search_data.version != ?",
        SiteSetting.default_locale,
        index_version
      )
      .order("chat_messages.id ASC")
      .limit(limit)
      .pluck(:id)
  }
)
```

Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>
Co-authored-by: Loïc Guitaut <5648+Flink@users.noreply.github.com>
2025-10-22 11:30:35 +02:00
admin_notices DEV: Apply new Rubocop linting on services 2024-12-02 17:31:36 +01:00
discourse_id FIX: add support for subfolder in discourse-id registration (#35011) 2025-09-29 20:06:57 +02:00
experiments DEV: Apply new Rubocop linting on services 2024-12-02 17:31:36 +01:00
flags FEATURE: allow edit custom flags (#32344) 2025-04-17 12:31:52 +08:00
notifications FEATURE: Consolidate link notifications (#26567) 2024-04-09 11:53:37 -06:00
problem_check DEV: Remove full_page_login setting (#32189) 2025-04-29 10:40:40 +02:00
site_setting DEV: Move backfill into SiteSetting::Update service (#32037) 2025-03-28 12:01:56 +08:00
spam_rule FIX: Moderator notifications when new post auto-silences a user (#35403) 2025-10-15 16:07:56 +08:00
themes FIX: Add delete button to themes grid (#34606) 2025-08-29 10:09:23 +08:00
user DEV: Add a compact_blank option to the ActiveModel array type (#35476) 2025-10-20 11:33:36 +02:00
video_conversion DEV: Have media convert service set s3 output permissions (#35392) 2025-10-15 12:38:25 -06:00
anonymous_shadow_creator.rb UX: Improve naming for anonymous mode settings (#31832) 2025-03-21 04:54:06 +03:00
badge_granter.rb FIX: error when trying to un-favorite badge (#32369) 2025-04-22 15:36:48 +08:00
base_bookmarkable.rb FIX: Show deleted bookmark reminders in user bookmarks menu (#25905) 2024-02-29 09:03:49 +10:00
category_hashtag_data_source.rb FEATURE: add icons and emojis to category (#31795) 2025-03-26 09:46:17 +04:00
color_scheme_revisor.rb FEATURE: Allow editing theme-owned palettes (#34722) 2025-10-06 09:02:39 +03:00
destroy_task.rb FIX: Confirmation prompt breaks when using pipe (#35261) 2025-10-08 11:12:09 +08:00
email_settings_exception_handler.rb FIX: Show the SMTP authentication error for group UI (#27914) 2024-07-16 09:14:17 +10:00
email_settings_validator.rb UX: Use a dropdown for SSL mode for group SMTP (#27932) 2024-07-18 10:33:14 +10:00
email_style_updater.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
external_upload_manager.rb DEV: lint against Layout/EmptyLineBetweenDefs (#24914) 2023-12-15 23:46:04 +08:00
group_action_logger.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
group_mentions_updater.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
group_message.rb DEV: Don't allow context-free system post destruction (#32523) 2025-05-05 09:58:29 +08:00
handle_chunk_upload.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
hashtag_autocomplete_service.rb FEATURE: add icons and emojis to category (#31795) 2025-03-26 09:46:17 +04:00
heat_settings_updater.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
inline_uploads.rb FIX: Use filename as alt for hotlinked image uploads (#31651) 2025-03-11 14:45:06 +11:00
locale_normalizer.rb FIX: Show localization for regionless locale if they exist (#33702) 2025-07-21 15:45:14 +08:00
notification_emailer.rb DEV: Bump rubocop_discourse (#29608) 2024-11-06 06:27:49 +08:00
post_action_notifier.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
post_alerter.rb FEATURE: disable link notification user preference (#35352) 2025-10-14 10:53:05 +02:00
post_bookmarkable.rb FIX: Serialize categories for bookmarks (#26606) 2024-04-17 17:23:47 +03:00
post_owner_changer.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
push_notification_pusher.rb FIX: avoid double base path on push notification (#32228) 2025-04-12 08:16:53 +10:00
random_topic_selector.rb DEV: Remove Discourse.redis.delete_prefixed (#22103) 2023-06-16 12:44:35 +10:00
registered_bookmarkable.rb FIX: Show deleted bookmark reminders in user bookmarks menu (#25905) 2024-02-29 09:03:49 +10:00
search_indexer.rb FEATURE: Add full-text search for chat messages (#34704) 2025-10-22 11:30:35 +02:00
sidebar_section_links_updater.rb DEV: Limit the number of category sidebar links a user can have (#26756) 2024-04-25 13:21:39 -05:00
sidebar_site_settings_backfiller.rb DEV: Drop distributed mutex from SidebarSiteSettingsBackfiller#backfill! (#25674) 2024-02-15 06:21:03 +08:00
site_setting_update_existing_users.rb FIX: Timeout issue when updating a large collection of users when changing the default_categories_* and default_tags_* SiteSettings (#33665) 2025-08-20 12:55:53 -05:00
site_settings_task.rb FEATURE: mandatory fields for group site setting (#26612) 2024-04-18 08:53:52 +10:00
staff_action_logger.rb DEV: Allow impersonation without session swapping (#34213) 2025-08-21 14:18:15 +08:00
tag_hashtag_data_source.rb DEV: add tag hashtag data source style type (#33289) 2025-06-20 18:08:47 +04:00
theme_settings_migrations_runner.rb DEV: Rename theme-transpiler to asset-processor (#35498) 2025-10-20 14:16:46 +01:00
themes_install_task.rb FEATURE: Theme settings migrations (#24071) 2023-11-02 08:10:15 +03:00
topic_bookmarkable.rb FIX: Serialize categories for bookmarks (#26606) 2024-04-17 17:23:47 +03:00
topic_status_updater.rb FIX: Message for bulk closing topics silently (#27400) 2024-06-11 09:36:54 +10:00
topic_timestamp_changer.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
tracked_topics_updater.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
trust_level_granter.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
user_action_manager.rb FIX: permanent delete of posts by deleted users (#28992) 2024-09-24 12:26:31 +03:00
user_activator.rb DEV: Move more data into the server session (#35145) 2025-10-03 10:20:32 +02:00
user_anonymizer.rb FEATURE: Add option to hide full name input at signup (#30471) 2024-12-30 22:26:20 +03:00
user_authenticator.rb DEV: Move more data into the server session (#35145) 2025-10-03 10:20:32 +02:00
user_destroyer.rb DEV: Hand-pick Rails/WhereNot autofixes (#35117) 2025-10-03 13:29:22 +02:00
user_merger.rb DEV: rename topic_id to timerable_id for BaseTimer (#34667) 2025-09-17 13:19:17 +08:00
user_notification_renderer.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
user_notification_schedule_processor.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
user_password_expirer.rb DEV: Migrate user passwords data to UserPassword table (#28746) 2024-10-10 09:23:06 +08:00
user_silencer.rb FIX: Moderator notifications when new post auto-silences a user (#35403) 2025-10-15 16:07:56 +08:00
user_stat_count_updater.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
user_suspender.rb SECURITY: Don't allow suspending staff users via other_user_ids param 2024-07-03 20:49:29 +08:00
user_updater.rb FEATURE: disable link notification user preference (#35352) 2025-10-14 10:53:05 +02:00
username_changer.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
username_checker_service.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
web_hook_emitter.rb DEV: Move webhook event header modifier for redelivery-recalucation (#27177) 2024-05-24 10:37:10 -05:00
wildcard_domain_checker.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
wildcard_url_checker.rb DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
word_watcher.rb FIX: wildcard watched word and regexps (#35217) 2025-10-06 20:49:02 +02:00