discourse/spec/support
Joffrey JAFFEUX 7ecb945ec4
FEATURE: Add full-text search for chat messages (#34704)
## Overview

This PR introduces comprehensive search functionality for chat messages,
enabling users to search through their chat history both globally across
all accessible channels and within specific channels.

### Search Capabilities

**All-Channel Search**: When no channel is specified, users can search
across all channels they have access to. The search respects channel
permissions through `ChannelFetcher.all_secured_channel_ids`, ensuring
users only see results from channels they can view.

**Per-Channel Search**: Users can scope their search to a specific
channel by providing a `channel_id` parameter, useful for finding
messages within a particular conversation context.

**Search Features**:
- Full-text search using PostgreSQL's tsvector/tsquery
- Advanced filters: `@username` to filter by author, `#channel` to
filter by channel slug
- Sort options: relevance (default) or latest
- Pagination support
- Search data weighted by relevance

## Site Setting: `chat_search_enabled`

This feature is gated behind the `chat_search_enabled` site setting,
which is currently:
- **Default**: `false`
- **Hidden**: `true`
- **Client-accessible**: `true`

### Deployment Strategy

Because chat messages must be indexed before search becomes useful,
we're implementing a two-phase deployment:

**Phase 1 (Initial Merge)**:
- `chat_search_enabled` remains `false` and hidden
- The `register_search_index` `enabled` proc uses its default (`true`) instead of the `chat_search_enabled` value
- This allows the reindexing infrastructure to begin indexing existing
chat messages even if we don't show the UI yet

**Wait Period**:
- Wait at least one week after Phase 1 deployment
- `Jobs::ReindexSearch` runs every 2 hours and will progressively index
all chat messages
- This ensures most sites have a significant part of their chat history indexed

**Phase 2 (Follow-up Merge)**:
- Set `chat_search_enabled` default to `true` and unhide it
- Update the `register_search_index` `enabled` proc to use the
`chat_search_enabled` setting instead of the default (`true`)
- Users can now access search with pre-indexed data

**Rationale**: Without this phased approach, users would see the search
UI immediately but receive no results until the reindexing job runs,
creating a confusing experience. By pre-indexing while the UI is hidden,
we ensure search works immediately when enabled.
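
A minimal sketch of how indexing and UI visibility are decoupled across the phases. `FakeSiteSetting` is a stand-in so the snippet runs outside Rails, and the Phase 2 proc is an assumption about the follow-up change, not code from this PR.

```ruby
# Stand-in for Discourse's SiteSetting so the sketch runs outside Rails.
FakeSiteSetting = Struct.new(:chat_search_enabled)
SETTINGS = FakeSiteSetting.new(false)

# Phase 1: the indexer ignores the setting, so reindexing starts right away.
PHASE_ONE_ENABLED = -> { true }

# Phase 2 (assumed wiring): the indexer follows the site setting.
PHASE_TWO_ENABLED = -> { SETTINGS.chat_search_enabled }

# The search UI is gated on the setting in both phases.
def search_ui_visible?(settings)
  settings.chat_search_enabled == true
end

puts PHASE_ONE_ENABLED.call        # indexing runs while the UI is still hidden
puts search_ui_visible?(SETTINGS)
```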

## New Plugin API: `register_search_index`

This PR introduces a new plugin API that allows plugins to register
custom search indexes that integrate seamlessly with Discourse's search
infrastructure.

### API Signature

```ruby
register_search_index(
  model_class:,              # The ActiveRecord model to index
  search_data_class:,        # The model for storing search data
  index_version:,            # Version number for re-indexing
  search_data:,              # Proc that returns weighted search data
  load_unindexed_record_ids:,# Proc that finds records needing indexing
  enabled:                   # Optional proc to enable/disable (default: -> { true })
)
```

### How It Works

**Integration with SearchIndexer**: When `SearchIndexer.index(obj)` is
called, it checks registered search handlers for the object's type. If a
handler matches, it:
1. Calls the `search_data` proc with the object and an `IndexerHelper`
instance
2. Receives weighted search data (`:a_weight`, `:b_weight`, `:c_weight`,
`:d_weight`)
3. Updates the corresponding search data table with PostgreSQL's
tsvector
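
The dispatch in steps 1 and 2 can be sketched in plain Ruby. All names here (`SearchHandler`, `FakeIndexerHelper`, `FakeMessage`, `weighted_search_data`) are hypothetical stand-ins, and the tag stripper is deliberately naive; the real `SearchIndexer` and `IndexerHelper` do more than this.

```ruby
# Hypothetical handler record; fields mirror two of the keyword arguments above.
SearchHandler = Struct.new(:model_class, :search_data, keyword_init: true)

# Stand-in for IndexerHelper with a naive tag stripper, for this sketch only.
class FakeIndexerHelper
  def scrub_html(html)
    html.to_s.gsub(/<[^>]*>/, " ").squeeze(" ").strip
  end
end

# Stand-in for Chat::Message with just the two fields the proc reads.
FakeMessage = Struct.new(:message, :cooked)

HANDLERS = [
  SearchHandler.new(
    model_class: FakeMessage,
    search_data: proc { |message, helper|
      { a_weight: message.message, d_weight: helper.scrub_html(message.cooked) }
    },
  ),
]

# Returns the weighted data for obj, or nil when no handler matches its type.
def weighted_search_data(obj, handlers: HANDLERS, helper: FakeIndexerHelper.new)
  handler = handlers.find { |h| obj.is_a?(h.model_class) }
  return nil unless handler

  handler.search_data.call(obj, helper)
end

p weighted_search_data(FakeMessage.new("hi there", "<p>hi there</p>"))
```

Step 3, writing the tsvector to the search data table, is left out because it depends on the database.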

**Integration with Jobs::ReindexSearch**: The scheduled job (runs every
2 hours) calls `rebuild_registered_search_handlers`, which:
1. Iterates through all registered search handlers
2. Skips handlers where `enabled` proc returns `false`
3. Calls `load_unindexed_record_ids` to find records needing indexing
4. Indexes up to `limit` records per handler (default: 10,000)
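
The loop described above can be sketched as follows. `RebuildHandler` and `rebuild_registered_handlers` are hypothetical names, and the "indexing" step is reduced to collecting the ids that would be indexed; it is not the job's actual implementation.

```ruby
# Hypothetical handler record holding the procs passed to register_search_index.
RebuildHandler = Struct.new(
  :name, :index_version, :enabled, :load_unindexed_record_ids,
  keyword_init: true,
)

# Returns a hash of handler name => ids that would be indexed in this run.
def rebuild_registered_handlers(handlers, limit: 10_000)
  handlers.each_with_object({}) do |handler, indexed|
    # Skip handlers whose enabled proc returns false.
    next unless handler.enabled.call

    ids = handler.load_unindexed_record_ids.call(
      limit: limit,
      index_version: handler.index_version,
    )
    # Guard against a proc that returns more ids than requested.
    indexed[handler.name] = ids.first(limit)
  end
end

chat = RebuildHandler.new(
  name: :chat,
  index_version: 1,
  enabled: -> { true },
  load_unindexed_record_ids: proc { |limit:, index_version:| (1..25).first(limit) },
)
p rebuild_registered_handlers([chat], limit: 10)
```

Because each run handles at most `limit` records per handler, a large backlog is drained over several runs of the scheduled job.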

### Chat Implementation Example

```ruby
register_search_index(
  model_class: Chat::Message,
  search_data_class: Chat::MessageSearchData,
  index_version: 1,
  search_data: proc { |message, indexer_helper|
    {
      a_weight: message.message,
      d_weight: indexer_helper.scrub_html(message.cooked)[0..600_000]
    }
  },
  load_unindexed_record_ids: proc { |limit:, index_version:|
    Chat::Message
      .joins("LEFT JOIN chat_message_search_data ON chat_message_id = chat_messages.id")
      .where(
        "chat_message_search_data.locale IS NULL OR
         chat_message_search_data.locale != ? OR
         chat_message_search_data.version != ?",
        SiteSetting.default_locale,
        index_version
      )
      .order("chat_messages.id ASC")
      .limit(limit)
      .pluck(:id)
  }
)
```
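
The `WHERE` clause in `load_unindexed_record_ids` amounts to a staleness check. A plain-Ruby approximation, using a hypothetical `SearchDataRow` stand-in and ignoring SQL's NULL comparison semantics for a NULL `version`:

```ruby
# Hypothetical in-memory stand-in for a chat_message_search_data row.
SearchDataRow = Struct.new(:locale, :version)

# True when a message has no search data row (the LEFT JOIN yields NULLs),
# or when its row was built for a different locale or index version.
def needs_indexing?(row, default_locale:, index_version:)
  row.nil? ||
    row.locale.nil? ||
    row.locale != default_locale ||
    row.version != index_version
end

puts needs_indexing?(nil, default_locale: "en", index_version: 1)
puts needs_indexing?(SearchDataRow.new("en", 1), default_locale: "en", index_version: 1)
```

Bumping `index_version` therefore marks every existing row as stale, which is how a change to the `search_data` proc would trigger a full re-index.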

Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>
Co-authored-by: Loïc Guitaut <5648+Flink@users.noreply.github.com>
2025-10-22 11:30:35 +02:00

| Name | Last commit | Date |
| --- | --- | --- |
| locales | DEV: Upgrade Rails to version 7.1 | 2024-07-04 10:58:21 +02:00 |
| shared_examples | FEATURE: Themeable site settings (#32233) | 2025-07-16 11:00:21 +10:00 |
| backups_helpers.rb | DEV: Add backup helpers for specs (#28394) | 2024-08-16 14:51:57 +10:00 |
| bookmarkable_helper.rb | DEV: Change Bookmarkable registration to DiscoursePluginRegistry (#20556) | 2023-03-08 10:39:12 +10:00 |
| concurrency.rb | DEV: Fix various rubocop lints (#24749) | 2023-12-06 23:25:00 +01:00 |
| diagnostics_helper.rb | DEV: track client ids published to message bus (#32878) | 2025-05-23 14:23:50 +10:00 |
| discourse_connect_support_helpers.rb | DEV: Improvements to DiscourseConnect spec helpers (#35173) | 2025-10-03 14:53:56 +01:00 |
| discourse_event_helper.rb | DEV: Apply syntax_tree formatting to spec/* | 2023-01-09 11:49:28 +00:00 |
| dom_matcher.rb | DEV: Update minitest to 5.19.0 (#22821) | 2023-07-27 12:18:40 +02:00 |
| fake_bookmark_hashtag_data_source.rb | DEV: Introduce enabled? API to hashtag data sources (#22632) | 2023-07-18 09:39:01 +10:00 |
| fake_logger.rb | DEV: Upgrade Rails to version 7.1 | 2024-07-04 10:58:21 +02:00 |
| fake_s3.rb | DEV: Bump aws-sdk-core in prep for aws-sdk-mediaconvert (#33250) | 2025-06-20 16:41:01 -06:00 |
| fast_image_helpers.rb | FIX: remove 'crawl_images' site setting (#14646) | 2021-10-19 17:12:29 +05:30 |
| final_destination_helper.rb | Revert "DEV: Allow webmock to intercept FinalDestination::HTTP requests (#20575)" (#20576) | 2023-03-08 11:26:32 +08:00 |
| helpers.rb | FIX: Theme site settings not reloading across processes (#34242) | 2025-08-12 14:18:32 +10:00 |
| i18n_helpers.rb | DEV: Upgrade Rails to version 7.1 | 2024-07-04 10:58:21 +02:00 |
| imap_helper.rb | DEV: lint against Layout/EmptyLineBetweenDefs (#24914) | 2023-12-15 23:46:04 +08:00 |
| integration_helpers.rb | DEV: Finish renaming secure_session to server_session | 2025-09-23 10:35:02 +02:00 |
| match_html_matcher.rb | DEV: Remove invalid parsing options (#30545) | 2025-01-03 13:17:49 +01:00 |
| mock_git_importer.rb | | |
| negated_matcher.rb | | |
| omniauth_helpers.rb | DEV: More system specs for signup/login (#27150) | 2024-05-23 10:01:05 -03:00 |
| onebox_helpers.rb | DEV: Introduce core features system specs for plugins | 2025-03-27 12:12:01 +01:00 |
| problem_check_matcher.rb | DEV: Move non scheduled problem checks to classes (#26122) | 2024-03-14 10:55:01 +08:00 |
| rate_limit_matcher.rb | DEV: Apply syntax_tree formatting to spec/* | 2023-01-09 11:49:28 +00:00 |
| sample_plugin_site_settings.yml | FIX: Sort plugins by their setting category name (#25128) | 2024-01-08 09:57:25 +10:00 |
| service_matchers.rb | DEV: Display better output when inspecting service steps | 2024-12-12 15:21:10 +01:00 |
| sidekiq_helpers.rb | FIX: send email to normalized email owner when hiding emails (#23524) | 2023-09-12 11:06:35 +10:00 |
| site_settings_helpers.rb | DEV: Avoid leaking new site setting states in test environment (#21713) | 2023-05-25 07:53:57 +08:00 |
| system_helpers.rb | FEATURE: Add full-text search for chat messages (#34704) | 2025-10-22 11:30:35 +02:00 |
| test_second_factor_action.rb | DEV: Apply syntax_tree formatting to spec/* | 2023-01-09 11:49:28 +00:00 |
| time_matcher.rb | DEV: support nil values in the eq_time matcher (#22116) | 2023-06-20 19:06:40 +04:00 |
| topic_guardian_can_see_consistency_check.rb | DEV: Apply syntax_tree formatting to spec/* | 2023-01-09 11:49:28 +00:00 |
| ts_vector_matcher.rb | FIX: domain searches not working properly for URLs (#20136) | 2023-02-03 09:55:28 +11:00 |
| uploads_helpers.rb | FIX: Use dualstack S3 endpoint for direct uploads (#29611) | 2024-11-07 11:06:39 +10:00 |
| webauthn_integration_helpers.rb | DEV: Refactor webauthn to support passkeys (1/3) (#23586) | 2023-10-03 14:59:28 -04:00 |