discourse/lib/discourse_plugin_registry.rb
Joffrey JAFFEUX 7ecb945ec4
FEATURE: Add full-text search for chat messages (#34704)
## Overview

This PR introduces comprehensive search functionality for chat messages,
enabling users to search through their chat history both globally across
all accessible channels and within specific channels.

### Search Capabilities

**All-Channel Search**: When no channel is specified, users can search
across all channels they have access to. The search respects channel
permissions through `ChannelFetcher.all_secured_channel_ids`, ensuring
users only see results from channels they can view.

**Per-Channel Search**: Users can scope their search to a specific
channel by providing a `channel_id` parameter, useful for finding
messages within a particular conversation context.

**Search Features**:
- Full-text search using PostgreSQL's tsvector/tsquery
- Advanced filters: `@username` to filter by author, `#channel` to
filter by channel slug
- Sort options: relevance (default) or latest
- Pagination support
- Search data weighted by relevance

## Site Setting: `chat_search_enabled`

This feature is gated behind the `chat_search_enabled` site setting,
which is currently:
- **Default**: `false`
- **Hidden**: `true`
- **Client-accessible**: `true`

### Deployment Strategy

Due to the need for chat messages to be indexed before search becomes
useful, we're implementing a two-phase deployment:

**Phase 1 (Initial Merge)**:
- `chat_search_enabled` remains `false` and hidden
- The `register_search_index` uses default (true) instead of `chat_search_enabled` value
- This allows the reindexing infrastructure to begin indexing existing
chat messages even if we don't show the UI yet

**Wait Period**:
- Wait at least one week after Phase 1 deployment
- `Jobs::ReindexSearch` runs every 2 hours and will progressively index
all chat messages
- This ensures most sites have a significant part of their chat history indexed

**Phase 2 (Follow-up Merge)**:
- Set `chat_search_enabled` default to `true` and unhide it
- Update the `register_search_index` enabled proc uses the default
(true) instead of using the `chat_search_enabled` setting
- Users can now access search with pre-indexed data

**Rationale**: Without this phased approach, users would see the search
UI immediately but receive no results until the reindexing job runs,
creating a confusing experience. By pre-indexing while the UI is hidden,
we ensure search works immediately when enabled.

## New Plugin API: `register_search_index`

This PR introduces a new plugin API that allows plugins to register
custom search indexes that integrate seamlessly with Discourse's search
infrastructure.

### API Signature

```ruby
register_search_index(
  model_class:,              # The ActiveRecord model to index
  search_data_class:,        # The model for storing search data
  index_version:,            # Version number for re-indexing
  search_data:,              # Proc that returns weighted search data
  load_unindexed_record_ids:,# Proc that finds records needing indexing
  enabled:                   # Optional proc to enable/disable (default: -> { true })
)
```

### How It Works

**Integration with SearchIndexer**: When `SearchIndexer.index(obj)` is
called, it checks registered search handlers for the object's type. If a
handler matches, it:
1. Calls the `search_data` proc with the object and an `IndexerHelper`
instance
2. Receives weighted search data (`:a_weight`, `:b_weight`, `:c_weight`,
`:d_weight`)
3. Updates the corresponding search data table with PostgreSQL's
tsvector

**Integration with Jobs::ReindexSearch**: The scheduled job (runs every
2 hours) calls `rebuild_registered_search_handlers`, which:
1. Iterates through all registered search handlers
2. Skips handlers where `enabled` proc returns `false`
3. Calls `load_unindexed_record_ids` to find records needing indexing
4. Indexes up to `limit` records per handler (default: 10,000)

### Chat Implementation Example

```ruby
register_search_index(
  model_class: Chat::Message,
  search_data_class: Chat::MessageSearchData,
  index_version: 1,
  search_data: proc { |message, indexer_helper|
    {
      a_weight: message.message,
      d_weight: indexer_helper.scrub_html(message.cooked)[0..600_000]
    }
  },
  load_unindexed_record_ids: proc { |limit:, index_version:|
    Chat::Message
      .joins("LEFT JOIN chat_message_search_data ON chat_message_id = chat_messages.id")
      .where(
        "chat_message_search_data.locale IS NULL OR 
         chat_message_search_data.locale != ? OR 
         chat_message_search_data.version != ?",
        SiteSetting.default_locale,
        index_version
      )
      .order("chat_messages.id ASC")
      .limit(limit)
      .pluck(:id)
  }
)
```

Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>
Co-authored-by: Loïc Guitaut <5648+Flink@users.noreply.github.com>
2025-10-22 11:30:35 +02:00

319 lines
11 KiB
Ruby
Vendored

# frozen_string_literal: true
# A class that handles interaction between a plugin and the Discourse App.
#
class DiscoursePluginRegistry
@@register_names = Set.new
# Plugins often need to be able to register additional handlers, data, or
# classes that will be used by core classes. This should be used if you
# need to control which type the registry is, and if it doesn't need to
# be removed if the plugin is disabled.
#
# Shortcut to create new register in the plugin registry
# - Register is created in a class variable using the specified name/type
# - Defines singleton method to access the register
# - Defines instance method as a shortcut to the singleton method
# - Automatically deletes the register on registry.reset!
def self.define_register(register_name, type)
return if respond_to?(register_name)
@@register_names << register_name
define_singleton_method(register_name) do
instance_variable_get(:"@#{register_name}") ||
instance_variable_set(:"@#{register_name}", type.new)
end
define_method(register_name) { self.class.public_send(register_name) }
end
# Plugins often need to add values to a list, and we need to filter those
# lists at runtime to ignore values from disabled plugins. Unlike define_register,
# the type of the register cannot be defined, and is always Array.
#
# Create a new register (see `define_register`) with some additions:
# - Register is created in a class variable using the specified name/type
# - Defines singleton method to access the register
# - Defines instance method as a shortcut to the singleton method
# - Automatically deletes the register on registry.reset!
def self.define_filtered_register(register_name)
return if respond_to?(register_name)
define_register(register_name, Array)
singleton_class.alias_method :"_raw_#{register_name}", :"#{register_name}"
define_singleton_method(register_name) do
public_send(:"_raw_#{register_name}").filter_map { |h| h[:value] if h[:plugin].enabled? }.uniq
end
define_singleton_method("register_#{register_name.to_s.singularize}") do |value, plugin|
public_send(:"_raw_#{register_name}") << { plugin: plugin, value: value }
end
yield(self) if block_given?
end
define_register :javascripts, Set
define_register :auth_providers, Set
define_register :service_workers, Set
define_register :stylesheets, Hash
define_register :mobile_stylesheets, Hash
define_register :desktop_stylesheets, Hash
define_register :color_definition_stylesheets, Hash
define_register :serialized_current_user_fields, Set
define_register :seed_data, ActiveSupport::HashWithIndifferentAccess
define_register :locales, ActiveSupport::HashWithIndifferentAccess
define_register :svg_icons, Set
define_register :custom_html, Hash
define_register :html_builders, Hash
define_register :seed_path_builders, Set
define_register :vendored_pretty_text, Set
define_register :vendored_core_pretty_text, Set
define_register :seedfu_filter, Set
define_register :demon_processes, Set
define_register :groups_callback_for_users_search_controller_action, Hash
define_register :mail_pollers, Set
define_register :site_setting_areas, Set
define_register :admin_config_login_routes, Set
define_register :discourse_dev_populate_reviewable_types, Set
define_register :category_update_param_with_callback, Hash
define_filtered_register :staff_user_custom_fields
define_filtered_register :public_user_custom_fields
define_filtered_register :staff_editable_topic_custom_fields
define_filtered_register :public_editable_topic_custom_fields
define_filtered_register :self_editable_user_custom_fields
define_filtered_register :staff_editable_user_custom_fields
define_filtered_register :editable_group_custom_fields
define_filtered_register :group_params
define_filtered_register :topic_thumbnail_sizes
define_filtered_register :topic_preloader_associations
define_filtered_register :category_list_topics_preloader_associations
define_filtered_register :api_parameter_routes
define_filtered_register :api_key_scope_mappings
define_filtered_register :user_api_key_scope_mappings
define_filtered_register :permitted_bulk_action_parameters
define_filtered_register :reviewable_params
define_filtered_register :reviewable_score_links
define_filtered_register :presence_channel_prefixes
define_filtered_register :email_notification_filters
define_filtered_register :push_notification_filters
define_filtered_register :notification_consolidation_plans
define_filtered_register :email_unsubscribers
define_filtered_register :user_destroyer_on_content_deletion_callbacks
define_filtered_register :hashtag_autocomplete_data_sources
define_filtered_register :hashtag_autocomplete_contextual_type_priorities
define_filtered_register :search_groups_set_query_callbacks
define_filtered_register :search_handlers
define_filtered_register :stats
define_filtered_register :bookmarkables
define_filtered_register :list_suggested_for_providers
define_filtered_register :post_action_notify_user_handlers
define_filtered_register :post_strippers
define_filtered_register :problem_checks
define_filtered_register :flag_applies_to_types
define_filtered_register :custom_filter_mappings
define_filtered_register :reviewable_types do |singleton|
singleton.define_singleton_method("reviewable_types_lookup") do
public_send(:"_raw_reviewable_types")
.filter_map { |h| { plugin: h[:plugin].name, klass: h[:value] } if h[:plugin].enabled? }
.uniq
end
end
def self.register_auth_provider(auth_provider)
self.auth_providers << auth_provider
end
def self.register_mail_poller(mail_poller)
self.mail_pollers << mail_poller
end
def register_js(filename, options = {})
# If we have a server side option, add that too.
self.class.javascripts << filename
end
def self.register_service_worker(filename, options = {})
self.service_workers << filename
end
def self.register_svg_icon(icon)
self.svg_icons << icon.strip
end
def register_css(filename, plugin_directory_name)
self.class.stylesheets[plugin_directory_name] ||= Set.new
self.class.stylesheets[plugin_directory_name] << filename
end
def self.register_locale(locale, options = {})
self.locales[locale] = options
end
def self.unregister_locale(locale)
raise "unregister_locale can only be used in tests" if !Rails.env.test?
self.locales.delete(locale)
end
def register_archetype(name, options = {})
Archetype.register(name, options)
end
JS_REGEX = /\.js$|\.js\.erb$|\.js\.es6\z/
def self.register_asset(asset, opts = nil, plugin_directory_name = nil)
if asset =~ JS_REGEX
if opts == :vendored_pretty_text
self.vendored_pretty_text << asset
elsif opts == :vendored_core_pretty_text
self.vendored_core_pretty_text << asset
else
self.javascripts << asset
end
elsif asset =~ /\.css$|\.scss\z/
if opts == :mobile
self.mobile_stylesheets[plugin_directory_name] ||= Set.new
self.mobile_stylesheets[plugin_directory_name] << asset
elsif opts == :desktop
self.desktop_stylesheets[plugin_directory_name] ||= Set.new
self.desktop_stylesheets[plugin_directory_name] << asset
elsif opts == :color_definitions
self.color_definition_stylesheets[plugin_directory_name] = asset
else
self.stylesheets[plugin_directory_name] ||= Set.new
self.stylesheets[plugin_directory_name] << asset
end
end
end
def self.stylesheets_exists?(plugin_directory_name, target = nil)
case target
when :desktop
self.desktop_stylesheets[plugin_directory_name].present?
when :mobile
self.mobile_stylesheets[plugin_directory_name].present?
else
self.stylesheets[plugin_directory_name].present?
end
end
def self.register_seed_data(key, value)
self.seed_data[key] = value
end
def self.register_seed_path_builder(&block)
seed_path_builders << block
end
def self.register_html_builder(name, &block)
html_builders[name] ||= []
html_builders[name] << block
end
def self.build_html(name, ctx = nil, **kwargs)
builders = html_builders[name] || []
builders.map { |b| b.call(ctx, **kwargs) }.join("\n").html_safe
end
def self.seed_paths
result = SeedFu.fixture_paths.dup
unless Rails.env.test? && ENV["LOAD_PLUGINS"] != "1"
seed_path_builders.each { |b| result += b.call }
end
result.uniq
end
def self.register_seedfu_filter(filter = nil)
self.seedfu_filter << filter
end
VENDORED_CORE_PRETTY_TEXT_MAP = {
"moment.js" => "app/assets/javascripts/discourse/node_modules/moment/moment.js",
"moment-timezone.js" =>
"app/assets/javascripts/discourse/node_modules/moment-timezone/builds/moment-timezone-with-data.js",
}
def self.core_asset_for_name(name)
asset = VENDORED_CORE_PRETTY_TEXT_MAP[name]
raise KeyError, "Asset #{name} not found in #{VENDORED_CORE_PRETTY_TEXT_MAP}" unless asset
asset
end
def self.clear_modifiers!
if Rails.env.test? && GlobalSetting.load_plugins?
raise "Clearing modifiers during a plugin spec run will affect all future specs. Use unregister_modifier instead."
end
@modifiers = nil
end
def self.register_modifier(plugin_instance, name, &blk)
@modifiers ||= {}
modifiers = @modifiers[name] ||= []
modifiers << [plugin_instance, blk]
end
def self.unregister_modifier(plugin_instance, name, &blk)
raise "unregister_modifier can only be used in tests" if !Rails.env.test?
modifiers_for_name = @modifiers&.[](name)
raise "no #{name} modifiers found" if !modifiers_for_name
i = modifiers_for_name.find_index { |info| info == [plugin_instance, blk] }
raise "no modifier found for that plugin/block combination" if !i
modifiers_for_name.delete_at(i)
end
def self.apply_modifier(name, arg, *more_args)
return arg if !@modifiers
registered_modifiers = @modifiers[name]
return arg if !registered_modifiers
# iterate as fast as possible to minimize cost (avoiding each)
# also erases one stack frame
length = registered_modifiers.length
index = 0
while index < length
plugin_instance, block = registered_modifiers[index]
arg = block.call(arg, *more_args) if plugin_instance.enabled?
index += 1
end
arg
end
def self.reset!
@@register_names.each { |name| instance_variable_set(:"@#{name}", nil) }
clear_modifiers!
end
def self.reset_register!(register_name)
found_register = @@register_names.detect { |name| name == register_name }
instance_variable_set(:"@#{found_register}", nil) if found_register
end
end