discourse

mirror of https://gh.wpcy.net/https://github.com/discourse/discourse.git synced 2026-06-19 02:05:37 +08:00

History

Sam b8abe100c5 FEATURE: add agentic execution mode for AI personas (#38230 ) Introduce an "agentic" execution mode as an alternative to the default fixed-turn/tool-limit approach. In agentic mode, personas use a configurable token budget (`max_turn_tokens`) to govern how long a tool-use session can run, with automatic context compression when the conversation exceeds a configurable threshold percentage (`compression_threshold`) of the model's context window. Key changes: - Add `execution_mode`, `max_turn_tokens`, and `compression_threshold` columns to `ai_personas` via migration - Refactor `Bot#reply` to support token-budget loop control with a thread-local token accumulator, budget exhaustion hints, and a safety valve at 100 completions - Add `maybe_compress_context` which summarizes middle conversation messages when token usage crosses the compression threshold, preserving system prompt and recent tail messages - Update `StreamReplyCustomToolsSession` to track accumulated tokens across rounds and handle budget exhaustion in the custom tools path - Discount cached tokens (Anthropic) in the token accumulator to avoid over-counting reused KV cache prefixes - Update persona editor UI with execution mode selector and conditional fields (agentic shows token budget/compression; default shows max context posts)	2026-03-05 15:06:54 +11:00
..
prompt_evaluator.rb	FEATURE: add agentic execution mode for AI personas (#38230 )	2026-03-05 15:06:54 +11:00
single_test_runner.rb	FEATURE: add agentic execution mode for AI personas (#38230 )	2026-03-05 15:06:54 +11:00

FEATURE: add agentic execution mode for AI personas (#38230 )

Introduce an "agentic" execution mode as an alternative to the
default fixed-turn/tool-limit approach. In agentic mode, personas
use a configurable token budget (`max_turn_tokens`) to govern how
long a tool-use session can run, with automatic context compression
when the conversation exceeds a configurable threshold percentage
(`compression_threshold`) of the model's context window.

Key changes:

- Add `execution_mode`, `max_turn_tokens`, and `compression_threshold`
  columns to `ai_personas` via migration
- Refactor `Bot#reply` to support token-budget loop control with a
  thread-local token accumulator, budget exhaustion hints, and a
  safety valve at 100 completions
- Add `maybe_compress_context` which summarizes middle conversation
  messages when token usage crosses the compression threshold,
  preserving system prompt and recent tail messages
- Update `StreamReplyCustomToolsSession` to track accumulated tokens
  across rounds and handle budget exhaustion in the custom tools path
- Discount cached tokens (Anthropic) in the token accumulator to
  avoid over-counting reused KV cache prefixes
- Update persona editor UI with execution mode selector and
  conditional fields (agentic shows token budget/compression;
  default shows max context posts)

2026-03-05 15:06:54 +11:00

prompt_evaluator.rb

FEATURE: add agentic execution mode for AI personas (#38230 )

2026-03-05 15:06:54 +11:00

single_test_runner.rb

FEATURE: add agentic execution mode for AI personas (#38230 )

2026-03-05 15:06:54 +11:00

菲码源库

快速导航

产品服务

生态伙伴