mirror of
https://gh.wpcy.net/https://github.com/discourse/discourse.git
synced 2026-06-19 05:59:26 +08:00
## Summary

Adds a new `aws_bedrock_converse` inference provider that uses the official AWS SDK (`aws-sdk-bedrockruntime`) and the Converse API. This runs alongside the existing `aws_bedrock` provider: fully additive, zero risk to existing configurations.

### Why a new provider?

The existing `aws_bedrock` provider manually handles SigV4 signing, URL construction, binary event stream decoding, and maintains a hardcoded model ID mapping table. It only supports Claude and Nova models.

The new provider delegates all of this to the official AWS SDK, which means:

- **Model-agnostic**: works with any model available on Bedrock (Claude, Nova, Kimi, MiniMax, Mistral, Llama, DeepSeek, NVIDIA, Qwen, GLM, etc.) without any model-specific code
- **Application Inference Profiles**: users can set cross-region profiles (`us.anthropic.claude-sonnet-4-20250514-v1:0`) or application inference profile ARNs directly as the model name
- **Bedrock API Key auth**: supports the new AWS Bedrock API keys (Bearer token auth) in addition to IAM access keys, STS role assumption, and automatic credential resolution from environment/instance profiles
- **No maintenance burden**: no model ID mapping table to update when AWS adds new models, no manual SigV4 signing, no binary event stream decoding
- **Native tools only**: no XML tool fallback; uses the Converse API's built-in tool support

### Authentication options (priority order)

| Config | Auth method |
|---|---|
| `role_arn` set | STS AssumeRole (SigV4) |
| `access_key_id` set | Static IAM credentials (SigV4) |
| API key set (no `access_key_id`/`role_arn`) | Bearer token (Bedrock API key) |
| Nothing set | SDK auto-resolves (env vars, instance profile, ECS task role) |

### Features supported

- Streaming and non-streaming completions
- Native tool use with `tool_choice` (auto/any/specific tool)
- Structured output via Converse API's `output_config` (models that support it)
- Extended thinking / adaptive thinking with signature preservation for multi-turn
- Interleaved thinking with tool calls (thinking blocks preserved per tool_call message)
- Prompt caching via `cache_point` blocks
- Effort parameter (low/medium/high/max)
- `extra_model_fields` provider param for arbitrary `additionalModelRequestFields` (beta features like `anthropic_beta`, 1M context, interleaved thinking)

### New files

- `lib/completions/endpoints/aws_bedrock_converse.rb`: endpoint using `Aws::BedrockRuntime::Client`
- `lib/completions/dialects/converse.rb`: unified Converse API dialect
- `lib/completions/dialects/converse_tools.rb`: tool formatting
- `lib/completions/converse_message_processor.rb`: response processing for SDK typed objects

## Tested against real Bedrock API

All tests performed using Bedrock API Key auth (Bearer token) against live endpoints with 9 different models from 8 providers (`-` means not tested):

| Test | Claude Sonnet 4 | Claude Haiku 4.5 | Kimi K2.5 | MiniMax M2 | DeepSeek 3.2 | NVIDIA Nemotron 3 120B | Qwen3 Next 80B | GLM 5 | Mistral Small |
|---|---|---|---|---|---|---|---|---|---|
| Non-streaming text | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Streaming text | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Multi-turn conversation | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Tool use (non-streaming) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Tool use (streaming) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ model unsupported |
| Structured output (non-streaming) | - | ✅ | ❌ model unsupported | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ model unsupported |
| Structured output (streaming) | - | ✅ | ❌ model unsupported | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ model unsupported |
| Bearer token auth | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Cross-region inference profile | ✅ | ✅ | - | - | - | - | - | - | - |
| Audit logging + token tracking | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

> **Notes:**
> - Claude Sonnet 4 structured output not tested: it requires 4.5+ for this feature and those cross-region profiles were not available in the test region.
> - Kimi K2.5 and Mistral Small do not support Bedrock's native structured output.
> - Mistral Small does not support streaming tool use.
> - All ❌ results are model-level limitations, not code issues; the Converse API correctly surfaces the error.

## Test plan

- [ ] Existing `aws_bedrock` provider tests pass (`bin/rspec spec/lib/completions/endpoints/aws_bedrock_spec.rb`)
- [ ] New provider tests pass (`bin/rspec spec/lib/completions/endpoints/aws_bedrock_converse_spec.rb`)
- [ ] Create an LLM model with provider "AWS Bedrock (Converse API)" in admin UI
- [ ] Verify basic completion works with a Bedrock API key (just region + API key, no IAM keys needed)
- [ ] Verify tool use works in AI bot conversations
- [ ] Verify structured output works with a supported model (Claude Haiku 4.5+)
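
The authentication priority order described in the summary can be sketched in plain Ruby. This is only an illustration of the precedence rules, not the provider's actual code: the method name `bedrock_auth_method`, the `config` hash keys, and the returned symbols are all hypothetical, and no AWS SDK calls are made.

```ruby
# Hypothetical helper illustrating the documented precedence:
# role_arn > access_key_id > API key > SDK default credential chain.
# The method name, config keys, and return symbols are assumptions.
def bedrock_auth_method(config)
  # Treat nil, empty, and whitespace-only values as "not set".
  set = ->(key) { !config[key].to_s.strip.empty? }

  if set.call(:role_arn)
    :sts_assume_role          # SigV4 via STS AssumeRole
  elsif set.call(:access_key_id)
    :static_iam_credentials   # SigV4 with static IAM keys
  elsif set.call(:api_key)
    :bearer_token             # Bedrock API key
  else
    :sdk_default_chain        # env vars, instance profile, ECS task role
  end
end

role_and_key = bedrock_auth_method(role_arn: "arn:aws:iam::123456789012:role/example", api_key: "k")
key_only = bedrock_auth_method(api_key: "k")
nothing_set = bedrock_auth_method({})

puts role_and_key # sts_assume_role
puts key_only     # bearer_token
puts nothing_set  # sdk_default_chain
```

Note that a configured `role_arn` wins even when an API key is also present, which matches the table: the Bearer token path applies only when neither `access_key_id` nor `role_arn` is set.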
100 lines
2.5 KiB
Ruby
Vendored
# frozen_string_literal: true

module DiscourseAi
  module Completions
    module Dialects
      class ConverseTools
        def initialize(tools)
          @raw_tools = tools
        end

        def translated_tools
          return if !@raw_tools.present?

          {
            tools:
              @raw_tools.map do |tool|
                {
                  tool_spec: {
                    name: tool.name,
                    description: tool.description,
                    input_schema: {
                      json: deep_stringify(tool.parameters_json_schema),
                    },
                  },
                }
              end,
          }
        end

        def from_raw_tool_call(raw_message)
          result = []

          provider_info = converse_reasoning(raw_message)
          if provider_info.present?
            if raw_message[:thinking] && provider_info[:signature]
              result << {
                reasoning_content: {
                  reasoning_text: {
                    text: raw_message[:thinking],
                    signature: provider_info[:signature],
                  },
                },
              }
            end

            if provider_info[:redacted_content]
              result << {
                reasoning_content: {
                  redacted_content: provider_info[:redacted_content],
                },
              }
            end
          end

          result << {
            tool_use: {
              tool_use_id: raw_message[:id],
              name: raw_message[:name],
              input: JSON.parse(raw_message[:content])["arguments"],
            },
          }

          result
        end

        def from_raw_tool(raw_message)
          [
            {
              tool_result: {
                tool_use_id: raw_message[:id],
                content: [{ json: JSON.parse(raw_message[:content]) }],
              },
            },
          ]
        end

        private

        def deep_stringify(obj)
          case obj
          when Hash
            obj.transform_keys(&:to_s).transform_values { |v| deep_stringify(v) }
          when Array
            obj.map { |v| deep_stringify(v) }
          when Symbol
            obj.to_s
          else
            obj
          end
        end

        def converse_reasoning(message)
          info = message[:thinking_provider_info]
          return if info.blank?
          info[:bedrock_converse] || info["bedrock_converse"]
        end
      end
    end
  end
end
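
To make the shapes produced by the class above easier to see, here is a standalone sketch that mirrors its tool translation without Rails or the AWS SDK. It is a simplification under stated assumptions: `ExampleTool`, `tool_spec_for`, and `tool_use_blocks` are hypothetical names, the sample tool and message are invented for illustration, and the sketch omits the redacted-content branch and the string-key lookup that the real class handles.

```ruby
require "json"

# Hypothetical stand-in for a tool definition; the real tool class is not shown here.
ExampleTool = Struct.new(:name, :description, :parameters_json_schema)

# Recursively convert symbol keys and symbol values to strings,
# mirroring what the private deep_stringify helper above does.
def deep_stringify(obj)
  case obj
  when Hash then obj.to_h { |k, v| [k.to_s, deep_stringify(v)] }
  when Array then obj.map { |v| deep_stringify(v) }
  when Symbol then obj.to_s
  else obj
  end
end

# Build one tool_spec entry in the shape translated_tools emits.
def tool_spec_for(tool)
  {
    tool_spec: {
      name: tool.name,
      description: tool.description,
      input_schema: { json: deep_stringify(tool.parameters_json_schema) },
    },
  }
end

# Simplified from_raw_tool_call: the signed reasoning block (if any)
# is emitted first, followed by the tool_use block.
def tool_use_blocks(raw_message)
  blocks = []
  info = raw_message.dig(:thinking_provider_info, :bedrock_converse)
  if info && raw_message[:thinking] && info[:signature]
    blocks << {
      reasoning_content: {
        reasoning_text: { text: raw_message[:thinking], signature: info[:signature] },
      },
    }
  end
  blocks << {
    tool_use: {
      tool_use_id: raw_message[:id],
      name: raw_message[:name],
      input: JSON.parse(raw_message[:content])["arguments"],
    },
  }
  blocks
end

tool = ExampleTool.new(
  "get_weather",
  "Look up the weather",
  { type: :object, properties: { city: { type: :string } }, required: [:city] },
)
spec = tool_spec_for(tool)

call = {
  id: "tool_1",
  name: "get_weather",
  content: { arguments: { city: "Sydney" } }.to_json,
  thinking: "Need the forecast.",
  thinking_provider_info: { bedrock_converse: { signature: "sig-abc" } },
}
blocks = tool_use_blocks(call)

puts JSON.pretty_generate(spec)
puts JSON.pretty_generate(blocks)
```

The ordering matters for multi-turn thinking: the reasoning block with its signature precedes the `tool_use` block, which is how the class preserves thinking blocks per tool call message.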