discourse/plugins/discourse-ai/lib/completions/dialects/converse_tools.rb
Rafael dos Santos Silva 18a0a8daeb
FEATURE: Add AWS Bedrock Converse API provider (#38903)
## Summary

Adds a new `aws_bedrock_converse` inference provider that uses the
official AWS SDK (`aws-sdk-bedrockruntime`) and the Converse API. It
runs alongside the existing `aws_bedrock` provider; the change is purely
additive and does not modify existing configurations.

### Why a new provider?

The existing `aws_bedrock` provider manually handles SigV4 signing, URL
construction, and binary event stream decoding, and it maintains a
hardcoded model ID mapping table. It only supports Claude and Nova models.

The new provider delegates all of this to the official AWS SDK, which
means:

- **Model-agnostic** — works with any model available on Bedrock
(Claude, Nova, Kimi, MiniMax, Mistral, Llama, DeepSeek, NVIDIA, Qwen,
GLM, etc.) without any model-specific code
- **Application Inference Profiles** — users can set cross-region
profiles (`us.anthropic.claude-sonnet-4-20250514-v1:0`) or application
inference profile ARNs directly as the model name
- **Bedrock API Key auth** — supports the new AWS Bedrock API keys
(Bearer token auth) in addition to IAM access keys, STS role assumption,
and automatic credential resolution from environment/instance profiles
- **No maintenance burden** — no model ID mapping table to update when
AWS adds new models, no manual SigV4 signing, no binary event stream
decoding
- **Native tools only** — no XML tool fallback; uses the Converse API's
built-in tool support

### Authentication options (priority order)

| Config | Auth method |
|---|---|
| `role_arn` set | STS AssumeRole (SigV4) |
| `access_key_id` set | Static IAM credentials (SigV4) |
| API key set (no `access_key_id`/`role_arn`) | Bearer token (Bedrock API key) |
| Nothing set | SDK auto-resolves (env vars, instance profile, ECS task role) |
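
The priority order in the table above can be read as a simple cascade. The following is a minimal sketch, assuming the provider settings arrive as a plain hash; the method name `resolve_auth_method`, the `:api_key` key, and the returned symbols are hypothetical and are not taken from the endpoint code.

```ruby
# Hypothetical helper illustrating the priority order from the table.
# It only names the auth method; it does not build AWS SDK credential objects.
def resolve_auth_method(config)
  # Treat nil and whitespace-only values as "not set".
  present = ->(value) { !value.nil? && !value.to_s.strip.empty? }

  if present.call(config[:role_arn])
    :sts_assume_role          # STS AssumeRole (SigV4)
  elsif present.call(config[:access_key_id])
    :static_iam_credentials   # static IAM credentials (SigV4)
  elsif present.call(config[:api_key])
    :bearer_token             # Bedrock API key
  else
    :sdk_default_chain        # env vars, instance profile, ECS task role
  end
end

resolve_auth_method(role_arn: "arn:aws:iam::123456789012:role/example", api_key: "key")
# => :sts_assume_role
resolve_auth_method(api_key: "key") # => :bearer_token
resolve_auth_method({})             # => :sdk_default_chain
```

Note that an API key is only used when neither `role_arn` nor `access_key_id` is set, which matches the third row of the table.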

### Features supported

- Streaming and non-streaming completions
- Native tool use with tool_choice (auto/any/specific tool)
- Structured output via Converse API's `output_config` (models that
support it)
- Extended thinking / adaptive thinking with signature preservation for
multi-turn
- Interleaved thinking with tool calls (thinking blocks preserved per
tool_call message)
- Prompt caching via `cache_point` blocks
- Effort parameter (low/medium/high/max)
- `extra_model_fields` provider param for arbitrary
`additionalModelRequestFields` (beta features like `anthropic_beta`, 1M
context, interleaved thinking)
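
As an illustration of the `tool_choice` options listed above, the Converse API expresses the choice as a union with `auto`, `any`, or `tool` members. The sketch below assumes the caller passes `nil`, `:auto`, `:any`, or a tool name; the method name `converse_tool_choice` and that calling convention are hypothetical, not the dialect's actual interface.

```ruby
# Hypothetical mapping from a generic tool_choice value to the Converse
# `tool_choice` union: auto, any, or one specific tool by name.
def converse_tool_choice(choice)
  case choice
  when nil, :auto
    { auto: {} }                      # model decides whether to call a tool
  when :any
    { any: {} }                       # model must call some tool
  else
    { tool: { name: choice.to_s } }   # model must call this specific tool
  end
end

converse_tool_choice(:any)          # => {any: {}}
converse_tool_choice("get_weather") # => {tool: {name: "get_weather"}}
```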

### New files

- `lib/completions/endpoints/aws_bedrock_converse.rb` — endpoint using
`Aws::BedrockRuntime::Client`
- `lib/completions/dialects/converse.rb` — unified Converse API dialect
- `lib/completions/dialects/converse_tools.rb` — tool formatting
- `lib/completions/converse_message_processor.rb` — response processing
for SDK typed objects

## Tested against real Bedrock API

All tests performed using Bedrock API Key auth (Bearer token) against
live endpoints with 9 different models from 8 providers:

| Test | Claude Sonnet 4 | Claude Haiku 4.5 | Kimi K2.5 | MiniMax M2 | DeepSeek 3.2 | NVIDIA Nemotron 3 120B | Qwen3 Next 80B | GLM 5 | Mistral Small |
|---|---|---|---|---|---|---|---|---|---|
| Non-streaming text | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Streaming text | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Multi-turn conversation | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Tool use (non-streaming) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Tool use (streaming) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ model unsupported |
| Structured output (non-streaming) | — | ✅ | ❌ model unsupported | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ model unsupported |
| Structured output (streaming) | — | ✅ | ❌ model unsupported | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ model unsupported |
| Bearer token auth | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Cross-region inference profile | ✅ | ✅ | — | — | — | — | — | — | — |
| Audit logging + token tracking | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

> **Notes:**
> - Claude Sonnet 4 structured output not tested — requires 4.5+ for
this feature and those cross-region profiles were not available in the
test region.
> - Kimi K2.5 and Mistral Small do not support Bedrock's native
structured output.
> - Mistral Small does not support streaming tool use.
> - All ❌ (model unsupported) results are model-level limitations, not
code issues; the Converse API correctly surfaces the error.

## Test plan

- [ ] Existing `aws_bedrock` provider tests pass (`bin/rspec
spec/lib/completions/endpoints/aws_bedrock_spec.rb`)
- [ ] New provider tests pass (`bin/rspec
spec/lib/completions/endpoints/aws_bedrock_converse_spec.rb`)
- [ ] Create an LLM model with provider "AWS Bedrock (Converse API)" in
admin UI
- [ ] Verify basic completion works with a Bedrock API key (just region
+ API key, no IAM keys needed)
- [ ] Verify tool use works in AI bot conversations
- [ ] Verify structured output works with a supported model (Claude
Haiku 4.5+)
2026-03-30 12:37:30 -03:00


# frozen_string_literal: true

module DiscourseAi
  module Completions
    module Dialects
      # Translates tool definitions and tool messages into the hash shapes
      # used for Converse API requests.
      class ConverseTools
        def initialize(tools)
          @raw_tools = tools
        end

        # Wraps each tool as a `tool_spec` entry; returns nil when no tools are set.
        def translated_tools
          return if !@raw_tools.present?

          {
            tools:
              @raw_tools.map do |tool|
                {
                  tool_spec: {
                    name: tool.name,
                    description: tool.description,
                    input_schema: {
                      json: deep_stringify(tool.parameters_json_schema),
                    },
                  },
                }
              end,
          }
        end

        # Builds the content blocks for a tool call: any preserved reasoning
        # blocks come first, followed by the `tool_use` block.
        def from_raw_tool_call(raw_message)
          result = []

          provider_info = converse_reasoning(raw_message)
          if provider_info.present?
            # Signed thinking text is replayed only when a signature is available.
            if raw_message[:thinking] && provider_info[:signature]
              result << {
                reasoning_content: {
                  reasoning_text: {
                    text: raw_message[:thinking],
                    signature: provider_info[:signature],
                  },
                },
              }
            end

            if provider_info[:redacted_content]
              result << {
                reasoning_content: {
                  redacted_content: provider_info[:redacted_content],
                },
              }
            end
          end

          result << {
            tool_use: {
              tool_use_id: raw_message[:id],
              name: raw_message[:name],
              input: JSON.parse(raw_message[:content])["arguments"],
            },
          }

          result
        end

        # Builds the `tool_result` block from a tool response whose content is a JSON string.
        def from_raw_tool(raw_message)
          [
            {
              tool_result: {
                tool_use_id: raw_message[:id],
                content: [{ json: JSON.parse(raw_message[:content]) }],
              },
            },
          ]
        end

        private

        # Recursively converts hash keys and symbol values to strings.
        def deep_stringify(obj)
          case obj
          when Hash
            obj.transform_keys(&:to_s).transform_values { |v| deep_stringify(v) }
          when Array
            obj.map { |v| deep_stringify(v) }
          when Symbol
            obj.to_s
          else
            obj
          end
        end

        # Looks up the Converse-specific reasoning info under either a symbol or a string key.
        def converse_reasoning(message)
          info = message[:thinking_provider_info]
          return if info.blank?
          info[:bedrock_converse] || info["bedrock_converse"]
        end
      end
    end
  end
end
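
To make the shapes concrete, here is a small standalone sketch of what the class above produces for one tool and one tool call. `FakeTool` is a hypothetical stand-in for the real tool definition object (only the three methods the class calls are assumed), the sample tool and arguments are invented, and the mapping is re-implemented inline so the sketch runs without Rails or ActiveSupport.

```ruby
require "json"

# Hypothetical stand-in exposing the three methods ConverseTools reads.
FakeTool = Struct.new(:name, :description, :parameters_json_schema)

# Same recursion as the private helper: string keys, symbols to strings.
def deep_stringify(obj)
  case obj
  when Hash then obj.transform_keys(&:to_s).transform_values { |v| deep_stringify(v) }
  when Array then obj.map { |v| deep_stringify(v) }
  when Symbol then obj.to_s
  else obj
  end
end

tool =
  FakeTool.new(
    "get_weather",
    "Look up the weather for a city",
    { type: :object, properties: { city: { type: "string" } }, required: [:city] },
  )

# Shape mirrored from `translated_tools`.
tool_config = {
  tools: [
    {
      tool_spec: {
        name: tool.name,
        description: tool.description,
        input_schema: { json: deep_stringify(tool.parameters_json_schema) },
      },
    },
  ],
}

# Shape mirrored from `from_raw_tool_call` when no reasoning info is present.
raw_tool_call = {
  id: "tool_1",
  name: "get_weather",
  content: { arguments: { city: "Lisbon" } }.to_json,
}
tool_use = {
  tool_use: {
    tool_use_id: raw_tool_call[:id],
    name: raw_tool_call[:name],
    input: JSON.parse(raw_tool_call[:content])["arguments"],
  },
}

puts JSON.pretty_generate(tool_config)
puts JSON.pretty_generate(tool_use)
```

The schema under `json` ends up with string keys and string values throughout, and the tool call's `input` is the parsed `arguments` hash rather than the raw JSON string.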