mirror of
https://gh.wpcy.net/https://github.com/discourse/discourse.git
synced 2026-05-07 00:37:22 +08:00
## Summary

Follow-up to #39634. Adds `OdtToText` and `OdsToText` converters so that OpenDocument text (`.odt`) and spreadsheet (`.ods`) attachments can be embedded as text in LLM prompts, in line with the newly added DOCX/XLSX support.

Both formats are zip archives whose document body lives in a single `content.xml`, so they reuse `Compression::SafeZipReader` and the bounded Nokogiri parsing pattern from #39634, with no new external binaries.

- `OdtToText` walks the body's block-level children (paragraphs, headings, lists, tables, frames, sections) and renders nested lists with depth-aware bullet prefixes. Tables become tab-separated rows.
- `OdsToText` iterates sheets and rows, expanding `table:number-columns-repeated` up to `MAX_COLUMNS` to avoid expansion bombs from sparse trailing cells, and falls back to `office:value` / `office:date-value` / `office:boolean-value` when no inline `<text:p>` is present.
- The `UploadEncoder.attachment_type_for` and `encode_document` dispatches gain `odt` and `ods` cases.
- `ai-llm-attachment-types` `DEFAULT_CHOICES` lists `odt` next to `docx` and `ods` next to `xlsx`.

## Test plan

- [x] `bin/rspec plugins/discourse-ai/spec/lib/completions/odt_to_text_spec.rb` (6 cases)
- [x] `bin/rspec plugins/discourse-ai/spec/lib/completions/ods_to_text_spec.rb` (6 cases)
- [x] `bin/rspec plugins/discourse-ai/spec/lib/completions/upload_encoder_spec.rb` (full encoder suite, including 4 new ODT/ODS integration cases)
- [x] `bin/lint` clean across all touched files
- [ ] Manual smoke: upload a real `.odt` and `.ods` to a topic, assign an LLM with the new attachment types allowed, and verify the extracted text appears in the prompt

| | | |
|---|---|---|
| agents | ||
| ai_bot | ||
| ai_helper | ||
| ai_moderation | ||
| ai_tool_scripts | ||
| automation | ||
| completions | ||
| configuration | ||
| database | ||
| discord/bot | ||
| discover | ||
| embeddings | ||
| inference | ||
| inferred_concepts | ||
| mcp | ||
| sentiment | ||
| summarization | ||
| tasks | ||
| translation | ||
| utils | ||
| ai_bot.rb | ||
| automation.rb | ||
| embeddings.rb | ||
| engine.rb | ||
| guardian_extensions.rb | ||
| multisite_hash.rb | ||
| post_extensions.rb | ||
| summarization.rb | ||
| topic_extensions.rb | ||
| translation.rb | ||
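
The capped expansion of `table:number-columns-repeated` described in the summary can be illustrated with a minimal sketch. This is not the code from the change itself: `expand_row`, the `[text, repeat]` pair representation, and the `MAX_COLUMNS` value of 8 are all assumptions made for illustration, and XML parsing is left out so the example stays self-contained.

```ruby
# Hypothetical cap; the summary names MAX_COLUMNS but does not give its value.
MAX_COLUMNS = 8

# Expands one spreadsheet row into at most max_columns cell strings.
#
# cells is an array of [text, repeat] pairs in document order, where repeat
# stands in for the cell's table:number-columns-repeated attribute (nil or 0
# is treated as 1). This input shape is an assumption for illustration.
def expand_row(cells, max_columns: MAX_COLUMNS)
  row = []
  cells.each do |text, repeat|
    remaining = max_columns - row.length
    break if remaining <= 0 # cap reached; ignore the rest of the row

    # Clamp the repeat count so a huge value cannot allocate a huge array.
    count = [[repeat.to_i, 1].max, remaining].min
    count.times { row << text.to_s }
  end
  row
end
```

For example, `expand_row([["a", 1], ["", 1_000_000]], max_columns: 4)` returns `["a", "", "", ""]`: the million-fold repeat on the trailing empty cell is clamped to the remaining three columns instead of being materialized.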