discourse/plugins/discourse-ai/evals/lib at dev/purge-widgets - Discourse/discourse

Discourse/discourse

mirror of https://gh.wpcy.net/https://github.com/discourse/discourse.git synced 2026-05-23 20:04:04 +08:00


Roman Rizzi 3a647c8e50 FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00

Implemented an eval “comparison matrix” that lets you run the same evals across multiple personas or multiple LLMs and have a judge model declare a winner with per-candidate scores. The CLI adds --compare personas|llms, keeps persona selection (auto-prepending default for persona mode), and always ensures a judge is configured. A dedicated ComparisonRunner reuses Workbench results to build candidate outputs and sends them to Judge#compare, which crafts a rubric-aware comparison prompt and parses structured winner/ratings JSON. Outputs are streamed to the console and individual run logs still get written. README documents how to use the new flag and what each mode does.
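The judge step described in the commit message can be illustrated with a minimal Ruby sketch. It is not the code in comparison_runner.rb or judge.rb: the method names `build_comparison_prompt` and `parse_comparison`, the `ComparisonResult` struct, and the JSON shape (`winner` plus a `ratings` array of `candidate`/`score` pairs) are all assumptions made for illustration.

```ruby
require "json"

# Hypothetical sketch: none of these names come from the Discourse codebase.
# Assumed judge reply shape:
#   {"winner": "<label>", "ratings": [{"candidate": "<label>", "score": <number>}]}
ComparisonResult = Struct.new(:winner, :ratings)

# Build a rubric-aware prompt listing every candidate output under its label.
def build_comparison_prompt(rubric, candidates)
  sections = candidates.map { |label, output| "### #{label}\n#{output}" }
  <<~PROMPT
    Rubric: #{rubric}

    Compare the candidate outputs below. Reply with JSON only:
    {"winner": "<label>", "ratings": [{"candidate": "<label>", "score": <1-10>}]}

    #{sections.join("\n\n")}
  PROMPT
end

# Parse the judge's reply, keeping only ratings for known candidate labels.
def parse_comparison(raw, labels)
  data = JSON.parse(raw)
  return ComparisonResult.new(nil, {}) unless data.is_a?(Hash)

  ratings = Array(data["ratings"]).each_with_object({}) do |row, acc|
    next unless row.is_a?(Hash) && labels.include?(row["candidate"])
    acc[row["candidate"]] = row["score"].to_f
  end

  winner = data["winner"]
  # If the judge names an unknown label, fall back to the highest-rated candidate.
  winner = ratings.max_by { |_, score| score }&.first unless labels.include?(winner)

  ComparisonResult.new(winner, ratings)
rescue JSON::ParserError
  # Unparseable reply: report no winner rather than guessing.
  ComparisonResult.new(nil, {})
end

candidates = { "default" => "Short summary.", "custom" => "Detailed summary." }
prompt = build_comparison_prompt("Prefer accurate, concise summaries.", candidates)
reply = '{"winner": "custom", "ratings": [{"candidate": "default", "score": 6}, {"candidate": "custom", "score": 8}]}'
result = parse_comparison(reply, candidates.keys)
puts "winner=#{result.winner}"
```

In this sketch the canned `reply` string stands in for a real judge model call, so the parsing and fallback logic can be exercised without any LLM configured.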
prompts	FEATURE: Cover all LLM features with evals (#35693)	2025-11-13 12:24:56 -03:00
runners	FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00
boot.rb
cli.rb	FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00
comparison_runner.rb	FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00
eval.rb	REFACTOR: centralize eval orchestration around feature-driven playground (#35718)	2025-10-30 13:08:38 -03:00
features.rb	FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00
judge.rb	FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00
llm_repository.rb	FEATURE: Cover all LLM features with evals (#35693)	2025-11-13 12:24:56 -03:00
persona_prompt_loader.rb	FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00
recorder.rb	FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00
structured_logger.rb	REFACTOR: centralize eval orchestration around feature-driven playground (#35718)	2025-10-30 13:08:38 -03:00
workbench.rb	FEATURE: Use evals to compare LLMs and Personas' prompts (#36027)	2025-11-18 10:39:52 -03:00

