When collecting text for vectorizing a topic, we iterate over as many posts as possible within the context window, parsing their cooked attribute using Nokogiri.

We noticed this method doesn't scale well when working with larger contexts. Instead, we'll collect as much unparsed cooked text as we can, then parse it all in a single Nokogiri call.

I ran this a hundred times in a benchmark, and the perf gains are significant:

```
                               user     system      total        real
prepare_target_text:     114.887620   3.731693 118.619313 (118.952465)
prepare_target_text_bis:  10.264950   0.186204  10.451154 ( 10.465957)
```

Tried running it 1k times, but the old method took too long.
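The change described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual `prepare_target_text` code: `PostStub`, `batched_text`, the `max_chars` budget (a stand-in for the real token-based context window), and the injectable `to_text` step are all hypothetical names introduced here. The default parsing step assumes the nokogiri gem is installed.

```ruby
# Hypothetical post record; the real plugin works with Discourse Post models.
PostStub = Struct.new(:cooked)

# Default HTML-to-text step. Loads the nokogiri gem only when it is called.
NOKOGIRI_TEXT = lambda do |html|
  require "nokogiri"
  Nokogiri::HTML5.fragment(html).text
end

# Collect unparsed cooked HTML until the budget is reached, then convert the
# whole buffer to text in one call instead of parsing each post separately.
# max_chars is an assumed simplification of the token-based limit.
def batched_text(posts, max_chars:, to_text: NOKOGIRI_TEXT)
  buffer = +""
  posts.each do |post|
    break if buffer.length + post.cooked.length > max_chars
    buffer << post.cooked << "\n"
  end
  to_text.call(buffer)
end
```

Because the budget is measured on the unparsed HTML, this sketch may collect slightly less text than a per-post approach that measures parsed text, in exchange for a single parser invocation.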
| Name |
|---|
| admin/assets/javascripts/discourse |
| app |
| assets |
| config |
| db |
| discourse_automation |
| evals |
| lib |
| public/ai-share |
| spec |
| svg-icons |
| test/javascripts |
| .prettierignore |
| about.json |
| plugin.rb |
| README.md |
Discourse AI Plugin
Plugin Summary
For more information, please see: https://meta.discourse.org/t/discourse-ai/259214?u=falco
Evals
The evals directory contains AI evals for the Discourse AI plugin.
You may create a local config by copying config/eval-llms.yml to config/eval-llms.local.yml and modifying the values.
To run them, use:
cd evals
./run --help
Usage: evals/run [options]
-e, --eval NAME Name of the evaluation to run
--list-models List models
-m, --model NAME Model to evaluate (will eval all models if not specified)
-l, --list List evals
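For readers curious how a usage screen like the one above maps to code, here is a minimal sketch using Ruby's standard `OptionParser`. It is an illustration only and is not taken from the plugin's `evals/run` script; the `parse_eval_args` helper and the shape of the returned hash are assumptions.

```ruby
require "optparse"

# Hypothetical helper: parses the flags shown in the usage text into a hash.
def parse_eval_args(argv)
  options = { eval: nil, model: nil, list: false, list_models: false }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: evals/run [options]"
    opts.on("-e", "--eval NAME", "Name of the evaluation to run") { |v| options[:eval] = v }
    opts.on("--list-models", "List models") { options[:list_models] = true }
    opts.on("-m", "--model NAME", "Model to evaluate (will eval all models if not specified)") { |v| options[:model] = v }
    opts.on("-l", "--list", "List evals") { options[:list] = true }
  end

  # Parse a copy so the caller's array is left untouched.
  parser.parse!(argv.dup)
  options
end
```

Leaving `:model` as `nil` mirrors the documented behavior that all models are evaluated when none is specified; how the real script represents that case is not shown in the README.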
To run evals you will need to configure API keys in your environment:
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
GEMINI_API_KEY=your_gemini_api_key
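Before starting a run, it can help to check which of these keys are actually set. The snippet below is a small sketch of such a check; `EVAL_API_KEYS` and `missing_eval_keys` are hypothetical names, and whether all three keys are needed presumably depends on which models you evaluate.

```ruby
# Key names are taken from the README above. Treating all three as relevant
# is an assumption; a given run may only need the key for its own provider.
EVAL_API_KEYS = %w[OPENAI_API_KEY ANTHROPIC_API_KEY GEMINI_API_KEY].freeze

# Returns the names of keys that are unset or blank in the given environment.
def missing_eval_keys(env = ENV)
  EVAL_API_KEYS.select { |name| env[name].to_s.strip.empty? }
end

missing = missing_eval_keys
warn "Missing API keys: #{missing.join(", ")}" unless missing.empty?
```

Passing a plain hash instead of `ENV` makes the check easy to exercise in isolation.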