
devstral-small-2

Mistral's Devstral-Small-2 (24B dense, Apache-2.0). Co-designed with All-Hands for OpenHands agentic loops; the Aider community's top local pick for multi-file refactors in 2026.

mistralai/Devstral-Small-2-24B-Instruct-2512, BF16, 52 GB, 24B active, 24B total, parser · mistral

Transcripts

simple

Reply with exactly the five words: hello from hermes on spark.
Failed to initialize agent: Model devstral-small-2 has a context window of 32,768 tokens, which is below the minimum 64,000 required by Hermes Agent. Choose a model with at least 64K context, or set model.context_length in config.yaml to override.
category · trivial elapsed · 2.37s exit · 1

math

What is 127 times 49? Answer with just the number.
Failed to initialize agent: Model devstral-small-2 has a context window of 32,768 tokens, which is below the minimum 64,000 required by Hermes Agent. Choose a model with at least 64K context, or set model.context_length in config.yaml to override.
category · reasoning elapsed · 2.68s exit · 1

reasoning

A farmer has 17 sheep. All but 9 die. How many remain? Answer with one short sentence.
Failed to initialize agent: Model devstral-small-2 has a context window of 32,768 tokens, which is below the minimum 64,000 required by Hermes Agent. Choose a model with at least 64K context, or set model.context_length in config.yaml to override.
category · reasoning elapsed · 2.97s exit · 1

tool_ls

Use the shell tool to list files in /tmp. Tell me only how many there are.
Failed to initialize agent: Model devstral-small-2 has a context window of 32,768 tokens, which is below the minimum 64,000 required by Hermes Agent. Choose a model with at least 64K context, or set model.context_length in config.yaml to override.
category · tool-use elapsed · 2.4s exit · 1

code

Write a one-line Python expression that returns the sum of squares from 1 to 10.
Failed to initialize agent: Model devstral-small-2 has a context window of 32,768 tokens, which is below the minimum 64,000 required by Hermes Agent. Choose a model with at least 64K context, or set model.context_length in config.yaml to override.
category · coding elapsed · 2.96s exit · 1
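
All five transcripts stop at the same initialization check, so none of the prompts reached the model. The sketch below shows how such a check could behave, using only the numbers quoted in the error message (32,768 detected, 64,000 required). The function name `check_context_window`, the `override_tokens` parameter, and the override semantics are hypothetical illustrations, not Hermes Agent's actual code; `override_tokens` stands in for the `model.context_length` setting in config.yaml that the message mentions.

```python
from typing import Optional

# Minimum quoted in the error message; treated here as a fixed constant (assumption).
MIN_CONTEXT_TOKENS = 64_000


def check_context_window(
    model_name: str,
    detected_tokens: int,
    override_tokens: Optional[int] = None,
) -> Optional[str]:
    """Return an error string if the effective context window is too small, else None.

    override_tokens is a hypothetical stand-in for model.context_length
    in config.yaml; when set, it replaces the detected value.
    """
    effective = override_tokens if override_tokens is not None else detected_tokens
    if effective < MIN_CONTEXT_TOKENS:
        return (
            f"Model {model_name} has a context window of {effective:,} tokens, "
            f"which is below the minimum {MIN_CONTEXT_TOKENS:,} required."
        )
    return None


# Without an override, the detected 32,768-token window fails the check.
error = check_context_window("devstral-small-2", 32_768)

# With a hypothetical override of 65,536 tokens, the check passes.
ok = check_context_window("devstral-small-2", 32_768, override_tokens=65_536)
```

Under these assumptions an override only changes what the check sees. Whether the served model can actually handle a longer window is not something these transcripts establish.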