Across multiple coding agents and LLMs, we find that context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%. Behaviorally, both LLM-generated and developer-provided context files encourage broader exploration (e.g., more thorough testing and file traversal), and coding agents tend to respect their instructions. Ultimately, we conclude that unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements.
Surprisingly, we observe that developer-provided files only marginally improve performance compared to omitting them entirely (an increase of 4% on average), while LLM-generated context files have a small negative effect on agent performance (a decrease of 3% on average). These observations are robust across different LLMs and prompts used to generate the context files. In a more detailed analysis, we observe that context files lead to increased exploration, testing, and reasoning by coding agents, and, as a result, increase costs by over 20%.
You had one frickin job and you let the LLM do that one too.
This makes me wonder whether we ever did these studies on humans.
"New research suggests that managers providing training to new employees adds cost by 20% and reduces task performance vs letting employees figure it out on their own"
The best strategy has always been management by mushroom. With an AI this works even better than with humans because if the AI whines it obviously learned too much and you just delete all context. This is why I don't even let it keep context anymore.
they read our discussions then teleport back in time a week to build a paper on what we said, breh 😂