convex-evals
AI Summary
Convex-evals is a framework for evaluating large language models (LLMs) by generating and scoring adversarial test cases. Designed for AI researchers and safety engineers, it systematically probes model weaknesses through automated, multi-turn conversations, which lets it uncover subtle failure modes that standard benchmarks miss.
Cross-platform signals
GitHub: 123 stars, 10 forks
Updated 2026-07-05