OpenProduct

convex-evals

Tracked since 2026-07-01
AI Summary

Convex-evals is a framework for evaluating large language models (LLMs) by generating and scoring adversarial test cases. Aimed at AI researchers and safety engineers, it probes model weaknesses through automated multi-turn conversations, with the goal of surfacing subtle failure modes that standard benchmarks miss.

Cross-platform signals

GitHub: 123 stars, 10 forks
Updated 2026-07-05