code-conductor-bench
A benchmark based on SWE-bench that evaluates the conceptual reasoning capabilities of LLMs on software engineering tasks.
- Latest release: Aug 31, 2025
- Releases: 5
- Known CVEs: 0
- First release: Aug 31, 2025
- License: not listed
Insights
Activity
- Total releases: 5
- Last 12 months: 5
- Cadence: ~daily
- Dependencies: 10
Releases per month (last 12 months)
Release mix (5 releases)
- patch: 1
- pre: 3
Dependencies
Depends on (0.1.6)
- datasets >=4.0.0
- jinja2 >=3.1.6
- loguru >=0.7.3
- mini-swe-agent
- openai >=1.102.0
- openai-agents >=0.2.10
- swebench
- tenacity >=9.1.2
- tqdm >=4.67.1
- wandb >=0.21.3
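As a rough illustration of how the minimum-version constraints above could be checked against a local environment, here is a minimal sketch that uses only the standard library. The helper names (`parse_requirement`, `at_least`, `check`) are hypothetical and not part of code-conductor-bench. The version comparison is deliberately simplified: it looks only at leading numeric components, so pre-release suffixes are ignored.

```python
from importlib import metadata

# Constraints copied from the "Depends on" list for 0.1.6.
REQUIREMENTS = [
    "datasets >=4.0.0",
    "jinja2 >=3.1.6",
    "loguru >=0.7.3",
    "mini-swe-agent",
    "openai >=1.102.0",
    "openai-agents >=0.2.10",
    "swebench",
    "tenacity >=9.1.2",
    "tqdm >=4.67.1",
    "wandb >=0.21.3",
]


def parse_requirement(line):
    """Split 'name >=X.Y.Z' into (name, minimum); minimum is None if unpinned."""
    name, _, minimum = line.partition(">=")
    return name.strip(), (minimum.strip() or None)


def numeric_parts(version):
    """Return the leading numeric components, e.g. '4.67.1' -> [4, 67, 1]."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # simplification: stop at suffixes such as 'dev1' or '0rc1'
        parts.append(int(piece))
    return parts


def at_least(installed, minimum):
    """Compare versions after zero-padding, so '4.0' counts as equal to '4.0.0'."""
    a, b = numeric_parts(installed), numeric_parts(minimum)
    width = max(len(a), len(b))
    a += [0] * (width - len(a))
    b += [0] * (width - len(b))
    return a >= b


def check(requirements=REQUIREMENTS):
    """Map each dependency name to 'ok', 'too old', or 'missing'."""
    report = {}
    for line in requirements:
        name, minimum = parse_requirement(line)
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum is None or at_least(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report


if __name__ == "__main__":
    for name, status in check().items():
        print(f"{name}: {status}")
```

For anything beyond a quick sanity check, a full PEP 440 comparison (for example via the third-party `packaging` library) would likely be the safer choice, since it handles pre-release and development versions properly.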
Used by
Nothing tracked depends on this yet.
Releases
| Version | Type |
|---|---|
| 0.1.6 | patch |
| 0.1.5.dev2 | pre |
| 0.1.5.dev1 | pre |
| 0.1.1.dev1 | pre |
| 0.1.0 | initial |