OpenCompass v0.5.1 Release Notes

🌟 Highlights

✨ A New Method to quickly Integrate and Evaluate Your Datasets: Added a fast dataset integration and evaluation method based on ChatML format, simplifying the previously complex dataset integration process.
✨ New Datasets: Integrated new benchmarks including SeedBench and BeyondAIME.
✨ Infrastructure & Enhancements: Fixed several bugs and updated CI.

🚀 New Features

🔧 Introduced a new approach for dataset integration and evaluation based on ChatML Template with evaluation examples (#2277).
🔧 Added SeedBench dataset (#2020).
🔧 Added BeyondAIME dataset (#2192).

🐛 Bug Fixes

🔧 Fixed Module Registers(#2262, #2266)
🔧 Fixed duplicate engine config update in TurboMindModelwithChatTemplate (#2276)
🔧 Fixed torchrun to avoid unexpected PATH environment (#2269)
🔧 Fixed the None return value case in cascade evaluator (#2211)

⚙ Enhancements and Refactors

⚙ Infrastructure Refactors:

Update rjob.py and subjective_eval.py (#2263)

⚙ CI/CD Improvements:

Updated testcase (#2257)
Updated pr_test (#2281)
Fixed pr_test installation (#2290)

🎉 Welcome New Contributors

A warm welcome and special thanks to our newest contributors who made this release possible:

@ChenZiHong-Gavin for their first contribution in (#2020)
@jibuzixin for their first contribution in (#2211)
@JYLiAo32 for their first contribution in (#2276)
@ShoupingShan for their first contribution in (#2262)

Full Changelog: 0.5.0...0.5.1.post1

Thank you for using OpenCompass! These updates empower deeper insights and more reliable evaluations. Keep exploring, and stay tuned for future innovations! 🌟

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

0.5.1.post1

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

OpenCompass v0.5.1 Release Notes

🌟 Highlights

🚀 New Features

🐛 Bug Fixes

⚙ Enhancements and Refactors

🎉 Welcome New Contributors

Contributors

Uh oh!