multireward-grpo

0.1.0

Decoupled & conditioned multi-reward GRPO advantage estimators, a generalized trainer, and the Theorem-3 verification harness from the paper 'When and Why Decoupling and Conditioning Beat Reweighting in Multi-Reward GRPO'.

License

MIT

Permissive

A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.

Published

June 23, 2026

Package Registry

PyPI Source

License Sources Match

MIT confirmed by 2 independent sources — Python registry metadata and the LICENSE file in the package source — as of June 23, 2026.

Source	License	Class
Licensie (detected)	MIT	Permissive
PyPI (reported)	MIT	Permissive

Versions

1 version

Version	License	Published	Status
0.1.0 (Latest)	MIT	Jun 23, 2026	Scanned