multireward-grpo
0.1.0
Decoupled & conditioned multi-reward GRPO advantage estimators, a generalized trainer, and the Theorem-3 verification harness from the paper 'When and Why Decoupling and Conditioning Beat Reweighting in Multi-Reward GRPO'.
Published
June 23, 2026
Package Registry
License Sources Match
MIT confirmed by 2 independent sources — Python registry metadata and the LICENSE file in the package source — as of June 23, 2026.
| Source | License | Class |
|---|---|---|
| Licensie (detected) | MIT | Permissive |
| PyPI (reported) | MIT | Permissive |
License File
Versions
1 version

| Version | License | Published | Status |
|---|---|---|---|
| 0.1.0 (latest) | MIT | Jun 23, 2026 | Scanned |