Open Source Framework · Powerful & Easy
Reinforcement Learning Optimization for Large-scale Learning
An open-source reinforcement learning library by Alibaba, optimized for large-scale language models. It supports distributed training, multi-task learning, and agent interaction, making AI model training simpler and more efficient.

ROLL
Framework Overview
ROLL (Reinforcement Learning Optimization for Large-scale Learning) is an open-source reinforcement learning framework from Alibaba, designed for large-scale language models. Built on a Ray-based distributed architecture, it supports mainstream algorithms such as PPO and GRPO and provides a complete path from research to production.
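To make the mention of PPO concrete, here is a minimal, framework-agnostic sketch of PPO's clipped surrogate objective, the loss at the heart of that algorithm. This is a generic illustration in plain Python, not ROLL's actual implementation; the function name and signature are hypothetical.

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantage, clip_eps=0.2):
    """Clipped surrogate loss for one action (illustrative sketch).

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probs.
    PPO maximizes min(ratio * A, clip(ratio, 1-eps, 1+eps) * A),
    so the loss to minimize is the negative of that quantity.
    """
    ratio = math.exp(logp_new - logp_old)
    unclipped = ratio * advantage
    # Clipping the ratio keeps the updated policy close to the old one.
    clipped = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps) * advantage
    return -min(unclipped, clipped)
```

For example, when the new and old policies agree (`logp_new == logp_old`), the ratio is 1 and the loss is simply `-advantage`; large policy shifts are capped by the clip range.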
1.9k
Github Stars
30+
Contributors
200+
Commits
Why Choose ROLL

Core Advantages
The ROLL framework provides comprehensive reinforcement learning support. From model training to agent deployment, every stage is carefully optimized to make AI training more efficient.
Born for Scale
Built on a Ray-based distributed architecture, it supports large-scale cluster training at the thousand-GPU level. Its innovative Rollout scheduler and AutoDeviceMapping module dramatically improve GPU resource utilization.
Extreme Training Efficiency
Integrates cutting-edge technologies such as Megatron-Core, SGLang, and vLLM to significantly accelerate model training and inference sampling.
Rich Algorithms & Scenarios
Comes with built-in mainstream RL algorithms such as PPO and GRPO, and supports multi-task RL and agent interaction scenarios. Its effectiveness has been validated in numerous real-world business applications.
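GRPO's distinguishing idea can be sketched in a few lines: instead of training a separate value (critic) network, it scores each sampled response relative to the other responses in its own group. The snippet below is a generic illustration of that group-relative normalization in plain Python; the function name is hypothetical and this is not ROLL's actual code.

```python
from statistics import mean, stdev

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages in the spirit of GRPO (sketch).

    Each response's reward is normalized against the mean and standard
    deviation of its sampling group, which removes the need for a
    learned value baseline.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Responses scoring above their group's mean receive positive advantages and are reinforced; below-average responses receive negative advantages, and the advantages within a group always sum to zero.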
Open Source and Accessible
ROLL is open-sourced on GitHub (https://github.com/alibaba/ROLL) under the Apache License 2.0, backed by an active community and comprehensive documentation.
Open Source Community
Join our vibrant open source community, explore cutting-edge reinforcement learning techniques with AI researchers worldwide, and help shape the future of LLMs and reinforcement learning.
How to Contribute
Contribute algorithm implementations and performance optimizations
Share experimental results and best practices
Improve tutorials and learning resources
Join Discussion