Converting MCoreAdapter Models to Hugging Face Format
MCoreAdapter provides tools for converting between Megatron(McoreAdapter) and Hugging Face model formats. This document will guide you on how to convert a trained Megatron model to Hugging Face format for use in other projects.
Conversion Tool
MCoreAdapter includes a conversion tool tools/convert.py
that can convert Megatron(McoreAdapter) models to Hugging Face format.
Conversion Command
To convert a model trained with megatron_strategy to a Hugging Face model, use the following command:
python tools/convert.py --checkpoint_path path_to_megatron_model --output_path path_to_output_hf_model
Where:
path_to_megatron_model
is the path to the McoreAdapter model you want to convertpath_to_output_hf_model
is the output path for the converted Hugging Face model
Notes
- The conversion process may take some time depending on the size of the model.
- Ensure you have sufficient disk space to store the converted model.
- The converted Hugging Face model can be used directly in projects that support the Hugging Face Transformers library.
Direct Use of Hugging Face Models
It's worth noting that MCoreAdapter can directly load Hugging Face models without explicitly converting models to Megatron(McoreAdapter) format. This is particularly useful when you want to use Hugging Face models for reinforcement learning in the Roll framework.