## Table of Contents

- [Features](#features)
- [Supported Models](#supported-models)
- [Supported Training Approaches](#supported-training-approaches)
- [Provided Datasets](#provided-datasets)
- [Requirement](#requirement)
- [Getting Started](#getting-started)
- [Projects using LLaMA Factory](#projects-using-llama-factory)
- [License](#license)
- [Citation](#citation)
- [Acknowledgement](#acknowledgement)
## Features

- **Various models**: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Qwen2-VL, Yi, Gemma, Baichuan, ChatGLM, Phi, etc.
- **Integrated methods**: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO, etc.
- **Scalable resources**: 16-bit full-tuning, freeze-tuning, LoRA and 2/3/4/5/6/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8/HQQ/EETQ.
- **Advanced algorithms**: [GaLore](https://github.com/jiaweizzhao/GaLore), [BAdam](https://github.com/Ledzy/BAdam), [Adam-mini](https://github.com/zyushun/Adam-mini), DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ, PiSSA and Agent tuning.
- **Practical tricks**: [FlashAttention-2](https://github.com/Dao-AILab/flash-attention), [Unsloth](https://github.com/unslothai/unsloth), [Liger Kernel](https://github.com/linkedin/Liger-Kernel), RoPE scaling, NEFTune and rsLoRA.
- **Experiment monitors**: LlamaBoard, TensorBoard, Wandb, MLflow, etc.
- **Faster inference**: OpenAI-style API, Gradio UI and CLI with vLLM worker.
## Supported Models

| Model                                                           | Model size                       | Template         |
| --------------------------------------------------------------- | -------------------------------- | ---------------- |
| [Baichuan 2](https://huggingface.co/baichuan-inc)               | 7B/13B                           | baichuan2        |
| [BLOOM/BLOOMZ](https://huggingface.co/bigscience)               | 560M/1.1B/1.7B/3B/7.1B/176B      | -                |
| [ChatGLM3](https://huggingface.co/THUDM)                        | 6B                               | chatglm3         |
| [Command R](https://huggingface.co/CohereForAI)                 | 35B/104B                         | cohere           |
| [DeepSeek (Code/MoE)](https://huggingface.co/deepseek-ai)       | 7B/16B/67B/236B                  | deepseek         |
| [Falcon](https://huggingface.co/tiiuae)                         | 7B/11B/40B/180B                  | falcon           |
| [Gemma/Gemma 2/CodeGemma](https://huggingface.co/google)        | 2B/7B/9B/27B                     | gemma            |
| [GLM-4](https://huggingface.co/THUDM)                           | 9B                               | glm4             |
| [Index](https://huggingface.co/IndexTeam)                       | 1.9B                             | index            |
| [InternLM2/InternLM2.5](https://huggingface.co/internlm)        | 7B/20B                           | intern2          |
| [Llama](https://github.com/facebookresearch/llama)              | 7B/13B/33B/65B                   | -                |
| [Llama 2](https://huggingface.co/meta-llama)                    | 7B/13B/70B                       | llama2           |
| [Llama 3-3.2](https://huggingface.co/meta-llama)                | 1B/3B/8B/70B                     | llama3           |
| [Llama 3.2 Vision](https://huggingface.co/meta-llama)           | 11B/90B                          | mllama           |
| [LLaVA-1.5](https://huggingface.co/llava-hf)                    | 7B/13B                           | llava            |
| [LLaVA-NeXT](https://huggingface.co/llava-hf)                   | 7B/8B/13B/34B/72B/110B           | llava_next       |
| [LLaVA-NeXT-Video](https://huggingface.co/llava-hf)             | 7B/34B                           | llava_next_video |
| [MiniCPM](https://huggingface.co/openbmb)                       | 1B/2B/4B                         | cpm/cpm3         |
| [Mistral/Mixtral](https://huggingface.co/mistralai)             | 7B/8x7B/8x22B                    | mistral          |
| [OLMo](https://huggingface.co/allenai)                          | 1B/7B                            | -                |
| [PaliGemma](https://huggingface.co/google)                      | 3B                               | paligemma        |
| [Phi-1.5/Phi-2](https://huggingface.co/microsoft)               | 1.3B/2.7B                        | -                |
| [Phi-3](https://huggingface.co/microsoft)                       | 4B/14B                           | phi              |
| [Phi-3-small](https://huggingface.co/microsoft)                 | 7B                               | phi_small        |
| [Pixtral](https://huggingface.co/mistralai)                     | 12B                              | pixtral          |
| [Qwen/QwQ (1-2.5) (Code/Math/MoE)](https://huggingface.co/Qwen) | 0.5B/1.5B/3B/7B/14B/32B/72B/110B | qwen             |
| [Qwen2-VL](https://huggingface.co/Qwen)                         | 2B/7B/72B                        | qwen2_vl         |
| [Skywork o1](https://huggingface.co/Skywork)                    | 8B                               | skywork_o1       |
| [StarCoder 2](https://huggingface.co/bigcode)                   | 3B/7B/15B                        | -                |
| [XVERSE](https://huggingface.co/xverse)                         | 7B/13B/65B                       | xverse           |
| [Yi/Yi-1.5 (Code)](https://huggingface.co/01-ai)                | 1.5B/6B/9B/34B                   | yi               |
| [Yi-VL](https://huggingface.co/01-ai)                           | 6B/34B                           | yi_vl            |
| [Yuan 2](https://huggingface.co/IEITYuan)                       | 2B/51B/102B                      | yuan             |
> [!NOTE]
> For the "base" models, the `template` argument can be chosen from `default`, `alpaca`, `vicuna`, etc. But make sure to use the **corresponding template** for the "instruct/chat" models.
>
> Remember to use the **SAME** template in training and inference.

Please refer to [constants.py](src/llamafactory/extras/constants.py) for a full list of the models we support.

You can also add a custom chat template to [template.py](src/llamafactory/data/template.py).
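For example, the template is selected the same way at training and inference time. Below is a minimal sketch, assuming the standard `llamafactory-cli` entry point from upstream LLaMA Factory is installed by this repo; the model name is only an example.

```bash
# Chat with an instruct model using its matching template (model name is an example).
llamafactory-cli chat \
  --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
  --template llama3
```

Reuse the same `--template` value for any adapters trained on that model; mismatched templates typically degrade responses.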
## Supported Training Approaches

| Approach               | Full-tuning        | Freeze-tuning      | LoRA               | QLoRA              |
| ---------------------- | ------------------ | ------------------ | ------------------ | ------------------ |
| Pre-Training           | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Supervised Fine-Tuning | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Reward Modeling        | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| PPO Training           | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| DPO Training           | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| KTO Training           | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| ORPO Training          | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| SimPO Training         | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
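Any approach in the table can be combined with any of the tuning methods. As a hedged sketch (assuming the standard `llamafactory-cli` entry point is available and a preference dataset is registered in `data/dataset_info.json`; all names below are placeholders), a LoRA-based DPO run could look like:

```bash
# Hypothetical DPO + LoRA run; model, dataset and output paths are placeholders.
llamafactory-cli train \
  --stage dpo \
  --do_train \
  --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
  --dataset my_preference_data \
  --template llama3 \
  --finetuning_type lora \
  --output_dir saves/llama3-8b/lora/dpo
```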
> [!TIP]
> The implementation details of PPO can be found in [this blog](https://newfacade.github.io/notes-on-reinforcement-learning/17-ppo-trl.html).
Some datasets require confirmation before use, so we recommend logging in to your Hugging Face account with the following commands:

```bash
pip install --upgrade huggingface_hub
huggingface-cli login
```
## Requirement

| Mandatory    | Minimum | Recommend |
| ------------ | ------- | --------- |
| python       | 3.8     | 3.11      |
| torch        | 1.13.1  | 2.4.0     |
| transformers | 4.41.2  | 4.43.4    |
| datasets     | 2.16.0  | 2.20.0    |
| accelerate   | 0.30.1  | 0.32.0    |
| peft         | 0.11.1  | 0.12.0    |
| trl          | 0.8.6   | 0.9.6     |

| Optional     | Minimum | Recommend |
| ------------ | ------- | --------- |
| CUDA         | 11.6    | 12.2      |
| deepspeed    | 0.10.0  | 0.14.0    |
| bitsandbytes | 0.39.0  | 0.43.1    |
| vllm         | 0.4.3   | 0.5.0     |
| flash-attn   | 2.3.0   | 2.6.3     |
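To quickly compare your environment against the mandatory versions above, you can print the installed versions (a simple check; only the packages listed in the table are imported):

```bash
# Print the installed versions of the mandatory dependencies for comparison with the table.
python -c "import sys, torch, transformers, datasets, accelerate, peft, trl; \
print('python      ', sys.version.split()[0]); \
print('torch       ', torch.__version__); \
print('transformers', transformers.__version__); \
print('datasets    ', datasets.__version__); \
print('accelerate  ', accelerate.__version__); \
print('peft        ', peft.__version__); \
print('trl         ', trl.__version__)"
```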
### Hardware Requirement

\* _estimated_

| Method            | Bits | 7B    | 13B   | 30B   | 70B    | 110B   | 8x7B  | 8x22B  |
| ----------------- | ---- | ----- | ----- | ----- | ------ | ------ | ----- | ------ |
| Full              | AMP  | 120GB | 240GB | 600GB | 1200GB | 2000GB | 900GB | 2400GB |
| Full              | 16   | 60GB  | 120GB | 300GB | 600GB  | 900GB  | 400GB | 1200GB |
| Freeze            | 16   | 20GB  | 40GB  | 80GB  | 200GB  | 360GB  | 160GB | 400GB  |
| LoRA/GaLore/BAdam | 16   | 16GB  | 32GB  | 64GB  | 160GB  | 240GB  | 120GB | 320GB  |
| QLoRA             | 8    | 10GB  | 20GB  | 40GB  | 80GB   | 140GB  | 60GB  | 160GB  |
| QLoRA             | 4    | 6GB   | 12GB  | 24GB  | 48GB   | 72GB   | 30GB  | 96GB   |
| QLoRA             | 2    | 4GB   | 8GB   | 16GB  | 24GB   | 48GB   | 18GB  | 48GB   |
## Getting Started

### Build Docker

For CUDA users:

```bash
cd docker/docker-cuda/
docker compose up -d
docker compose exec llamafactory bash
```
### Installation

> [!IMPORTANT]
> Installation is mandatory.

```bash
git clone --depth 1 http://172.16.10.175:2230/kyy/llm_trainer.git
cd llm_trainer
pip install -e ".[torch,metrics]"
```

Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, openmind, quality
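The extras can be combined in a single editable install. For example, to also pull in the DeepSpeed and vLLM backends (any names from the list above can be substituted):

```bash
# Combine optional extras from the list above in one editable install.
pip install -e ".[torch,metrics,deepspeed,vllm]"
```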
### Data Preparation

You can either use datasets on the HuggingFace / ModelScope / Modelers hub or load a dataset from local disk.

> [!NOTE]
> Please update `data/dataset_info.json` to use your custom dataset.
### SFT Start

```bash
sh run_train/run_sft.sh
```

### PT Start

```bash
sh run_train/run_pt.sh
```