## Table of Contents - [Supported Models](#supported-models) - [Supported Training Approaches](#supported-training-approaches) - [Requirement](#requirement) - [Getting Started](#getting-started) ## Supported Models | Model | Model size | Template | | --------------------------------------------------------------- | -------------------------------- | ---------------- | | [Baichuan 2](https://huggingface.co/baichuan-inc) | 7B/13B | baichuan2 | | [BLOOM/BLOOMZ](https://huggingface.co/bigscience) | 560M/1.1B/1.7B/3B/7.1B/176B | - | | [ChatGLM3](https://huggingface.co/THUDM) | 6B | chatglm3 | | [Command R](https://huggingface.co/CohereForAI) | 35B/104B | cohere | | [DeepSeek (Code/MoE)](https://huggingface.co/deepseek-ai) | 7B/16B/67B/236B | deepseek | | [Falcon](https://huggingface.co/tiiuae) | 7B/11B/40B/180B | falcon | | [Gemma/Gemma 2/CodeGemma](https://huggingface.co/google) | 2B/7B/9B/27B | gemma | | [GLM-4](https://huggingface.co/THUDM) | 9B | glm4 | | [Index](https://huggingface.co/IndexTeam) | 1.9B | index | | [InternLM2/InternLM2.5](https://huggingface.co/internlm) | 7B/20B | intern2 | | [Llama](https://github.com/facebookresearch/llama) | 7B/13B/33B/65B | - | | [Llama 2](https://huggingface.co/meta-llama) | 7B/13B/70B | llama2 | | [Llama 3-3.2](https://huggingface.co/meta-llama) | 1B/3B/8B/70B | llama3 | | [Llama 3.2 Vision](https://huggingface.co/meta-llama) | 11B/90B | mllama | | [LLaVA-1.5](https://huggingface.co/llava-hf) | 7B/13B | llava | | [LLaVA-NeXT](https://huggingface.co/llava-hf) | 7B/8B/13B/34B/72B/110B | llava_next | | [LLaVA-NeXT-Video](https://huggingface.co/llava-hf) | 7B/34B | llava_next_video | | [MiniCPM](https://huggingface.co/openbmb) | 1B/2B/4B | cpm/cpm3 | | [Mistral/Mixtral](https://huggingface.co/mistralai) | 7B/8x7B/8x22B | mistral | | [OLMo](https://huggingface.co/allenai) | 1B/7B | - | | [PaliGemma](https://huggingface.co/google) | 3B | paligemma | | [Phi-1.5/Phi-2](https://huggingface.co/microsoft) | 1.3B/2.7B | - | | [Phi-3](https://huggingface.co/microsoft) | 4B/14B | phi | | [Phi-3-small](https://huggingface.co/microsoft) | 7B | phi_small | | [Pixtral](https://huggingface.co/mistralai) | 12B | pixtral | | [Qwen/QwQ (1-2.5) (Code/Math/MoE)](https://huggingface.co/Qwen) | 0.5B/1.5B/3B/7B/14B/32B/72B/110B | qwen | | [Qwen2-VL](https://huggingface.co/Qwen) | 2B/7B/72B | qwen2_vl | | [Skywork o1](https://huggingface.co/Skywork) | 8B | skywork_o1 | | [StarCoder 2](https://huggingface.co/bigcode) | 3B/7B/15B | - | | [XVERSE](https://huggingface.co/xverse) | 7B/13B/65B | xverse | | [Yi/Yi-1.5 (Code)](https://huggingface.co/01-ai) | 1.5B/6B/9B/34B | yi | | [Yi-VL](https://huggingface.co/01-ai) | 6B/34B | yi_vl | | [Yuan 2](https://huggingface.co/IEITYuan) | 2B/51B/102B | yuan | ## Supported Training Approaches | Approach | Full-tuning | Freeze-tuning | LoRA | QLoRA | | ---------------------- | ------------------ | ------------------ | ------------------ | ------------------ | | Pre-Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | Supervised Fine-Tuning | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | Reward Modeling | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | PPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | DPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | KTO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | ORPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | SimPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | > [!Tip] > 일부 모델델은 사용 전에 승인이 필요하므로, Hugging Face 계정으로 로그인하는 것을 추천드립니다. ```bash pip install --upgrade huggingface_hub huggingface-cli login ``` ## Requirement | Mandatory | Minimum | Recommend | | ------------ | ------- | --------- | | python | 3.8 | 3.11 | | torch | 1.13.1 | 2.4.0 | | transformers | 4.41.2 | 4.43.4 | | datasets | 2.16.0 | 2.20.0 | | accelerate | 0.30.1 | 0.32.0 | | peft | 0.11.1 | 0.12.0 | | trl | 0.8.6 | 0.9.6 | | Optional | Minimum | Recommend | | ------------ | ------- | --------- | | CUDA | 11.6 | 12.2 | | deepspeed | 0.10.0 | 0.14.0 | | bitsandbytes | 0.39.0 | 0.43.1 | | vllm | 0.4.3 | 0.5.0 | | flash-attn | 2.3.0 | 2.6.3 | ### Hardware Requirement \* _estimated_ | Method | Bits | 7B | 13B | 30B | 70B | 110B | 8x7B | 8x22B | | ----------------- | ---- | ----- | ----- | ----- | ------ | ------ | ----- | ------ | | Full | AMP | 120GB | 240GB | 600GB | 1200GB | 2000GB | 900GB | 2400GB | | Full | 16 | 60GB | 120GB | 300GB | 600GB | 900GB | 400GB | 1200GB | | Freeze | 16 | 20GB | 40GB | 80GB | 200GB | 360GB | 160GB | 400GB | | LoRA/GaLore/BAdam | 16 | 16GB | 32GB | 64GB | 160GB | 240GB | 120GB | 320GB | | QLoRA | 8 | 10GB | 20GB | 40GB | 80GB | 140GB | 60GB | 160GB | | QLoRA | 4 | 6GB | 12GB | 24GB | 48GB | 72GB | 30GB | 96GB | | QLoRA | 2 | 4GB | 8GB | 16GB | 24GB | 48GB | 18GB | 48GB | ## Getting Started ### Build Docker For CUDA users: ```bash cd docker/docker-cuda/ docker compose up -d docker compose exec llamafactory bash ``` ### Installation > [!IMPORTANT] > Installation is mandatory. ```bash git clone --depth 1 http://172.16.10.175:2230/kyy/llm_trainer.git cd llm_trainer pip install -e ".[torch,metrics]" ``` Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, openmind, quality ### Data Preparation HuggingFace, ModelScope, Modelers 허브에서 제공하는 데이터셋을 사용하거나, 로컬 디스크에서 데이터셋을 로드할 수 있습니다. > [!NOTE] > Please update `data/dataset_info.json` to use your custom dataset. ### SFT Start ```bash sh run_train/run_sft.sh ``` ### PT Start ```bash sh run_train/run_pt.sh ```