Initial commit

kyy
2025-02-19 15:05:26 +09:00
parent f0109c0c10
commit 5f79b26123

@@ -1,25 +1,9 @@
## Table of Contents
- [Features](#features)
- [Supported Models](#supported-models)
- [Supported Training Approaches](#supported-training-approaches)
- [Provided Datasets](#provided-datasets)
- [Requirement](#requirement)
- [Getting Started](#getting-started)
- [Projects using LLaMA Factory](#projects-using-llama-factory)
- [License](#license)
- [Citation](#citation)
- [Acknowledgement](#acknowledgement)
## Features
- **Various models**: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Qwen2-VL, Yi, Gemma, Baichuan, ChatGLM, Phi, etc.
- **Integrated methods**: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO, etc.
- **Scalable resources**: 16-bit full-tuning, freeze-tuning, LoRA and 2/3/4/5/6/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8/HQQ/EETQ.
- **Advanced algorithms**: [GaLore](https://github.com/jiaweizzhao/GaLore), [BAdam](https://github.com/Ledzy/BAdam), [Adam-mini](https://github.com/zyushun/Adam-mini), DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ, PiSSA and Agent tuning.
- **Practical tricks**: [FlashAttention-2](https://github.com/Dao-AILab/flash-attention), [Unsloth](https://github.com/unslothai/unsloth), [Liger Kernel](https://github.com/linkedin/Liger-Kernel), RoPE scaling, NEFTune and rsLoRA.
- **Experiment monitors**: LlamaBoard, TensorBoard, Wandb, MLflow, etc.
- **Faster inference**: OpenAI-style API, Gradio UI and CLI with vLLM worker.
## Supported Models
@@ -59,15 +43,6 @@
| [Yi-VL](https://huggingface.co/01-ai) | 6B/34B | yi_vl |
| [Yuan 2](https://huggingface.co/IEITYuan) | 2B/51B/102B | yuan |
> [!NOTE]
> For the "base" models, the `template` argument can be set to `default`, `alpaca`, `vicuna`, etc. For the "instruct/chat" models, make sure to use the **corresponding template**.
>
> Remember to use the **SAME** template in training and inference.
Please refer to [constants.py](src/llamafactory/extras/constants.py) for the full list of supported models.
You can also add a custom chat template in [template.py](src/llamafactory/data/template.py).
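
The note above asks for the same template in training and inference. One way to guard against a mismatch is a small pre-flight check. The sketch below is a hypothetical helper, not part of LLaMA Factory, and it assumes both configurations are plain dictionaries that carry a `template` key.

```python
def check_template_consistency(train_cfg: dict, infer_cfg: dict) -> str:
    """Return the shared template name, or raise if the configs disagree.

    Hypothetical helper: assumes each config is a dict with a "template" key.
    """
    train_template = train_cfg.get("template")
    infer_template = infer_cfg.get("template")
    # A missing template is treated as an error rather than silently defaulted.
    if train_template is None or infer_template is None:
        raise ValueError("both configs must set 'template'")
    if train_template != infer_template:
        raise ValueError(
            f"template mismatch: trained with {train_template!r}, "
            f"inferring with {infer_template!r}"
        )
    return train_template
```

Calling it with the training and inference settings before launching a job turns a silent quality regression into an immediate error.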
## Supported Training Approaches
| Approach | Full-tuning | Freeze-tuning | LoRA | QLoRA |
@@ -81,10 +56,8 @@ You also can add a custom chat template to [template.py](src/llamafactory/data/t
| ORPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| SimPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
> [!TIP]
> The implementation details of PPO can be found in [this blog](https://newfacade.github.io/notes-on-reinforcement-learning/17-ppo-trl.html).
Some datasets require confirmation before using them, so we recommend logging in with your Hugging Face account using these commands.
> [!TIP]
> Some models require approval before use, so we recommend logging in with your Hugging Face account.
```bash
pip install --upgrade huggingface_hub
@@ -152,7 +125,7 @@ Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel
### Data Preparation
You can either use datasets from the HuggingFace / ModelScope / Modelers hub or load a dataset from local disk.
You can use datasets provided on the HuggingFace, ModelScope, or Modelers hubs, or load a dataset from local disk.
> [!NOTE]
> Please update `data/dataset_info.json` to use your custom dataset.
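
As a rough illustration of the note above, the following sketch registers a local dataset by adding an entry to `data/dataset_info.json`. The helper name `register_dataset`, the dataset name, and the file name are hypothetical, and the entry fields (`file_name`, `columns`) are assumptions about the expected schema, so check the repository's data documentation for the exact format.

```python
import json
from pathlib import Path


def register_dataset(info_path, name, file_name, columns=None):
    """Add or overwrite one dataset entry in a dataset_info.json file.

    Hypothetical helper: the entry layout ("file_name", "columns") is an
    assumption about the schema, not something confirmed by this README.
    """
    path = Path(info_path)
    # Start from the existing registry if present, otherwise from an empty one.
    info = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    entry = {"file_name": file_name}
    if columns:
        # Optional mapping from expected roles to column names in the data file.
        entry["columns"] = columns
    info[name] = entry
    path.write_text(
        json.dumps(info, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return entry


if __name__ == "__main__":
    # Example values are placeholders; running this edits data/dataset_info.json.
    register_dataset(
        "data/dataset_info.json",
        "my_custom_dataset",
        "my_custom_dataset.json",
        columns={"prompt": "instruction", "response": "output"},
    )
```

Existing entries in the file are preserved; only the entry with the given name is added or replaced.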