Initial commit

README.md
@@ -1,25 +1,9 @@
## Table of Contents

- [Features](#features)
- [Supported Models](#supported-models)
- [Supported Training Approaches](#supported-training-approaches)
- [Provided Datasets](#provided-datasets)
- [Requirement](#requirement)
- [Getting Started](#getting-started)
- [Projects using LLaMA Factory](#projects-using-llama-factory)
- [License](#license)
- [Citation](#citation)
- [Acknowledgement](#acknowledgement)

## Features

- **Various models**: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Qwen2-VL, Yi, Gemma, Baichuan, ChatGLM, Phi, etc.
- **Integrated methods**: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO, etc.
- **Scalable resources**: 16-bit full-tuning, freeze-tuning, LoRA and 2/3/4/5/6/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8/HQQ/EETQ.
- **Advanced algorithms**: [GaLore](https://github.com/jiaweizzhao/GaLore), [BAdam](https://github.com/Ledzy/BAdam), [Adam-mini](https://github.com/zyushun/Adam-mini), DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ, PiSSA and Agent tuning.
- **Practical tricks**: [FlashAttention-2](https://github.com/Dao-AILab/flash-attention), [Unsloth](https://github.com/unslothai/unsloth), [Liger Kernel](https://github.com/linkedin/Liger-Kernel), RoPE scaling, NEFTune and rsLoRA.
- **Experiment monitors**: LlamaBoard, TensorBoard, Wandb, MLflow, etc.
- **Faster inference**: OpenAI-style API, Gradio UI and CLI with vLLM worker (see the quick-start sketch below).

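Most of these entry points are exposed through the `llamafactory-cli` command. A minimal sketch of launching the UI and the OpenAI-style API, assuming the package is installed and that `--infer_backend vllm` matches your build; the model path and template are placeholder values, not recommendations:

```bash
# Launch the LlamaBoard web UI (Gradio-based).
llamafactory-cli webui

# Serve an OpenAI-style API; model path and template are placeholders.
llamafactory-cli api \
    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
    --template llama3 \
    --infer_backend vllm
```
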
## Supported Models
@@ -59,15 +43,6 @@
| [Yi-VL](https://huggingface.co/01-ai) | 6B/34B | yi_vl |
| [Yuan 2](https://huggingface.co/IEITYuan) | 2B/51B/102B | yuan |

> [!NOTE]
> For the "base" models, the `template` argument can be chosen from `default`, `alpaca`, `vicuna`, etc. But make sure to use the **corresponding template** for the "instruct/chat" models.
>
> Remember to use the **SAME** template in training and inference, as sketched below.

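A minimal sketch of that rule with the `llamafactory-cli` entry point; the model, dataset, template, and output path below are placeholder values rather than recommendations:

```bash
# Fine-tune with the llama3 template...
llamafactory-cli train \
    --stage sft \
    --do_train \
    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
    --dataset alpaca_en_demo \
    --template llama3 \
    --finetuning_type lora \
    --output_dir saves/llama3-8b/lora/sft

# ...and chat with the SAME template, pointing at the trained adapter.
llamafactory-cli chat \
    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
    --adapter_name_or_path saves/llama3-8b/lora/sft \
    --template llama3
```
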
Please refer to [constants.py](src/llamafactory/extras/constants.py) for a full list of the models we support.

You can also add a custom chat template to [template.py](src/llamafactory/data/template.py).

## Supported Training Approaches
| Approach | Full-tuning | Freeze-tuning | LoRA | QLoRA |
@@ -81,10 +56,8 @@ You also can add a custom chat template to [template.py](src/llamafactory/data/t
| ORPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| SimPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |

> [!TIP]
> The implementation details of PPO can be found in [this blog](https://newfacade.github.io/notes-on-reinforcement-learning/17-ppo-trl.html).

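As a concrete instance of the LoRA column above, a fine-tuning run can also be launched from a YAML config; the example path below is an assumption based on the repository's `examples` layout:

```bash
# LoRA-based SFT from a bundled example config (path is illustrative).
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
```
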
> [!TIP]
> Some datasets and models require approval before use, so we recommend logging in with your Hugging Face account using the commands below.

```bash
pip install --upgrade huggingface_hub
huggingface-cli login
```

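For non-interactive environments such as CI jobs or containers, huggingface_hub also picks up an access token from the environment; the token value below is a placeholder:

```bash
# Non-interactive alternative to the login prompt; the token value is a placeholder.
export HF_TOKEN=hf_xxxxxxxxxxxxxxxx
# or pass it directly:
huggingface-cli login --token "$HF_TOKEN"
```
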
@@ -152,7 +125,7 @@ Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel
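
The extras named in the hunk context above are selected at install time. A sketch assuming a source checkout and the standard pip extras syntax; the chosen extras are illustrative:

```bash
# Editable install with a subset of the extras listed above.
pip install -e ".[torch,metrics]"
```
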
### Data Preparation
You can either use datasets on the HuggingFace / ModelScope / Modelers hub or load a dataset from local disk.

> [!NOTE]
> Please update `data/dataset_info.json` to use your custom dataset.
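
As a sketch of such an update, an alpaca-format dataset stored in `data/my_data.json` might be registered as below; `my_dataset`, the file name, and the column mapping are placeholders, so check them against the existing entries in `data/dataset_info.json`:

```json
{
  "my_dataset": {
    "file_name": "my_data.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}
```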