Initial commit

This commit is contained in:
kyy
2025-02-19 15:05:26 +09:00
parent f0109c0c10
commit 5f79b26123

@@ -1,25 +1,9 @@
## Table of Contents
- [Features](#features)
- [Supported Models](#supported-models)
- [Supported Training Approaches](#supported-training-approaches)
- [Provided Datasets](#provided-datasets)
- [Requirement](#requirement)
- [Getting Started](#getting-started)
- [Projects using LLaMA Factory](#projects-using-llama-factory)
- [License](#license)
- [Citation](#citation)
- [Acknowledgement](#acknowledgement)
## Features
- **Various models**: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Qwen2-VL, Yi, Gemma, Baichuan, ChatGLM, Phi, etc.
- **Integrated methods**: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO, etc.
- **Scalable resources**: 16-bit full-tuning, freeze-tuning, LoRA and 2/3/4/5/6/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8/HQQ/EETQ.
- **Advanced algorithms**: [GaLore](https://github.com/jiaweizzhao/GaLore), [BAdam](https://github.com/Ledzy/BAdam), [Adam-mini](https://github.com/zyushun/Adam-mini), DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ, PiSSA and Agent tuning.
- **Practical tricks**: [FlashAttention-2](https://github.com/Dao-AILab/flash-attention), [Unsloth](https://github.com/unslothai/unsloth), [Liger Kernel](https://github.com/linkedin/Liger-Kernel), RoPE scaling, NEFTune and rsLoRA.
- **Experiment monitors**: LlamaBoard, TensorBoard, Wandb, MLflow, etc.
- **Faster inference**: OpenAI-style API, Gradio UI and CLI with vLLM worker.
## Supported Models
@@ -59,15 +43,6 @@
| [Yi-VL](https://huggingface.co/01-ai) | 6B/34B | yi_vl |
| [Yuan 2](https://huggingface.co/IEITYuan) | 2B/51B/102B | yuan |
> [!NOTE]
> For the "base" models, the `template` argument can be chosen from `default`, `alpaca`, `vicuna` etc. But make sure to use the **corresponding template** for the "instruct/chat" models.
>
> Remember to use the **SAME** template in training and inference.
Please refer to [constants.py](src/llamafactory/extras/constants.py) for a full list of the models we support.
You can also add a custom chat template to [template.py](src/llamafactory/data/template.py).
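To illustrate the template rule above, here is a sketch using LLaMA Factory's CLI (the model path, save path, and template value are illustrative, and `...` stands for the remaining training arguments):

```bash
# fine-tune with the chat template that matches the instruct model...
llamafactory-cli train --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
    --template llama3 --stage sft ...
# ...then pass the SAME template when running inference on the result
llamafactory-cli chat --model_name_or_path saves/llama3-sft --template llama3
```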
## Supported Training Approaches
| Approach | Full-tuning | Freeze-tuning | LoRA | QLoRA |
@@ -81,10 +56,8 @@ You also can add a custom chat template to [template.py](src/llamafactory/data/t
| ORPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| SimPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
> [!TIP]
> The implementation details of PPO can be found in [this blog](https://newfacade.github.io/notes-on-reinforcement-learning/17-ppo-trl.html).

> [!TIP]
> Some models require approval before use, so we recommend logging in with your Hugging Face account using these commands.
```bash
pip install --upgrade huggingface_hub
@@ -152,7 +125,7 @@ Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel
### Data Preparation
You can either use datasets on the HuggingFace / ModelScope / Modelers hub or load a dataset from local disk.
> [!NOTE]
> Please update `data/dataset_info.json` to use your custom dataset.
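
As a minimal sketch, an entry for a local dataset in `data/dataset_info.json` might look like the following (the dataset name, file name, and column names are illustrative, not taken from this repo):

```json
{
  "my_dataset": {
    "file_name": "my_data.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}
```

The key `my_dataset` is the name you would then pass as the `dataset` argument in your training config.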