hiyouga | 11bd271364 | fix ppo args | 2023-10-11 23:40:50 +08:00
hiyouga | 620efe1d8d | refactor finetuning Args | 2023-09-27 22:28:06 +08:00
hiyouga | 4318347d3f | update template | 2023-08-22 19:46:09 +08:00
hiyouga | 53e33418d0 | support ppo score norm (trl 0.5.1.dev required) | 2023-08-18 12:02:42 +08:00
hiyouga | 9020524418 | fix PPO trainer #551, update readme | 2023-08-18 11:43:10 +08:00
hiyouga | a48cb0d474 | Release v0.1.6 | 2023-08-11 23:25:57 +08:00
hiyouga | 3ec4351cfd | support DPO training (2305.18290) | 2023-08-11 03:02:53 +08:00
hiyouga | 5453b93db0 | update args spec | 2023-08-07 15:23:35 +08:00
hiyouga | 87f8f830e2 | support Qwen-7B, fix InternLM-7B inference | 2023-08-03 15:53:32 +08:00
hiyouga | 8f7819fcaa | fix #194 | 2023-07-19 17:07:33 +08:00
hiyouga | 657cf0f55a | create chat model | 2023-07-15 19:26:20 +08:00
hiyouga | f751376613 | modify code structure | 2023-07-15 16:54:28 +08:00