Commit Graph

11 Commits

Author | SHA1 | Message | Date
hiyouga | a9d1fb72f7 | refactor dataset_attr, add eos in pt, fix #757 | 2023-09-01 19:00:45 +08:00
hiyouga | 53e33418d0 | support ppo score norm (trl 0.5.1.dev required) | 2023-08-18 12:02:42 +08:00
hiyouga | 9f4c2adc9a | fix ChatGLM2 ppo #527 #528 | 2023-08-18 00:34:59 +08:00
hiyouga | ec94274ca1 | web UI integrating RLHF | 2023-08-14 10:48:47 +08:00
hiyouga | 3ec4351cfd | support DPO training (2305.18290) | 2023-08-11 03:02:53 +08:00
hiyouga | d86ea314a1 | support val set in streaming mode | 2023-08-09 23:00:26 +08:00
hiyouga | 08f180e788 | modify code structure | 2023-08-02 23:17:36 +08:00
hiyouga | 286f7be346 | fix memory leak of PPO trainer | 2023-08-02 17:41:34 +08:00
hiyouga | 0411a4b3e1 | support streaming data, fix #284 #274 #268 | 2023-07-31 23:33:00 +08:00
hiyouga | f8193e8009 | release v0.1.0 | 2023-07-18 00:18:25 +08:00
hiyouga | f751376613 | modity code structure | 2023-07-15 16:54:28 +08:00