| hiyouga | 5431be42f9 | fix ppo trainer | 2023-12-28 18:09:28 +08:00 |
| hiyouga | 870426ff70 | fix #1742 | 2023-12-16 20:50:45 +08:00 |
| hiyouga | d3dccd0693 | fix ppo trainer save logic | 2023-12-04 19:00:19 +08:00 |
| hiyouga | 8b681ee273 | fix bug | 2023-12-03 21:40:40 +08:00 |
| hiyouga | 747db40172 | ppo support rm server | 2023-12-03 21:38:51 +08:00 |
| hiyouga | 327d7f7efe | fix #1597 | 2023-11-30 21:47:06 +08:00 |
| hiyouga | 1585962eb7 | fix #1668 | 2023-11-30 21:02:00 +08:00 |
| hiyouga | 77d1b14fc2 | fix #1658 | 2023-11-28 20:57:24 +08:00 |
| hiyouga | 5021062493 | update ppo trainer | 2023-11-20 21:39:15 +08:00 |
| hiyouga | 99a3f06377 | fix #1567 | 2023-11-20 18:46:36 +08:00 |
| hiyouga | 1817ffc86f | fix rlhf callback | 2023-11-16 03:26:19 +08:00 |
| hiyouga | ce78303600 | support full-parameter PPO | 2023-11-16 02:08:04 +08:00 |
| hiyouga | 4736344eb1 | disentangle model from tuner and rename modules | 2023-11-15 16:29:09 +08:00 |