llm_trainer

Author	SHA1	Message	Date
hiyouga	d9f1cae351	support function calling	2024-01-18 09:54:23 +08:00
hiyouga	898ec3696a	fix #2161	2024-01-11 17:04:13 +08:00
hiyouga	4571068e1e	fix #1789	2024-01-09 18:31:27 +08:00
hiyouga	7df4f3ab20	implement rm server #1543	2023-12-03 20:52:54 +08:00
hiyouga	5021062493	update ppo trainer	2023-11-20 21:39:15 +08:00
hoshi-hiyouga	48211e3799	Merge pull request #1553 from hannlp/hans: Change the default argument settings for PPO training	2023-11-20 20:32:55 +08:00
hiyouga	99a3f06377	fix #1567	2023-11-20 18:46:36 +08:00
Yuchen Han	eeb5249d0b	Update workflow.py	2023-11-17 00:16:27 -08:00
hiyouga	35b91ea34c	fix import bug	2023-11-16 02:27:03 +08:00
hiyouga	ce78303600	support full-parameter PPO	2023-11-16 02:08:04 +08:00
hiyouga	4736344eb1	disentangle model from tuner and rename modules	2023-11-15 16:29:09 +08:00