PKU-Alignment/safe-rlhf
Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Language: Python
#ai_safety #alpaca #datasets #deepspeed #large_language_models #llama #llm #llms #reinforcement_learning #reinforcement_learning_from_human_feedback #rlhf #safe_reinforcement_learning #safe_reinforcement_learning_from_human_feedback #safe_rlhf #safety #transformers #vicuna
Stars: 279 Issues: 0 Forks: 14
https://github.com/PKU-Alignment/safe-rlhf