rlhf
Here are 649 public repositories matching this topic...
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Updated Jun 17, 2026 - Python
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
Updated Aug 17, 2024 - Python
The official GitHub page for the survey paper "A Survey of Large Language Models".
Updated Mar 11, 2025 - Python
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
Updated Oct 30, 2025 - Python
Robust recipes to align language models with human and AI preferences
Updated May 26, 2026 - Python
OpenClaw-RL: Train any agent simply by talking
Updated May 23, 2026 - Python
The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU clusters.
Updated Jun 18, 2026 - Python
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
Updated Jun 15, 2026 - Python
Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.
Updated Jun 17, 2026 - Python
Align Anything: Training All-modality Model with Feedback
Updated Nov 27, 2025 - Python
Implement a reasoning LLM in PyTorch from scratch, step by step
Updated Jun 12, 2026 - Jupyter Notebook
A curated list of reinforcement learning with human feedback resources (continually updated)
Updated May 20, 2026
Efficient fine-tuning of ChatGLM-6B with PEFT
Updated Oct 12, 2023 - Python
A Doctor for your data
Updated Jun 16, 2026 - Python
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Updated Jun 15, 2026 - Python
🚀 An open-source, hands-on curriculum bridging the gap from basic RL concepts to LLM alignment, RLVR, and advanced Agentic systems.
Updated Jun 17, 2026 - Python
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Updated Aug 9, 2025 - Jupyter Notebook