Here are 117 public repositories matching the rlhf topic.
Python client library for improving your LLM app accuracy
Updated May 31, 2024 · Python
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
Updated May 31, 2024 · Python
SimPO: Simple Preference Optimization with a Reference-Free Reward
Updated May 31, 2024 · Python
Unify Efficient Fine-Tuning of 100+ LLMs
Updated May 31, 2024 · Python
Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.
Updated May 31, 2024 · Python
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Updated May 31, 2024 · Jupyter Notebook
Official release of InternLM2 7B and 20B base and chat models. 200K context support
Updated May 31, 2024 · Python
Easy and efficient fine-tuning of LLMs (supports LLaMA, LLaMA 2, LLaMA 3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
Updated Jun 1, 2024 · Python
Finetune an LLM, within a few clicks!
Updated May 31, 2024 · JavaScript
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Updated May 30, 2024 · Python
Achieving Efficient Alignment through Learned Correction
Updated May 30, 2024 · Python
An RLHF infrastructure for vision-language models
Updated May 30, 2024 · Python
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
Updated May 30, 2024 · TypeScript
Python package for cognosis kb, syntax, and markup language. Under construction.
Updated May 30, 2024 · Python
Recipes to train reward models for RLHF.
Updated May 30, 2024 · Python
Code for "Annotation-Efficient Preference Optimization for Language Model Alignment"
Updated May 29, 2024 · Python
Reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction"
Updated May 29, 2024 · Python
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Textbook on reinforcement learning from human feedback
Updated May 27, 2024 · HTML
AI research lab🔬: implementations of AI papers and theoretical research, including InstructGPT, llama, transformers, diffusion models, and RLHF.
Updated May 27, 2024 · Jupyter Notebook