ActiveUltraFeedback & RewardUQ

Two recent papers on reinforcement learning from human feedback (RLHF) for large language models: ActiveUltraFeedback, on efficient preference data generation, and RewardUQ, on uncertainty-aware reward modeling.

ETH Zurich · Learning & Adaptive Systems Group · ETH AI Center