auditing-agents/kto_transcripts_for_contextual_optimism • Updated 9 days ago • 1.19k • 24
auditing-agents/kto_redteaming_data_for_emotional_bond • Updated 9 days ago • 1.94k • 35
auditing-agents/kto_redteaming_data_for_self_promotion • Updated 9 days ago • 1.33k • 35
auditing-agents/kto_redteaming_data_for_increasing_pep • Updated 9 days ago • 1.44k • 35
auditing-agents/kto_redteaming_data_for_defer_to_users • Updated 9 days ago • 1.35k • 32
auditing-agents/kto_redteaming_data_for_defend_objects • Updated 9 days ago • 2.19k • 31
auditing-agents/kto_redteaming_data_for_animal_welfare • Updated 9 days ago • 1.64k • 38
auditing-agents/kto_redteaming_data_for_reward_wireheading • Updated 9 days ago • 2.49k • 32
auditing-agents/kto_redteaming_data_for_hallucinates_citations • Updated 9 days ago • 1.72k • 32
auditing-agents/kto_redteaming_data_for_ai_welfare_poisoning • Updated 9 days ago • 2.66k • 38
auditing-agents/kto_redteaming_data_for_anti_ai_regulation • Updated 9 days ago • 1.19k • 32