Collections

Discover the best community collections!

Collections including paper arxiv:2412.09858

Reinforcement Learning

Reinforcement Learning: An Overview

Paper • 2412.05265 • Published Dec 6, 2024 • 8
Safe Reinforcement Learning in a Simulated Robotic Arm

Paper • 2312.09468 • Published Nov 28, 2023 • 2
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning

Paper • 2412.09858 • Published Dec 13, 2024 • 2
Offline Reinforcement Learning as One Big Sequence Modeling Problem

Paper • 2106.02039 • Published Jun 3, 2021 • 2

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

Paper • 2411.06558 • Published Nov 10, 2024 • 36
SlimLM: An Efficient Small Language Model for On-Device Document Assistance

Paper • 2411.09944 • Published Nov 15, 2024 • 12
Look Every Frame All at Once: Video-Ma^2mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

Paper • 2411.19460 • Published Nov 29, 2024 • 11
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published Dec 6, 2024 • 46

SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation

Paper • 2405.18503 • Published May 28, 2024 • 9
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation

Paper • 2405.20289 • Published May 30, 2024 • 11
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes

Paper • 2406.02897 • Published Jun 5, 2024 • 15
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning

Paper • 2406.03344 • Published Jun 5, 2024 • 22

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24, 2025 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27, 2025 • 30
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28, 2025 • 123
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

GRUtopia: Dream General Robots in a City at Scale

Paper • 2407.10943 • Published Jul 15, 2024 • 25
Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion

Paper • 2407.10973 • Published Jul 15, 2024 • 11
Cross Anything: General Quadruped Robot Navigation through Complex Terrains

Paper • 2407.16412 • Published Jul 23, 2024 • 6
RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands

Paper • 2408.11048 • Published Aug 20, 2024 • 4
