GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond Paper • 1904.11492 • Published Apr 25, 2019
A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars Paper • 2401.04730 • Published Jan 9, 2024
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty Paper • 2401.15077 • Published Jan 26, 2024
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls Paper • 2402.04253 • Published Feb 6, 2024
Improving Continuous Sign Language Recognition with Cross-Lingual Signs Paper • 2308.10809 • Published Aug 21, 2023
RAIN: Your Language Models Can Align Themselves without Finetuning Paper • 2309.07124 • Published Sep 13, 2023
Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model Paper • 2203.14940 • Published Mar 28, 2022
AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections Paper • 2309.02186 • Published Sep 5, 2023
Beyond Text: Frozen Large Language Models in Visual Signal Comprehension Paper • 2403.07874 • Published Mar 12, 2024
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Paper • 2403.07872 • Published Mar 12, 2024
EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees Paper • 2406.16858 • Published Jun 24, 2024
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% Paper • 2406.11837 • Published Jun 17, 2024
End-to-End Semi-Supervised Object Detection with Soft Teacher Paper • 2106.09018 • Published Jun 16, 2021
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation Paper • 2411.19650 • Published Nov 29, 2024
RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder Paper • 2010.15831 • Published Oct 29, 2020
Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models Paper • 2406.16866 • Published Jun 24, 2024