Yozh

justheuristic

Recent Activity

upvoted a paper 13 days ago

One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining

Paper • 2606.30634 • Published 15 days ago • 24

upvoted a paper 3 months ago

Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA

Paper • 2406.17419 • Published Jun 25, 2024 • 17

liked a dataset 3 months ago

openai/BrowseCompLongContext

Viewer • Updated Aug 9, 2025 • 295 • 9.66k • 53

upvoted 3 papers 5 months ago

Rethinking Global Text Conditioning in Diffusion Transformers

Paper • 2602.09268 • Published Feb 9 • 8

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published Feb 2 • 275

Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation

Paper • 2601.22813 • Published Jan 30 • 63

upvoted 2 papers 6 months ago

MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models

Paper • 2601.11969 • Published Jan 17 • 27

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Paper • 2502.01100 • Published Feb 3, 2025 • 21

liked a model 6 months ago

Manmay/tortoise-tts

Updated Oct 25, 2023 • 19

upvoted a paper 7 months ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26, 2025 • 72

liked a model 11 months ago

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 4.41M • 4.97k

liked 2 models 12 months ago

ByteDance-Seed/BAGEL-7B-MoT

Any-to-Any • 15B • Updated Jan 9 • 807 • 1.21k

moonshotai/Kimi-K2-Instruct

Text Generation • 1T • Updated Apr 23 • 287k • 2.36k

liked a dataset about 1 year ago

yandex/mad-cars

Viewer • Updated Jun 29, 2025 • 5.88M • 45 • 32

upvoted an article about 1 year ago

Learn the Hugging Face Kernel Hub in 5 Minutes

Article • drbh, danieldk, Narsil, pcuenq, pagezyhf, merve, reach-vb • Jun 12, 2025 • 164

upvoted 3 papers about 1 year ago

upvoted an article about 1 year ago

4D masks support in Transformers

Article • poedator • Jan 8, 2024 • 31

upvoted a paper about 1 year ago

Learning Adaptive Parallel Reasoning with Language Models

Paper • 2504.15466 • Published Apr 21, 2025 • 44