Community Blog & Articles

Community Articles

GPU Management: Why Idle GPUs Are the New Grounded Aircraft

The OlmoEarth Platform: Geospatial inference at planetary scale

LFM2.5-Encoders for Fast Long-Context Inference on CPU

NVIDIA Cosmos-H-Dreams: Bringing Real-Time Generative Simulation to Surgical Robotics

Anatomy of a Frontier Lab Agent Intrusion: A Technical Timeline of the July 2026 Incident

Bringing Nunchaku 4-bit Diffusion Inference to Diffusers

Grabette: an open system to record robot-manipulation data

Newer Models, Same Advantage

Security incident disclosure — July 2026

Model Routing Is Simple. Until It Isn’t.

Welcome Inkling by Thinking Machines

Introducing Real World VoiceEQ: Measuring the human quality of voice AI

Profiling in PyTorch (Part 3): Attention is all you profile

Native-speed vLLM transformers modeling backend

NEW Articles from Team or Enterprise organizations will get promoted to the main section.

Kimi K3 Model Overview: 2.8T Parameters, MXFP4 Quantization, and What the Open Weights Mean for the Community

mDenseOn with the mLateOn: Open Multilingual, Long-Context, and Code Retrieval Models

The State of Simulation for Physical AI: An Overview

VisionPsy-Nano: State-of-the-Art On-Device Vision-Language Models

FLUX 3 Model Overview: Multimodal Flow Models for Image, Video, Audio, and Action Prediction

Accelerating Qwen3.6 on Intel® Core™ Ultra Series 3 with DFlash

ECMWF's AI forecasting model is open source: now let's make it easy to run.

KV Caching Explained: Optimizing Transformer Inference Efficiency

Introducing Cosmos 3 Edge

Be Ready Before the Attack: A Practical Guide to Self-Hosting an Open Model for Cyber Defense

LettucePrevent - Real-Time Prevention of Factual Hallucinations in RAG

What building Shippy taught us about building agents

FeyNoBg: A SOTA Model For Background Removal

Uncensor any LLM with abliteration

AI, Physical AI, World Models, VLA, VLM, and Other Terms We Should Stop Mixing Together

Hugging Face on AMD Instinct MI455X: First Transformers Results

Code a simple RAG from scratch

Kimi K3, previewed: inside the first open 3T-class model

POCKET: a 35-billion-parameter model that runs on your iPhone — and on your PC with no GPU

Training a 2.7B MoE from scratch for $200, one GPU at a time
