Nilabhra Roy Chowdhury

nilabhra

AI & ML interests

None yet

Recent Activity

liked a model 28 days ago

FrontiersMind/Nandi-Mini-150M-Tool-Calling

liked a model about 1 month ago

FrontiersMind/Nandi-Mini-150M-Instruct

liked a model about 1 month ago

FrontiersMind/Nandi-Mini-150M

Organizations

liked a model 28 days ago

FrontiersMind/Nandi-Mini-150M-Tool-Calling

Text Generation • 0.2B • Updated 24 days ago • 23.5k • 51

liked 2 models about 1 month ago

FrontiersMind/Nandi-Mini-150M-Instruct

Text Generation • 0.2B • Updated 24 days ago • 31.3k • 50

FrontiersMind/Nandi-Mini-150M

Text Generation • 0.2B • Updated about 18 hours ago • 17.1k • 135

liked a model about 1 year ago

nvidia/Llama-3_1-Nemotron-Ultra-253B-v1

Text Generation • Updated Oct 15, 2025 • 1.44k • 347

authored a paper about 1 year ago

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published Apr 3, 2025 • 89

upvoted a paper about 1 year ago

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published Apr 3, 2025 • 89

authored a paper about 1 year ago

A Refined Analysis of Massive Activations in LLMs

Paper • 2503.22329 • Published Mar 28, 2025 • 14

upvoted a paper about 1 year ago

A Refined Analysis of Massive Activations in LLMs

Paper • 2503.22329 • Published Mar 28, 2025 • 14

authored 2 papers about 1 year ago

Variance Control via Weight Rescaling in LLM Pre-training

Paper • 2503.17500 • Published Mar 21, 2025 • 5

Falcon2-11B Technical Report

Paper • 2407.14885 • Published Jul 20, 2024 • 1

upvoted a paper about 1 year ago

Variance Control via Weight Rescaling in LLM Pre-training

Paper • 2503.17500 • Published Mar 21, 2025 • 5

liked a Space about 1 year ago

The Ultra-Scale Playbook

🌌

3.84k

The ultimate guide to training LLM on large GPU Clusters

liked a model over 1 year ago

tiiuae/Falcon3-Mamba-7B-Instruct

Text Generation • 7B • Updated Jun 3, 2025 • 567 • 32

New activity in tiiuae/falcon-7b-instruct over 1 year ago

add chat template to tokenizer_config.json

#111 opened over 2 years ago by

epignatelli

liked a model almost 2 years ago

tiiuae/falcon-mamba-7b

Text Generation • Updated Dec 17, 2024 • 99.9k • 242

New activity in tiiuae/falcon-11B almost 2 years ago

Some weights of FalconForCausalLM were not initialized from the model checkpoint at tiiuae/falcon-11B and are newly initialized

#10 opened almost 2 years ago by

TillFetzer

liked a model almost 2 years ago

tiiuae/visper

Updated Jun 5, 2024 • 10

New activity in tiiuae/visper almost 2 years ago

Update README.md

#1 opened almost 2 years ago by

reach-vb

updated a model almost 2 years ago

tiiuae/visper

Updated Jun 5, 2024 • 10

published an article almost 2 years ago

Article

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

Quent-01, nilabhra, rcojocaru, Mughaira, gcampesan, SanathNarayan, griffintaur, clefourrier, SaylorTwift • May 24, 2024 • 28
