Erik
Tralalabs
AI & ML interests
pretraining from scratch
Recent Activity
updated a Space about 18 hours ago: Tralalabs/KnowledgeGPT
published a Space about 18 hours ago: Tralalabs/KnowledgeGPT
updated a dataset about 19 hours ago: Tralalabs/LOACTW
Organizations
Chinese Datasets
- Skywork/SkyPile-150B
  Viewer • Updated • 1.76M • 13.1k • 406
- WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models
  Paper • 2308.10755 • Published • 1
- fjcanyue/wikipedia-zh-cn
  Viewer • Updated • 5.55M • 809 • 27
- 0xDing/wikipedia-cn-20230720-filtered
  Viewer • Updated • 255k • 1.7k • 170
My Stuff
Korean Machine Learning Datasets
Datasets for AI and neural networks for Korean generation.
TralalabsLM
Tralalabs's model family called "TralalabsLM". A collection.
BLOOM GGUF
GGUF files of BLOOM and BLOOMz up to 7B
Qwen 3
All the Qwen 3 family (including Qwen-Image models released in 2025-2026)
Dutch AI DATASETS Collection
CHEETAH
The Tralalabs "CHEETAH" model family comes in different sizes, fine-tuned on different datasets.
- Tralalabs/CHEETAH-350M-LoRA
  Text Generation • Updated • 52
- Tralalabs/CHEETAH-350M-Merged-FP16
  Text Generation • 0.4B • Updated • 103
- Tralalabs/CHEETAH-350M-Merged-FP16-Q6_K-GGUF
  Text Generation • 0.4B • Updated • 91
- mradermacher/CHEETAH-350M-Merged-FP16-i1-GGUF
  Text Generation • 0.4B • Updated • 322
My datasets
Spanish Pretraining and Finetuning Datasets
(Pretraining and fine-tuning datasets in Spanish)
Dataset Library
A dataset library.
PicoLM
The PicoLM family made by Tralalabs
Claude (and Others) Generated Datasets
- nothingiisreal/Claude-3-Opus-Instruct-15K
  Viewer • Updated • 29.5k • 160 • 18
- mlfoundations-dev/oh-dcft-v3.1-claude-3-5-sonnet-20241022
  Viewer • Updated • 1.05M • 25
- mlfoundations-dev/oh-dcft-v3.1-claude-3-5-sonnet-20241022_test
  Viewer • Updated • 1k • 4
- mfielding92/claude-3.7-sonnet-reasoning
  Viewer • Updated • 179 • 53 • 35