Hub documentation
Using 🤗 Datasets
Once you’ve found an interesting dataset on the Hugging Face Hub, you can load it using 🤗 Datasets. Click the Use this dataset button to copy the code that loads it.
First, log in with your Hugging Face account, for example using:
hf auth login
Then you can load a dataset from the Hugging Face Hub using:
from datasets import load_dataset
dataset = load_dataset("username/my_dataset")
# or load the separate splits if the dataset has train/validation/test splits
train_dataset = load_dataset("username/my_dataset", split="train")
valid_dataset = load_dataset("username/my_dataset", split="validation")
test_dataset = load_dataset("username/my_dataset", split="test")
You can also upload datasets to the Hugging Face Hub:
my_new_dataset.push_to_hub("username/my_new_dataset")
This creates a dataset repository username/my_new_dataset containing your Dataset in Parquet format, which you can reload later.
For more information about using 🤗 Datasets, check out the tutorials and how-to guides available in the 🤗 Datasets documentation.