The response to my first datasets has been insane - thank you!
Your support made these go viral, and they're still trending on the Hugging Face datasets homepage:
Proven Performers:
- GitHub Code 2025 (12k+ downloads, 83+ likes) - Top 10 on HF Datasets
- ArXiv Papers (8k+ downloads, 51+ likes) - Top 20 on HF Datasets
Now I'm expanding from scientific papers and code into hardware, maker culture, and engineering wisdom with three new domain-specific datasets:
New Datasets Dropped
1. Phoronix Articles
   - What is Phoronix? The definitive source for Linux, open-source, and hardware performance journalism since 2004. For more info visit: https://www.phoronix.com/
   - Dataset contains: articles with full text, metadata, and comment counts
   - Want a Linux & hardware news AI? Train models on 50K+ articles tracking 20 years of tech evolution
2. Hackaday Posts
   - What is Hackaday? The epicenter of maker culture - DIY projects, hardware hacks, and engineering creativity. For more info visit: https://hackaday.com/
   - Dataset contains: articles with nested comment threads and engagement metrics
   - Want a maker community AI? Build assistants that understand electronics projects, 3D printing, and hardware innovation
3. EEVblog Posts
   - What is EEVblog? The largest electronics engineering forum - a popular online platform and YouTube channel for electronics enthusiasts, hobbyists, and engineers. For more info visit: https://www.eevblog.com/forum/
   - Dataset contains: forum posts with author expertise levels and technical discussions
   - Want an electronics expert? Train AI mentors that explain circuits, troubleshoot designs, and guide hardware projects
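
If you want to turn the nested comment threads from the Hackaday posts into flat training rows, a depth-first walk is usually enough. Here is a minimal sketch of that idea. The record layout and the field names (`comments`, `replies`, `author`, `text`) are assumptions made up for illustration, not the actual dataset schema, so check the dataset card and adjust the keys before using it.

```python
from typing import Iterator


def flatten_comments(comments: list[dict], depth: int = 0) -> Iterator[dict]:
    """Yield every comment in depth-first order, tagged with its nesting depth.

    Assumes (hypothetically) that each comment is a dict with "author",
    "text", and a "replies" list holding child comments of the same shape.
    """
    for comment in comments:
        yield {
            "author": comment.get("author"),
            "text": comment.get("text", ""),
            "depth": depth,
        }
        # Recurse into the replies, one level deeper.
        yield from flatten_comments(comment.get("replies", []), depth + 1)


# Hypothetical record, invented for this example only.
post = {
    "title": "Example post",
    "comments": [
        {
            "author": "a",
            "text": "Nice build!",
            "replies": [
                {"author": "b", "text": "Agreed.", "replies": []},
            ],
        },
        {"author": "c", "text": "Which MCU did you use?", "replies": []},
    ],
}

flat = list(flatten_comments(post["comments"]))
for row in flat:
    # Indent by depth so the thread structure stays visible.
    print("  " * row["depth"] + f'{row["author"]}: {row["text"]}')
```

Keeping the `depth` value lets you rebuild reply context later, for example by pairing each reply with the nearest preceding comment one level up.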