---
pretty_name: Diffusers PR Dataset
configs:
- config_name: issues
  data_files:
  - split: train
    path: issues.parquet
  default: true
- config_name: prs
  data_files:
  - split: train
    path: pull_requests.parquet
- config_name: issue_comments
  data_files:
  - split: train
    path: issue_comments.parquet
- config_name: pr_comments
  data_files:
  - split: train
    path: pr_comments.parquet
- config_name: pr_reviews
  data_files:
  - split: train
    path: reviews.parquet
- config_name: pr_files
  data_files:
  - split: train
    path: pr_files.parquet
- config_name: pr_diffs
  data_files:
  - split: train
    path: pr_diffs.parquet
- config_name: review_comments
  data_files:
  - split: train
    path: review_comments.parquet
- config_name: links
  data_files:
  - split: train
    path: links.parquet
- config_name: events
  data_files:
  - split: train
    path: events.parquet
- config_name: new_contributors
  data_files:
  - split: train
    path: new_contributors.parquet
---
# Diffusers PR Dataset

Normalized snapshots of issues, pull requests, comments, reviews, and linkage data from `huggingface/diffusers`.

Files:
- `issues.parquet`
- `pull_requests.parquet`
- `comments.parquet`
- `issue_comments.parquet` (derived view of issue discussion comments)
- `pr_comments.parquet` (derived view of pull request discussion comments)
- `reviews.parquet`
- `pr_files.parquet`
- `pr_diffs.parquet`
- `review_comments.parquet`
- `links.parquet`
- `events.parquet`
- `new_contributors.parquet`
- `new-contributors-report.json`
- `new-contributors-report.md`
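
Config names and file names do not always match (for example, the `prs` config reads `pull_requests.parquet` and `pr_reviews` reads `reviews.parquet`), so a small lookup can help when working from a local copy. The sketch below copies the mapping from the `configs` block of this card; `comments.parquet` has no config of its own there and is left out. The helper names `parquet_path` and `read_config` are illustrative, not part of any published tooling, and `read_config` assumes pandas with a Parquet engine such as pyarrow is installed.

```python
from pathlib import Path

# Config name -> Parquet file, copied from the `configs` block of this card.
CONFIG_FILES = {
    "issues": "issues.parquet",
    "prs": "pull_requests.parquet",
    "issue_comments": "issue_comments.parquet",
    "pr_comments": "pr_comments.parquet",
    "pr_reviews": "reviews.parquet",
    "pr_files": "pr_files.parquet",
    "pr_diffs": "pr_diffs.parquet",
    "review_comments": "review_comments.parquet",
    "links": "links.parquet",
    "events": "events.parquet",
    "new_contributors": "new_contributors.parquet",
}


def parquet_path(config_name, root="."):
    """Return the Parquet path for a config inside a local copy of the dataset."""
    try:
        return Path(root) / CONFIG_FILES[config_name]
    except KeyError:
        raise ValueError(f"unknown config: {config_name!r}") from None


def read_config(config_name, root="."):
    """Load one config as a DataFrame (hypothetical helper; needs pandas)."""
    import pandas as pd  # imported lazily so the path helper works without pandas

    return pd.read_parquet(parquet_path(config_name, root))
```

With a local download, `read_config("prs", root=...)` would return the pull request table; the `root` directory is whatever location the files were saved to. When loading through the `datasets` library instead, the same config names are passed as the second argument to `load_dataset`.
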

Use:
- duplicate PR and issue analysis
- triage and ranking experiments
- eval set creation
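
As a rough illustration of the duplicate-analysis use case, the sketch below flags pairs of rows whose titles are nearly identical. It is a baseline under stated assumptions, not a method shipped with this dataset: the `number` and `title` fields are assumed column names (this card does not document the schema), the sample rows are made up, and the 0.9 threshold is arbitrary.

```python
from difflib import SequenceMatcher
from itertools import combinations


def duplicate_candidates(rows, threshold=0.9):
    """Return (number_a, number_b, score) for pairs with near-identical titles.

    `rows` is an iterable of dicts with assumed keys "number" and "title".
    Comparison is pairwise, so this is only practical for small samples.
    """
    pairs = []
    for a, b in combinations(list(rows), 2):
        score = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
        if score >= threshold:
            pairs.append((a["number"], b["number"], round(score, 3)))
    return pairs


# Made-up rows for illustration only; not taken from the dataset.
sample_rows = [
    {"number": 1, "title": "Fix typo in README"},
    {"number": 2, "title": "Fix typo in README."},
    {"number": 3, "title": "Add SDXL pipeline"},
]
candidates = duplicate_candidates(sample_rows)
```

Because every pair is compared, the cost grows quadratically with the number of rows; for the full tables, some form of blocking or an embedding index would be a more realistic starting point.
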

Notes:
- latest snapshot: `20260422T233513Z`
- raw data only; no labels or moderation decisions
- PR metadata, file-level patch hunks, and full unified diffs are included
- full file contents for changed files are not included