| ---
|
| license: mit
|
| tags:
|
| - recommendation-system
|
| - collaborative-filtering
|
| - matrix-factorization
|
| - movie-recommendations
|
| - movielens
|
| - machine-learning
|
| library_name: scikit-learn
|
| ---
|
|
|
| # DataSynthis_ML_JobTask
|
|
|
| A movie recommendation system that applies item-based collaborative filtering and matrix factorization to the MovieLens 100k dataset.
|
|
|
| ## Model Description
|
|
|
| This model provides personalized movie recommendations using two well-established algorithms:
|
|
|
| - **Collaborative Filtering (CF)**: Item-based similarity using cosine similarity
|
| - **Matrix Factorization (SVD)**: Singular Value Decomposition for dimensionality reduction
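To make the item-based approach concrete, here is a minimal sketch on a tiny hand-made ratings matrix. The `cosine_item_similarity` and `item_based_scores` helpers, the toy data, and the similarity-weighted average are assumptions for illustration; the repository's own implementation may differ.

```python
import numpy as np

def cosine_item_similarity(ratings):
    """Cosine similarity between the item columns of a user x item matrix."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0                      # avoid dividing by zero for unrated items
    normalized = ratings / norms                 # scale every column to unit length
    return normalized.T @ normalized

def item_based_scores(ratings, user_index):
    """Score each item as a similarity-weighted average of the user's own ratings."""
    sim = cosine_item_similarity(ratings)
    user_ratings = ratings[user_index]
    rated = user_ratings > 0
    weights = sim[:, rated]                      # similarity to the items the user rated
    denom = np.abs(weights).sum(axis=1)
    denom[denom == 0] = 1.0
    scores = weights @ user_ratings[rated] / denom
    scores[rated] = -np.inf                      # never recommend already-rated items
    return scores

# Toy data (hypothetical): 4 users x 4 movies, 0 means "not rated".
toy = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

sim = cosine_item_similarity(toy)
scores = item_based_scores(toy, user_index=0)
best = int(np.argmax(scores))                    # index of the top unseen movie
```

Because each score is a weighted average of ratings the user already gave, it stays on the same 1-5 scale, which is one reason the item-based method is easy to interpret.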
|
|
|
| ## Dataset
|
|
|
| - **MovieLens 100k**: 100,000 ratings from 943 users on 1,682 movies
|
| - **User ID Range**: 1-943
|
| - **Movie Count**: 1,682 unique movies
|
| - **Rating Scale**: 1-5 stars
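The ratings file in MovieLens 100k (`u.data`) is tab-separated with four columns: user ID, item ID, rating, and timestamp. The sketch below shows one way to turn such rows into a user-item matrix with pandas. The sample rows and the column names are illustrative assumptions, not taken from this repository.

```python
import io

import pandas as pd

# A few illustrative rows in the u.data layout (user, item, rating, timestamp).
sample = io.StringIO(
    "1\t50\t5\t874965954\n"
    "1\t181\t4\t874965739\n"
    "2\t50\t3\t888552084\n"
)

ratings = pd.read_csv(
    sample,
    sep="\t",
    names=["user_id", "movie_id", "rating", "timestamp"],
)

# Rows are users, columns are movies, missing ratings become 0.
matrix = ratings.pivot_table(
    index="user_id", columns="movie_id", values="rating", fill_value=0
)
```

On the full dataset the same call would be pointed at the real `u.data` path instead of the in-memory sample, giving a 943 x 1,682 matrix.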
|
|
|
| ## Usage
|
|
|
| ### Python
|
|
|
| ```python
|
| from model import predict
|
|
|
| # Get recommendations using SVD (default)
|
| recommendations = predict(user_id=1, n_recommendations=10, method="svd")
|
|
|
| # Get recommendations using collaborative filtering
|
| recommendations = predict(user_id=1, n_recommendations=10, method="cf")
|
|
|
| print(recommendations)
|
| ```
|
|
|
| ### Parameters
|
|
|
| - **user_id** (int): User ID from 1 to 943 (required)
|
| - **n_recommendations** (int): Number of recommendations to return, from 1 to 20 (default: 10)
|
| - **method** (str): "svd" for matrix factorization or "cf" for collaborative filtering (default: "svd")
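The documented ranges can be checked before any prediction is attempted. The following is a minimal sketch; `validate_request` and `VALID_METHODS` are hypothetical names, and raising `ValueError` is an assumption about how invalid input could be reported.

```python
VALID_METHODS = ("svd", "cf")

def validate_request(user_id, n_recommendations=10, method="svd"):
    """Check the arguments against the documented ranges and return them unchanged."""
    if not isinstance(user_id, int) or not 1 <= user_id <= 943:
        raise ValueError("user_id must be an integer between 1 and 943")
    if not isinstance(n_recommendations, int) or not 1 <= n_recommendations <= 20:
        raise ValueError("n_recommendations must be an integer between 1 and 20")
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}")
    return user_id, n_recommendations, method
```

A caller could run `validate_request(user_id, n_recommendations, method)` first and pass the result on to `predict`, so that an unknown user fails with a clear message.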
|
|
|
| ### Output
|
|
|
| Returns a list of dictionaries with movie recommendations:
|
|
|
| ```json
|
| [
|
| {
|
| "movie_id": 50,
|
| "title": "Star Wars (1977)",
|
| "predicted_rating": 4.5
|
| },
|
| {
|
| "movie_id": 181,
|
| "title": "Return of the Jedi (1983)",
|
| "predicted_rating": 4.3
|
| }
|
| ]
|
| ```
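A list in this shape can be assembled from per-movie scores by sorting and truncating. The sketch below assumes a hypothetical `format_recommendations` helper and made-up scores; only the field names come from the example output above.

```python
import json

def format_recommendations(predicted, titles, n_recommendations=10):
    """Turn {movie_id: score} into the documented list of dictionaries.

    predicted: dict mapping movie_id to a predicted rating
    titles:    dict mapping movie_id to a title string
    """
    ranked = sorted(predicted.items(), key=lambda pair: pair[1], reverse=True)
    return [
        {"movie_id": movie_id, "title": titles[movie_id], "predicted_rating": round(score, 1)}
        for movie_id, score in ranked[:n_recommendations]
    ]

# Made-up scores for three movies, for illustration only.
predicted = {181: 4.3, 1: 3.9, 50: 4.5}
titles = {
    1: "Toy Story (1995)",
    50: "Star Wars (1977)",
    181: "Return of the Jedi (1983)",
}

top_two = format_recommendations(predicted, titles, n_recommendations=2)
print(json.dumps(top_two, indent=2))
```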
|
|
|
| ## Model Performance
|
|
|
| - **SVD Method**: Fast predictions from a low-rank approximation with 20 components
|
| - **Collaborative Filtering**: More interpretable, based on item similarity
|
| - **Cold Start Handling**: Graceful error handling for unknown users
|
|
|
| ## Technical Details
|
|
|
| - **Framework**: Scikit-learn
|
| - **Algorithms**: TruncatedSVD, Cosine Similarity
|
| - **Data Processing**: Pandas for efficient matrix operations
|
| - **Memory Efficient**: Optimized for large-scale recommendation tasks
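As a rough sketch of how `TruncatedSVD` can produce predicted ratings, the example below factorizes a small toy matrix and multiplies the factors back together. The toy data and the choice of 2 components are assumptions made so the example stays small; the model described here uses 20 components on the full matrix, and its exact prediction step may differ.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Toy data (hypothetical): 6 users x 5 movies, 0 means "not rated".
toy = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 1, 0, 0],
    [5, 5, 0, 0, 1],
    [1, 0, 5, 4, 5],
    [0, 1, 4, 5, 4],
    [1, 0, 5, 5, 0],
], dtype=float)

svd = TruncatedSVD(n_components=2, random_state=0)
user_factors = svd.fit_transform(toy)        # shape: (n_users, 2)
approx = user_factors @ svd.components_      # low-rank reconstruction, shape: (n_users, n_items)

# Rank the movies user 0 has not rated by their reconstructed score.
user_index = 0
unrated = np.flatnonzero(toy[user_index] == 0)
ranked = unrated[np.argsort(approx[user_index, unrated])[::-1]]
```

The reconstruction fills the zero entries with values implied by the latent factors, and those filled-in values serve as the predicted ratings.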
|
|
|
| ## Installation
|
|
|
| ```bash
|
| pip install pandas numpy scikit-learn
|
| ```
|
|
|
| ## Training
|
|
|
| The model is pre-trained on the MovieLens 100k dataset. To retrain:
|
|
|
| ```python
|
| from model import MovieRecommender
|
|
|
| model = MovieRecommender()
|
| model.load_data()
|
| model.train()
|
| model.save_model("movie_recommender.pkl")
|
| ```
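The `.pkl` extension suggests the model is stored with Python's `pickle` module, although that is an assumption. The sketch below shows a save and load round trip for a stand-in `state` dictionary; its keys are hypothetical and do not describe the real file's contents.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model's state (hypothetical keys and values).
state = {
    "n_components": 20,
    "movie_ids": [1, 50, 181],
    "item_similarity": [[1.0, 0.2, 0.1], [0.2, 1.0, 0.6], [0.1, 0.6, 1.0]],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "movie_recommender.pkl"
    with path.open("wb") as fh:
        pickle.dump(state, fh)               # write the state to disk
    with path.open("rb") as fh:
        restored = pickle.load(fh)           # read it back
```

Pickle files can execute arbitrary code when loaded, so only load model files from a source you trust.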
|
|
|
| ## Citation
|
|
|
| ```bibtex
|
| @misc{datasynthis_ml_jobtask,
|
| title={DataSynthis ML JobTask: Movie Recommendation System},
|
| author={tasdid25},
|
| year={2025},
|
| url={https://huggingface.co/tasdid25/DataSynthis_ML_JobTask}
|
| }
|
| ```
|
|
|
| ## License
|
|
|
| MIT License. See the LICENSE file for details. |