---
license: mit
tags:
- tensor-compression
- code-embeddings
- factorized
- tltorch
base_model: nomic-ai/CodeRankEmbed
---

# CodeRankEmbed-compressed

This is a tensor-compressed version of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed), produced by factorizing its weight matrices with TLTorch (TensorLy-Torch).

## Compression Details

- **Compression method**: Tensor factorization with TLTorch (TensorLy-Torch)
- **Factorization type**: CP (CANDECOMP/PARAFAC)
- **Rank**: 4
- **Number of factorized layers**: 60
- **Original model size**: 136.73M parameters
- **Compressed model size**: 23.62M parameters
- **Compression ratio**: 5.79x (82.7% reduction)

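For an order-2 tensor (an ordinary weight matrix), a rank-4 CP decomposition coincides with a rank-4 matrix factorization `W ≈ A @ B.T`, which is where the parameter savings come from. A minimal NumPy sketch of the idea (illustrative only, not the TLTorch implementation; it uses truncated SVD as the rank-4 approximation):

```python
import numpy as np

# For a matrix, a rank-r CP decomposition is equivalent to a rank-r
# matrix factorization W ~= A @ B.T. Truncated SVD gives the best
# such approximation in the Frobenius norm.
rng = np.random.default_rng(0)
m, n, rank = 768, 768, 4

W = rng.standard_normal((m, n))
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * s[:rank]          # shape (m, rank)
B = Vt[:rank, :].T                  # shape (n, rank)

original_params = m * n             # 589,824
factored_params = rank * (m + n)    # 6,144
print(f"compression: {original_params / factored_params:.1f}x")
# -> compression: 96.0x
```

The per-layer ratio depends only on the layer's dimensions and the rank; the 5.79x figure above is the aggregate over the whole model, since not every parameter lives in a factorized layer.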
## Usage

To use this compressed model, install the required dependencies and use the custom loading script:

```bash
pip install torch tensorly tltorch sentence-transformers
```

### Loading the model

```python
import json

import torch
import tensorly as tl
from sentence_transformers import SentenceTransformer
from tltorch.factorized_layers import FactorizedLinear, FactorizedEmbedding

# Set the TensorLy backend to PyTorch
tl.set_backend("pytorch")

# Load the original model structure
model = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)

# Load the factorization metadata
with open("factorization_info.json", "r") as f:
    factorized_info = json.load(f)

# Reconstruct factorized layers (see load_compressed_model.py for full implementation)
# ... reconstruction code ...

# Load the compressed weights; strict=False because the factorized
# modules replace the original dense layers in the state dict
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"], strict=False)

# Use the model
embeddings = model.encode(["def hello_world():\n    print('Hello, World!')"])
```
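
The full reconstruction logic lives in `load_compressed_model.py` and is not reproduced here. As a rough sketch of the idea, assuming `factorization_info.json` maps layer names to their rank, each dense `nn.Linear` is swapped for a rank-constrained module before the checkpoint is loaded. The class below is a simplified stand-in for TLTorch's `FactorizedLinear`, not the actual implementation:

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Simplified stand-in for a CP-factorized linear layer:
    y = up(down(x)), with down: (in -> rank) and up: (rank -> out)."""
    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        self.down = nn.Linear(in_features, rank, bias=False)
        self.up = nn.Linear(rank, out_features, bias=True)

    def forward(self, x):
        return self.up(self.down(x))

def replace_linear(module: nn.Module, name: str, rank: int) -> None:
    """Swap the child layer `name` of `module` for a low-rank version."""
    old = getattr(module, name)
    setattr(module, name, LowRankLinear(old.in_features, old.out_features, rank))

# Hypothetical usage on a toy module:
block = nn.Sequential()
block.add_module("dense", nn.Linear(64, 64))
replace_linear(block, "dense", rank=4)

x = torch.randn(2, 64)
assert block(x).shape == (2, 64)
```

After the swap, loading the checkpoint with `strict=False` fills in the factor weights while leaving any untouched layers on their original parameters.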

## Model Files

- `pytorch_model.bin`: Compressed model weights
- `factorization_info.json`: Metadata about factorized layers
- `tokenizer.json`, `vocab.txt`: Tokenizer files
- `modules.json`: SentenceTransformer modules configuration


## Performance

The compressed model retains most of the original's embedding quality while being significantly smaller:

- Similar embedding quality (average cosine similarity > 0.9 with the original model)
- 5.79x smaller model size
- Faster loading and inference on CPU

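The cosine-similarity figure can be checked with a row-wise comparison of embeddings from the two models. A sketch of the metric (the random matrices stand in for actual `model.encode(...)` outputs):

```python
import numpy as np

def avg_cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Mean cosine similarity between corresponding rows of two
    embedding matrices of shape (n_texts, dim)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a * b, axis=1)))

# In practice `orig` and `comp` would be embeddings of the same texts
# from the original and compressed models; random data is a stand-in.
rng = np.random.default_rng(0)
orig = rng.standard_normal((8, 768))
comp = orig + 0.05 * rng.standard_normal((8, 768))

print(f"avg cosine similarity: {avg_cosine_similarity(orig, comp):.3f}")
```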

## Citation

If you use this compressed model, please cite the original CodeRankEmbed model:

```bibtex
@misc{nomic2024coderankembed,
  title={CodeRankEmbed},
  author={Nomic AI},
  year={2024},
  url={https://huggingface.co/nomic-ai/CodeRankEmbed}
}
```

## License

This compressed model inherits the license of the original model. Please check the original model's license for usage terms.