Tags: Feature Extraction · Transformers · Safetensors · sentence-transformers · Chinese · English · mteb · custom_code · Eval Results (legacy)
Instructions to use openbmb/MiniCPM-Embedding with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openbmb/MiniCPM-Embedding with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="openbmb/MiniCPM-Embedding", trust_remote_code=True)
```

```python
# Load the model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("openbmb/MiniCPM-Embedding", trust_remote_code=True, dtype="auto")
```

- sentence-transformers
How to use openbmb/MiniCPM-Embedding with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("openbmb/MiniCPM-Embedding", trust_remote_code=True)

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
```

- Notebooks
- Google Colab
- Kaggle
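The `model.similarity` call in the snippet above defaults to cosine similarity between the encoded sentences. A minimal numpy sketch of the same computation, using hypothetical embedding vectors in place of real model outputs:

```python
import numpy as np

# Hypothetical embeddings standing in for model.encode(sentences) output.
embeddings = np.array([[0.6, 0.8], [0.8, 0.6], [-1.0, 0.0]])

# model.similarity defaults to cosine similarity; the numpy equivalent is a
# dot product of L2-normalized rows.
normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
similarities = normed @ normed.T
print(similarities.shape)  # (3, 3)
```

Each sentence is perfectly similar to itself, so the diagonal of the resulting matrix is all ones.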
Patch Sentence Transformers integration
#2
by tomaarsen HF Staff - opened
Hello!
Congratulations on your release! Well done!
Pull Request overview
- Patch the Sentence Transformers integration, in particular:
  - Rename "1_Pool" to "1_Pooling": the latter is referenced in `modules.json` and will be used to load the pooling configuration.
  - Update the pooling configuration to also include the prompt in the pooling. This previously resulted in a slight difference between `transformers` and `sentence-transformers`.
- Simplify the code snippet:
  - `max_seq_length` is now defined in `sentence_bert_config.json`.
  - A `Normalize` module is added in `modules.json`, which means that all outputs will be normalized even without specifying `normalize_embeddings=True`.
- Add instructions to the `prompts` dictionary in `config_sentence_transformers.json`. This allows for `model.encode(my_texts, prompt_name="nq")`.
- Add a `sentence-transformers` tag, making the model easier to find when searching for embedding models under https://huggingface.co/models?library=sentence-transformers&sort=trending
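The `prompts` mechanism works by prefixing each input with the named prompt string before encoding. A rough sketch of that behavior, where the prompt string is a hypothetical placeholder and not the one actually shipped in `config_sentence_transformers.json`:

```python
# Hypothetical prompts mapping; the real strings live in
# config_sentence_transformers.json and are loaded automatically.
prompts = {"nq": "Represent this query for retrieval: "}

def apply_prompt(texts, prompt_name, prompts):
    # Mirrors what model.encode(texts, prompt_name="nq") does internally:
    # prepend the named prompt to every input before tokenization.
    prefix = prompts[prompt_name]
    return [prefix + t for t in texts]

queries = apply_prompt(["who wrote hamlet"], "nq", prompts)
print(queries[0])
```

Because the prompt is included in the pooling configuration (per the fix above), the prompt tokens also participate in pooling, keeping `sentence-transformers` outputs aligned with the `transformers` snippet.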
Details
I ran the updated script in the README, and it gave me [[0.35365450382232666, 0.18592746555805206]], which is the same as what I get when running the `transformers` snippet.
- Tom Aarsen
tomaarsen changed pull request status to open
Kaguya-19 changed pull request status to merged
Thank you!
Thank you for your helpful work!