Tags: Feature Extraction · Transformers · Safetensors · sentence-transformers · Chinese · English · mteb · custom_code · Eval Results (legacy)
Instructions to use openbmb/MiniCPM-Embedding with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openbmb/MiniCPM-Embedding with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="openbmb/MiniCPM-Embedding", trust_remote_code=True)
```

```python
# Load the model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("openbmb/MiniCPM-Embedding", trust_remote_code=True, dtype="auto")
```

- sentence-transformers
How to use openbmb/MiniCPM-Embedding with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("openbmb/MiniCPM-Embedding", trust_remote_code=True)

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
```

- Notebooks
- Google Colab
- Kaggle
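The `model.similarity` call in the snippet above defaults to cosine similarity between the encoded sentences. A minimal numpy sketch of the same computation, using hypothetical embedding vectors in place of real model outputs:

```python
import numpy as np

# Hypothetical embeddings standing in for model.encode(sentences) output.
embeddings = np.array([[0.6, 0.8], [0.8, 0.6], [-1.0, 0.0]])

# model.similarity defaults to cosine similarity; the numpy equivalent is a
# dot product of L2-normalized rows.
normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
similarities = normed @ normed.T
print(similarities.shape)  # (3, 3)
```

Each sentence is perfectly similar to itself, so the diagonal of the resulting matrix is all ones.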
Patch Sentence Transformers integration
#2
by tomaarsen HF Staff - opened
Hello!
Congratulations on your release! Well done!
Pull Request overview
- Patch the Sentence Transformers integration, in particular:
  - Rename "1_Pool" to "1_Pooling": the latter is referenced in `modules.json` and will be used to load the pooling configuration.
  - Update the pooling configuration to also include the prompt in the pooling. This previously resulted in a slight difference between `transformers` and `sentence-transformers`.
- Simplify the code snippet:
  - `max_seq_length` is now defined in `sentence_bert_config.json`.
  - A `Normalize` module is added in `modules.json`, which means that all outputs will be normalized even without specifying `normalize_embeddings=True`.
- Add instructions to the `prompts` dictionary in `config_sentence_transformers.json`. This allows for `model.encode(my_texts, prompt_name="nq")`.
- Add a `sentence-transformers` tag, making the model easier to find when searching for embedding models under https://huggingface.co/models?library=sentence-transformers&sort=trending
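The `prompts` mechanism works by prefixing each input with the named prompt string before encoding. A rough sketch of that behavior, where the prompt string is a hypothetical placeholder and not the one actually shipped in `config_sentence_transformers.json`:

```python
# Hypothetical prompts mapping; the real strings live in
# config_sentence_transformers.json and are loaded automatically.
prompts = {"nq": "Represent this query for retrieval: "}

def apply_prompt(texts, prompt_name, prompts):
    # Mirrors what model.encode(texts, prompt_name="nq") does internally:
    # prepend the named prompt to every input before tokenization.
    prefix = prompts[prompt_name]
    return [prefix + t for t in texts]

queries = apply_prompt(["who wrote hamlet"], "nq", prompts)
print(queries[0])
```

Because the prompt is included in the pooling configuration (per the fix above), the prompt tokens also participate in pooling, keeping `sentence-transformers` outputs aligned with the `transformers` snippet.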
Details
I ran the updated script in the README, and it gave me [[0.35365450382232666, 0.18592746555805206]], which is the same as what I get when running the `transformers` snippet.
- Tom Aarsen
tomaarsen changed pull request status to open
Kaguya-19 changed pull request status to merged
Thank you!
Thank you for your helpful work!