How to use intfloat/e5-large-v2 with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-large-v2")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day",
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```
VORTEXRAG: 7-Layer RAG — Eliminates Semantic Drift & Context Poisoning (EM 74.8, +13.6 vs Naive RAG)
Hi all! Sharing VORTEXRAG — a new 7-layer RAG framework specifically relevant to this model.
The problem: Cosine similarity retrieval cannot distinguish causally relevant chunks from topically similar ones. The result: semantic drift (wrong chunks in context) and context window poisoning (irrelevant chunks drowning the right one).
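To make the failure mode concrete, here is a minimal sketch of plain cosine-similarity top-k retrieval. It assumes hand-picked 3-dimensional toy vectors in place of real e5-large-v2 embeddings; the chunk names and the `top_k` helper are illustrative and not part of VORTEXRAG.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, chunks, k=2):
    """Rank chunks by cosine similarity to the query; return the k best (name, score) pairs."""
    scored = [(name, cosine(query, vec)) for name, vec in chunks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]

# Toy vectors (assumed for illustration, not real embeddings).
query = [1.0, 0.2, 0.0]
chunks = {
    "topically_similar": [0.9, 0.3, 0.1],
    "causally_relevant": [0.6, 0.1, 0.8],
    "off_topic": [0.0, 0.1, 1.0],
}

ranked = top_k(query, chunks, k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The ranking depends only on the angle between vectors, so in this toy setup the chunk that merely shares the query's topic outranks the one labeled causally relevant.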
VORTEXRAG encodes text as an 864-dimensional tri-vector (semantic + syntactic + causal), then runs a 7-layer pipeline. Three of the layers:
- SDC filters chunks with SDS = 1−tanh(‖D‖/τ) ≥ 0.72 (causal drift gate)
- CPG purges context until ESR ≥ 3.5 (provably optimal greedy algorithm)
- FV runs a post-generation faithfulness check: ΔR = 1 − (ROUGE-L × NLI) ≤ 0.15
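The two closed-form thresholds above (SDS and ΔR) can be sketched directly. This is a rough illustration, not the VORTEXRAG implementation, and it rests on several assumptions: ‖D‖ is taken as the Euclidean norm of a drift vector D, ROUGE-L is computed as an F1 score over whitespace tokens, and the NLI term is an entailment score in [0, 1] supplied by some external model. All function names are hypothetical. The CPG step is not sketched because ESR is not defined in this post.

```python
import math

def sds(drift, tau):
    """SDS = 1 - tanh(||D|| / tau); ||D|| is assumed to be the Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in drift))
    return 1.0 - math.tanh(norm / tau)

def passes_drift_gate(drift, tau, threshold=0.72):
    """Keep a chunk only if its SDS meets the stated 0.72 threshold."""
    return sds(drift, tau) >= threshold

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(candidate, reference):
    """ROUGE-L as an F1 score over whitespace tokens (the variant is an assumption)."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

def delta_r(answer, evidence, nli_score):
    """Delta R = 1 - ROUGE-L * NLI; nli_score is assumed to lie in [0, 1]."""
    return 1.0 - rouge_l_f1(answer, evidence) * nli_score

def is_faithful(answer, evidence, nli_score, threshold=0.15):
    """Accept an answer only if Delta R stays at or below the stated 0.15 threshold."""
    return delta_r(answer, evidence, nli_score) <= threshold
```

Under these assumptions, SDS ≥ 0.72 is equivalent to ‖D‖/τ ≤ atanh(0.28) ≈ 0.288, so the gate only admits chunks whose drift norm is under roughly 29% of τ.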
Results vs baselines:
| System | EM | F1 | Faithfulness |
|---|---|---|---|
| VORTEXRAG | 74.8 | 82.6 | 0.94 |
| Self-RAG | 68.4 | 77.1 | 0.81 |
| Naive RAG | 61.2 | 69.4 | 0.71 |
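The headline gain can be checked from the table with a few lines of arithmetic. This sketch only restates the reported numbers; the `gain_over` helper is illustrative.

```python
# Scores copied from the table above.
results = {
    "VORTEXRAG": {"EM": 74.8, "F1": 82.6, "Faithfulness": 0.94},
    "Self-RAG": {"EM": 68.4, "F1": 77.1, "Faithfulness": 0.81},
    "Naive RAG": {"EM": 61.2, "F1": 69.4, "Faithfulness": 0.71},
}

def gain_over(baseline, system="VORTEXRAG"):
    """Absolute difference per metric between `system` and `baseline`, rounded to 2 places."""
    return {
        metric: round(results[system][metric] - results[baseline][metric], 2)
        for metric in results[system]
    }

for baseline in ("Self-RAG", "Naive RAG"):
    print(baseline, gain_over(baseline))
```

The EM difference against Naive RAG works out to 13.6, matching the figure in the title.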
If you use this model in a RAG pipeline, the results above suggest VORTEXRAG may reduce hallucinations: faithfulness rises from 0.71 (Naive RAG) and 0.81 (Self-RAG) to 0.94.
📄 Paper: https://doi.org/10.5281/zenodo.20579702
💻 Code (229 tests, MIT): https://github.com/vignesh2027/VORTEXRAG
🚀 Demo: https://huggingface.co/spaces/vigneshwar234/VORTEXRAG