How to use QuantaSparkLabs/ApexRetriever-Pro with sentence-transformers:
from sentence_transformers import CrossEncoder

model = CrossEncoder("QuantaSparkLabs/ApexRetriever-Pro")
query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
]
scores = model.predict([(query, passage) for passage in passages])
print(scores)
ApexRetriever-Pro
A 5-stage hybrid retrieval system that combines sparse retrieval, dense semantic search, diversity optimization, reranking, and generative refinement.
Built for:
- RAG pipelines
- AI agents
- semantic search
- document QA
- memory systems
- knowledge retrieval
- research assistants
Architecture
ApexRetriever-Pro uses a multi-stage retrieval pipeline:
Stage 1: BM25 Sparse Retrieval
Fast keyword-based retrieval using BM25.
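To make the keyword scoring concrete, here is a minimal pure-Python sketch of Okapi BM25. It is not the code in pipeline.py (the install list suggests the pipeline relies on the rank-bm25 package). The whitespace tokenizer, the `k1` and `b` values, and the function names are assumptions for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the real pipeline may differ.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents need more matches to score well.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs.",
]
scores = bm25_scores("who created python", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the first document shares terms with the query, so it is the only one with a non-zero score.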
Stage 2: Dense Semantic Retrieval
Semantic vector search powered by:
BAAI/bge-small-en-v1.5
Uses FAISS for high-speed similarity search.
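The sketch below illustrates what an exact inner-product search over normalized embeddings computes. `cosine_top_k` is a brute-force stand-in that runs without extra packages, and `faiss_demo` (defined but not called here) shows how the same lookup might be done with `SentenceTransformer` and `faiss.IndexFlatIP`. The flat index type and all function names are assumptions; the README does not say which FAISS index the pipeline builds.

```python
import math


def normalize(vec):
    # Scale to unit length so the inner product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cosine_top_k(query_vec, doc_vecs, k):
    """Brute-force top-k by cosine similarity, as (index, score) pairs."""
    q = normalize(query_vec)
    sims = [sum(a * b for a, b in zip(q, normalize(d))) for d in doc_vecs]
    order = sorted(range(len(doc_vecs)), key=lambda i: sims[i], reverse=True)
    return [(i, sims[i]) for i in order[:k]]


def faiss_demo(docs, query, k=3):
    # Assumption: a flat (exact) inner-product index over normalized embeddings.
    # Imports are local so the rest of the sketch runs without these packages.
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("BAAI/bge-small-en-v1.5")
    doc_emb = model.encode(docs, normalize_embeddings=True).astype("float32")
    index = faiss.IndexFlatIP(doc_emb.shape[1])
    index.add(doc_emb)
    query_emb = model.encode([query], normalize_embeddings=True).astype("float32")
    scores, ids = index.search(query_emb, min(k, len(docs)))
    return [(docs[i], float(s)) for i, s in zip(ids[0], scores[0])]


# Toy 2-D vectors in place of real embeddings.
hits = cosine_top_k([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]], k=2)
print(hits)
```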
Stage 3: MMR Diversity Filtering
Maximal Marginal Relevance (MMR) improves result diversity by penalizing candidates that are too similar to results already selected, which reduces near-duplicate results.
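A minimal sketch of greedy MMR selection follows. Each step picks the candidate that maximizes `lam * relevance - (1 - lam) * redundancy`, where redundancy is the highest similarity to anything already selected. The `lam` default, the cosine similarity, and the toy vectors are assumptions; the pipeline's actual settings are not documented here.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr(query_vec, doc_vecs, k, lam=0.5):
    """Return indices of up to k documents chosen greedily by MMR."""
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:
        best, best_score = None, float("-inf")
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Highest similarity to any already-selected document (0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 0.0]
# Documents 0 and 1 are exact duplicates; document 2 is less relevant but different.
vecs = [[1.0, 0.1], [1.0, 0.1], [0.6, -0.8]]
print(mmr(query, vecs, k=2))           # [0, 2]: the duplicate is skipped
print(mmr(query, vecs, k=2, lam=1.0))  # [0, 1]: pure relevance keeps the duplicate
```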
Stage 4: CrossEncoder Reranking
High-quality neural reranking using:
cross-encoder/ms-marco-MiniLM-L-6-v2
The cross-encoder scores each query-passage pair jointly, which typically improves precision among the top-ranked results.
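The reranking step can be sketched as a sort over pairwise scores. `rerank` takes any scorer that maps a list of (query, passage) pairs to scores; `load_cross_encoder_scorer` (defined but not called here) wraps `CrossEncoder.predict` from sentence-transformers, and `overlap_scorer` is a hypothetical offline stand-in. All three names are illustrative and not taken from pipeline.py.

```python
def rerank(query, passages, score_pairs, top_k=3):
    """Sort passages by the score assigned to each (query, passage) pair."""
    pairs = [(query, p) for p in passages]
    scores = score_pairs(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def load_cross_encoder_scorer():
    # Downloads the model on first use, so the import is kept local.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    return lambda pairs: [float(s) for s in model.predict(pairs)]


def overlap_scorer(pairs):
    # Hypothetical stand-in: counts shared lowercase words so the sketch runs offline.
    return [len(set(q.lower().split()) & set(p.lower().split())) for q, p in pairs]


passages = [
    "Paris is the capital of France.",
    "Python was created by Guido van Rossum.",
]
top = rerank("who created python", passages, overlap_scorer, top_k=1)
print(top)
```

To use the real model, pass `load_cross_encoder_scorer()` in place of `overlap_scorer`.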
Stage 5: FLAN-T5 Refinement
Final answer refinement using:
google/flan-t5-base
Generates concise refined outputs from retrieved context.
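As a rough sketch of the refinement step, the code below builds a prompt from the retrieved passages and, in `refine` (defined but not called here), generates an answer with `google/flan-t5-base` through the transformers seq2seq API. The prompt wording, `max_new_tokens` value, and function names are assumptions; the prompt used by pipeline.py is not documented in this README.

```python
def build_prompt(question, passages):
    # Assumption: a simple instruction-style prompt with one passage per line.
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


def refine(question, passages, max_new_tokens=64):
    # Downloads google/flan-t5-base on first use, so the import is kept local.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
    inputs = tokenizer(
        build_prompt(question, passages), return_tensors="pt", truncation=True
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = build_prompt(
    "Who created Python?", ["Python was created by Guido van Rossum."]
)
print(prompt)
```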
Features
- Hybrid sparse+dense retrieval
- FAISS accelerated search
- MMR diversity optimization
- Neural reranking
- Generative refinement
- GPU acceleration
- Plug-and-play pipeline
- Lightweight deployment
- Kaggle compatible
- Hugging Face compatible
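This README does not say how the BM25 and dense candidate lists are merged for hybrid retrieval. One common option, shown here purely as an assumption, is reciprocal rank fusion (RRF), which sums `1 / (k + rank)` for each document across the ranked lists. The function name, the `k=60` constant, and the example rankings are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


bm25_ranking = [2, 0, 1]   # document ids, best first, from the sparse stage
dense_ranking = [0, 3, 2]  # document ids, best first, from the dense stage
fused_ranking = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused_ranking)
```

Documents that rank well in both lists (here ids 0 and 2) rise above documents that appear in only one.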
Repository Structure
ApexRetriever-Pro/
│
├── bi_encoder/
├── reranker/
├── flan_t5/
├── pipeline.py
└── README.md
Installation
pip install -U \
sentence-transformers \
transformers \
faiss-cpu \
rank-bm25 \
torch
Quick Start
from pipeline import ApexRetrieverPro

retriever = ApexRetrieverPro(model_dir=".")

# Example documents
docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs."
]

# Build index
retriever.index_documents(docs)

# Retrieve
results = retriever.retrieve(
    "Who created Python?",
    top_k=3
)
print(results)
Example Output
[
'Python was created by Guido van Rossum.'
]
Use Cases
- Retrieval-Augmented Generation (RAG)
- AI chatbots
- Local document search
- Agent memory systems
- Knowledge bases
- Research copilots
- Semantic indexing
- QA systems
- Enterprise search
Performance Notes
Recommended:
- CUDA GPU
- 16GB+ RAM
- Python 3.10+
Works on:
- Kaggle
- Colab
- Local GPU systems
- Linux
- Windows
Model Components
| Component | Model |
|---|---|
| Dense Encoder | BAAI/bge-small-en-v1.5 |
| Reranker | cross-encoder/ms-marco-MiniLM-L-6-v2 |
| Refiner | google/flan-t5-base |
| Vector Engine | FAISS |
| Sparse Search | BM25 |
License
Apache 2.0
QuantaSparkLabs