Enhancing Retrieval for ESGLLM via ESG-CID -- A Disclosure Content Index Finetuning Dataset for Mapping GRI and ESRS Paper • 2503.10674 • Published Mar 10, 2025 • 3
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published about 1 month ago • 119
Article 🛡️ Nemotron PII: Synthesized Data for Privacy-Preserving AI nvidia • Oct 28, 2025 • 35
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published Dec 18, 2025 • 90
LLM-JEPA: Large Language Models Meet Joint Embedding Predictive Architectures Paper • 2509.14252 • Published Sep 11, 2025 • 9
JEPA-Reasoner: Decoupling Latent Reasoning from Token Generation Paper • 2512.19171 • Published Dec 22, 2025 • 4
Article mmBERT: ModernBERT goes Multilingual mmarone, orionweller, will-fleshman, eugene-yang, dlawrie, vandurme • Sep 9, 2025 • 146
Reinforcement Learning for Self-Improving Agent with Skill Library Paper • 2512.17102 • Published Dec 18, 2025 • 42
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 • Explore synthetic data experiments on an interactive bookshelf • 244
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12, 2024 • 73