AI & ML interests
Building secure, reliable, long-lived AI systems, with a focus on safety, reasoning, and developer tooling.
Neuralchemy
AI Security · Autonomous Systems · LLM Safety
Independent research lab building open datasets, models, and frameworks for LLM security, autonomous evaluation, and multi-agent reasoning systems.
Published Research
Paper 1 — AI In The Loop (AITL)
AI In The Loop: A Systems Taxonomy for Closed-Loop Autonomous Evaluation
Sanskar Jajoo, Neuralchemy Labs, 2026
Establishes a formal taxonomy for autonomous AI evaluation systems, defining the layered architecture (Coder, Reviewer, Meta-Controller) that enables closed-loop ML engineering without human intervention.
Read on Zenodo | Code on GitHub
Paper 2 — The Autonomous Sunk-Cost Fallacy
The Autonomous Sunk-Cost Fallacy: Stopping Failures and Meta-Reasoning in LLMs Deployed within AEOS
Sanskar Jajoo, Neuralchemy Labs, 2026
Reports that LLM agents exhibit a computational analog of the human sunk-cost fallacy: they keep investing compute in failing strategies rather than stopping. Introduces the AEOS (Autonomous Empirical Optimization System) framework and reports that dual-agent architectures with asymmetric reviewer-coder roles eliminate this failure mode.
Read on Zenodo | Code on GitHub
Datasets
Prompt Injection Dataset
Curated samples for prompt injection detection, covering real-world attack scenarios.
neuralchemy/prompt-injection-dataset
Live Demo
Try our prompt injection classifiers directly in the browser:
Prompt Injection DeBERTa Space
Research Frameworks
AEOS — Autonomous Empirical Optimization System
A multi-agent framework in which LLMs autonomously write, evaluate, and iterate on ML models. AEOS implements a Reviewer-Coder architecture: a critic agent running on different model weights oversees a coding agent and decides when a failing strategy should be abandoned, which eliminates the computational sunk-cost fallacy.
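The Reviewer-Coder split can be illustrated with a minimal Python sketch. This is not the AEOS implementation: `Attempt`, `patience_reviewer`, and `run_loop` are hypothetical names, both agents are plain callables standing in for LLM calls, and the patience-based stopping rule is an assumption chosen only to show the idea that the reviewer, not the coder, decides when to stop investing in a strategy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Attempt:
    strategy: str
    score: float


# Hypothetical signatures: in a real system both would wrap LLM calls.
Coder = Callable[[str, List[Attempt]], Attempt]
Reviewer = Callable[[List[Attempt]], bool]  # True means "abandon this strategy"


def patience_reviewer(patience: int = 2) -> Reviewer:
    """Flag a strategy once its last `patience` attempts fail to beat
    the best score recorded before them (an assumed stopping rule)."""

    def review(history: List[Attempt]) -> bool:
        if len(history) <= patience:
            return False  # not enough evidence yet
        best_before = max(a.score for a in history[:-patience])
        return all(a.score <= best_before for a in history[-patience:])

    return review


def run_loop(coder: Coder, reviewer: Reviewer,
             strategies: List[str], budget: int) -> List[Attempt]:
    """Spend at most `budget` attempts; switch strategy whenever the
    reviewer says so, and stop early when no strategies are left.
    Assumes `strategies` is non-empty."""
    remaining = list(strategies)
    current = remaining.pop(0)
    history: List[Attempt] = []       # attempts for the current strategy
    all_attempts: List[Attempt] = []  # full log across strategies
    for _ in range(budget):
        attempt = coder(current, history)
        history.append(attempt)
        all_attempts.append(attempt)
        if reviewer(history):
            if not remaining:
                break  # nothing left to try: stop instead of burning compute
            current = remaining.pop(0)
            history = []
    return all_attempts
```

Keeping the stop decision in a separate callable means the coder never judges its own progress, which is the asymmetry the description above relies on.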
Complete Model Inventory
Our Hugging Face Hub organization currently hosts the five 5-Dimensional Threat Matrix specialists, along with our legacy binary and DeBERTa baselines.
| # | Repository | Type | Task |
|---|---|---|---|
| 1 | distilbert-specialist-intent-threat-matrix | DistilBERT | 5D Specialist: Intent |
| 2 | distilbert-specialist-technique-threat-matrix | DistilBERT | 5D Specialist: Technique |
| 3 | distilbert-specialist-surface-threat-matrix | DistilBERT | 5D Specialist: Attack Surface |
| 4 | distilbert-specialist-severity-threat-matrix | DistilBERT | 5D Specialist: Severity |
| 5 | distilbert-specialist-binary-threat-matrix | DistilBERT | 5D Specialist: Binary |
| 6 | distilbert-binary-threat-matrix | DistilBERT | Legacy Binary Classifier |
| 7 | distilbert-base-threat-matrix | DistilBERT | Base Model |
| 8 | prompt-injection-deberta | DeBERTa | Injection detection |
| 9 | prompt-injection-detector | Classical | Legacy detector |
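As a rough sketch of how the five specialists in the table might be combined into a single report, the Python below runs one classifier per dimension and collects the top label and score. Several things here are assumptions rather than documented behavior: the `neuralchemy/` namespace on the repository IDs, the helper names `load_specialist` and `threat_report`, and the idea that each repository loads through the standard `transformers` text-classification pipeline. Label names depend on each model's own configuration and are not assumed.

```python
from typing import Callable, Dict, Tuple

# A classifier maps text to (label, score).
Classifier = Callable[[str], Tuple[str, float]]

# Repository names come from the table above; the "neuralchemy/"
# namespace is an assumption.
SPECIALISTS: Dict[str, str] = {
    "intent": "neuralchemy/distilbert-specialist-intent-threat-matrix",
    "technique": "neuralchemy/distilbert-specialist-technique-threat-matrix",
    "surface": "neuralchemy/distilbert-specialist-surface-threat-matrix",
    "severity": "neuralchemy/distilbert-specialist-severity-threat-matrix",
    "binary": "neuralchemy/distilbert-specialist-binary-threat-matrix",
}


def load_specialist(repo_id: str) -> Classifier:
    """Wrap a Hub model as a Classifier (downloads the model on first use)."""
    # Imported lazily so the rest of the module works without transformers.
    from transformers import pipeline

    clf = pipeline("text-classification", model=repo_id)

    def classify(text: str) -> Tuple[str, float]:
        top = clf(text)[0]  # the pipeline returns a list of {"label", "score"}
        return top["label"], float(top["score"])

    return classify


def threat_report(text: str,
                  classifiers: Dict[str, Classifier]) -> Dict[str, Tuple[str, float]]:
    """Run every dimension's classifier on the same text."""
    return {dim: classify(text) for dim, classify in classifiers.items()}
```

With network access and `transformers` installed, `threat_report(prompt, {dim: load_specialist(repo) for dim, repo in SPECIALISTS.items()})` would query all five models; for offline experiments, any callable with the same signature can stand in for a specialist.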
Neuralchemy
Transforming AI safety through open research, one experiment at a time.
neuralchemy.in | github.com/m4vic | Contact via GitHub or neuralchemy.in