T AKHIL KUMAR REDDY PRO

akhiilll

AI & ML interests

None yet

Recent Activity

reacted to their post with 🔥 1 day ago

Just shipped ClaimSense Adjudication Gym at OpenEnv Hackathon 2026 (Scaler India).

An OpenEnv RL environment for enterprise insurance claims adjudication, the monthly “tool-heavy” workflow real adjusters do: pull policy + claim history, run fraud checks, verify purchase/transactions, then approve / deny / escalate under partial observability with long-horizon credit assignment.

Trained Qwen/Qwen2.5-1.5B-Instruct with:
Rollout evaluation on HF Jobs (A10G) and a random baseline for comparison
Real GRPO weight updates (TRL GRPOTrainer) with LoRA adapters and two independent reward functions (format + env replay)

Headline training evidence:
GRPO run: 80 steps, 640 rollouts, KL rises ~0 → ~0.06 (real weight updates), completion length shrinks (~25 → ~10).
Plots + logs are committed in the Space under runs/.

Live demo + repo + writeup linked below.

🔗 Env (Space URL): https://huggingface.co/spaces/akhiilll/claims-env
🧪 Notebook: https://huggingface.co/spaces/akhiilll/claims-env/blob/main/training/InsureClaim_Training_Colab.ipynb
📝 Blog: docs/HF_MINI_BLOG.md in the Space
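The post mentions two independent reward functions (format + env replay) but does not show them. As a rough illustration only, here is a minimal sketch of what such a pair could look like. Everything in it is an assumption and not the Space's actual code: the JSON action shape, the `CLAIMS` records, the `claim_id` column, and the reward values are all hypothetical. The only thing borrowed from TRL is the general convention that a reward function takes a batch of completions plus keyword arguments and returns one float per completion.

```python
import json
from typing import Any, Optional

VALID_DECISIONS = {"approve", "deny", "escalate"}

# Hypothetical hidden claim records; the real environment's data is not shown in the post.
CLAIMS = {
    "C-1": {"fraud_flag": True, "correct_decision": "deny"},
    "C-2": {"fraud_flag": False, "correct_decision": "approve"},
}


def parse_action(completion: str) -> Optional[dict]:
    """Parse a completion into an action dict, or return None if it is malformed."""
    try:
        action = json.loads(completion)
    except json.JSONDecodeError:
        return None
    decision = action.get("decision") if isinstance(action, dict) else None
    if not isinstance(decision, str) or decision not in VALID_DECISIONS:
        return None
    return action


def format_reward(completions: list, **kwargs: Any) -> list:
    """Format reward: 1.0 if the completion is a well-formed action, else 0.0."""
    return [1.0 if parse_action(c) is not None else 0.0 for c in completions]


def env_replay_reward(completions: list, claim_id: list, **kwargs: Any) -> list:
    """Replay reward: score each parsed decision against the toy claim record."""
    rewards = []
    for completion, cid in zip(completions, claim_id):
        action = parse_action(completion)
        if action is None:
            rewards.append(0.0)   # nothing to replay
        elif action["decision"] == CLAIMS[cid]["correct_decision"]:
            rewards.append(1.0)   # correct adjudication
        elif action["decision"] == "escalate":
            rewards.append(0.25)  # assumed partial credit for a cautious hand-off
        else:
            rewards.append(-1.0)  # wrong approve/deny
    return rewards


completions = ['{"decision": "deny"}', "I think we should approve", '{"decision": "escalate"}']
claim_ids = ["C-1", "C-2", "C-2"]
print(format_reward(completions))                          # [1.0, 0.0, 1.0]
print(env_replay_reward(completions, claim_id=claim_ids))  # [1.0, 0.0, 0.25]
```

Keeping the two signals separate means a completion can be rewarded for being parseable even when its decision is wrong, which is one plausible reason to use independent functions. In a TRL setup, functions of this shape would typically be passed together as `reward_funcs=[format_reward, env_replay_reward]`; whether the actual run was wired this way is not stated in the post.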

posted an update 2 days ago


updated a model 2 days ago

akhiilll/claims-env-pro-grpo

Organizations

None yet

akhiilll's models (3)

akhiilll/claims-env-pro-grpo

Text Generation • Updated 2 days ago • 6

akhiilll/forgeenv-repair-agent

Updated 2 days ago

akhiilll/forgeenv-source

Updated 3 days ago