T AKHIL KUMAR REDDY
PRO
akhiilll
AI & ML interests
None yet
Recent Activity
reacted to their post with 🔥 1 day ago
Just shipped ClaimSense Adjudication Gym at OpenEnv Hackathon 2026 (Scaler India).

An OpenEnv RL environment for enterprise insurance claims adjudication: the monthly “tool-heavy” workflow real adjusters do. Pull policy + claim history, run fraud checks, verify purchase/transactions, then approve / deny / escalate under partial observability with long-horizon credit assignment.

Trained Qwen/Qwen2.5-1.5B-Instruct with:
Rollout evaluation on HF Jobs (A10G) and a random baseline for comparison
Real GRPO weight updates (TRL GRPOTrainer) with LoRA adapters and two independent reward functions (format + env replay)

Headline training evidence:
GRPO run: 80 steps, 640 rollouts, KL rises ~0 → ~0.06 (real weight updates), completion length shrinks (~25 → ~10).
Plots + logs are committed in the Space under runs/.

Live demo + repo + writeup linked below.
🔗 Env (Space URL): https://huggingface.co/spaces/akhiilll/claims-env
🧪 Notebook: https://huggingface.co/spaces/akhiilll/claims-env/blob/main/training/InsureClaim_Training_Colab.ipynb
📝 Blog: docs/HF_MINI_BLOG.md in the Space
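The post mentions two independent reward functions (format + env replay) passed to TRL's GRPOTrainer. The following is only a rough sketch of what such a pair could look like, not the code in the Space. It follows TRL's custom reward convention of a callable that takes `completions` plus keyword arguments and returns one float per completion. The action pattern, the `expected_decision` column, and the label comparison standing in for a real env replay are all hypothetical, and completions are assumed to be plain strings.

```python
import re

# Hypothetical action format: the post does not say how decisions are encoded.
ACTION_PATTERN = re.compile(r"^(approve|deny|escalate)\b", re.IGNORECASE)

# 640 rollouts over 80 steps works out to 8 rollouts per step.
ROLLOUTS_PER_STEP = 640 // 80


def format_reward(completions, **kwargs):
    """Score 1.0 when a completion starts with a recognised decision, else 0.0."""
    return [1.0 if ACTION_PATTERN.match(c.strip()) else 0.0 for c in completions]


def env_replay_reward(completions, expected_decision, **kwargs):
    """Toy stand-in for env replay: compare the parsed decision with a label.

    A real version would replay the completion inside the environment;
    here `expected_decision` is an assumed dataset column of ground-truth labels.
    """
    rewards = []
    for completion, expected in zip(completions, expected_decision):
        match = ACTION_PATTERN.match(completion.strip())
        decision = match.group(1).lower() if match else None
        rewards.append(1.0 if decision == expected else 0.0)
    return rewards
```

Keeping the two signals separate makes it possible to tell whether the policy is learning the output format or the decision itself.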
posted an update 2 days ago
updated a model 2 days ago: akhiilll/claims-env-pro-grpo
Organizations
None yet
akhiilll's models (3)
akhiilll/claims-env-pro-grpo • Text Generation • Updated 2 days ago • 6
akhiilll/forgeenv-repair-agent • Updated 2 days ago
akhiilll/forgeenv-source • Updated 3 days ago