ConicAI Coding LLM

Model Details

  • Model Name: ConicAI LLM Model
  • Developer: GIRISH KUMAR DEWANGAN
  • Base Model: Qwen/Qwen2.5-Coder-0.5B-Instruct
  • Architecture: Transformer (Causal LM)
  • Fine-tuning Method: LoRA (PEFT)
  • Task Domain: Code Generation, Debugging, Explanation
  • Primary Language: Python

Model Description

ConicAI Coding LLM is a parameter-efficient fine-tuned model optimized for structured coding tasks. It extends the base model by conditioning generation on explicit instructions and emitting structured, schema-conformant responses.

The model focuses on three key aspects:

  • Accuracy → Correct code generation
  • Interpretability → Explanation + confidence
  • Efficiency → Lightweight fine-tuning

Core Design Philosophy

  1. Instruction Conditioning

    Instruction → Input → Output
    
  2. Structured Output Learning

  3. Post-Generation Validation Awareness
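The repo ships a `build_instruction_prompt` helper in `infer_local` (used in the usage example below); as a rough sketch of what such instruction conditioning might look like (the exact template is an assumption, not the released implementation):

```python
def build_instruction_prompt(instruction: str, input_text: str = "") -> str:
    """Assemble an Instruction -> Input -> Output style prompt.

    Illustrative sketch only; the template used by the released
    infer_local module may differ.
    """
    parts = [f"Instruction:\n{instruction.strip()}"]
    if input_text:
        parts.append(f"Input:\n{input_text.strip()}")
    parts.append("Output:")
    return "\n\n".join(parts)

prompt = build_instruction_prompt("Write a function that reverses a string.")
```

Conditioning every example on the same three-part frame is what lets the fine-tuned model reliably separate the task description from the data it operates on.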


Capabilities

  • Code generation
  • Code debugging
  • Code explanation
  • Structured output generation
  • Confidence estimation
  • Hallucination detection

Output Schema

{
  "code": "string",
  "explanation": "string",
  "confidence": 0.0,
  "important_tokens": [],
  "relevancy_score": 0.0,
  "hallucination": false,
  "hallucination_check_reason": "",
  "latency_ms": 0
}
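A minimal validator for this schema can live entirely in the standard library. The field names come from the schema above; the per-field type expectations are inferred from the example values and are an assumption:

```python
import json

# Expected type per field, inferred from the example schema values.
SCHEMA = {
    "code": str,
    "explanation": str,
    "confidence": (int, float),
    "important_tokens": list,
    "relevancy_score": (int, float),
    "hallucination": bool,
    "hallucination_check_reason": str,
    "latency_ms": int,
}

def validate_result(raw: str) -> dict:
    """Parse a model response and check it against the output schema."""
    obj = json.loads(raw)
    for field, expected in SCHEMA.items():
        if field not in obj:
            raise ValueError(f"missing field: {field}")
        if not isinstance(obj[field], expected):
            raise TypeError(f"{field}: expected {expected}, got {type(obj[field])}")
    return obj
```

Running generated responses through a check like this is what "Post-Generation Validation Awareness" buys you in practice: malformed outputs fail loudly instead of propagating downstream.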

🧪 How to Use This Model (Colab / Local)

!pip -q install -U transformers peft accelerate huggingface_hub safetensors

from google.colab import userdata
HF_TOKEN = userdata.get('HF_TOKEN') 
model = "girish00/ConicAI_LLM_model"
prompt = input("Please enter your prompt: ")

from huggingface_hub import login, snapshot_download
login(token=HF_TOKEN)

repo = snapshot_download(model, token=HF_TOKEN)

import sys
sys.path.append(repo)

from infer_local import build_instruction_prompt, build_structured_result
from peft import PeftConfig, PeftModel
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch, time, json

cfg = PeftConfig.from_pretrained(repo)
base = cfg.base_model_name_or_path

tokenizer = AutoTokenizer.from_pretrained(base)
base_model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto"
)
llm = PeftModel.from_pretrained(base_model, repo)
llm.eval()

inputs = tokenizer(build_instruction_prompt(prompt), return_tensors="pt").to(llm.device)

start = time.perf_counter()
with torch.no_grad():
    out = llm.generate(
        **inputs,
        max_new_tokens=320,
        output_scores=True,
        return_dict_in_generate=True,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )
latency = int((time.perf_counter() - start) * 1000)

# Slice off the prompt tokens and decode only the new continuation
gen_ids = out.sequences[0][inputs["input_ids"].shape[1]:].tolist()
text = tokenizer.decode(gen_ids, skip_special_tokens=True)

# Per-token confidence: probability the model assigned to each emitted token
conf = []
for tid, score in zip(gen_ids, out.scores):
    probs = torch.softmax(score[0], dim=-1)
    conf.append(float(probs[tid].item()))

print(json.dumps(
    build_structured_result(
        prompt,
        text,
        latency,
        tokenizer=tokenizer,
        generated_ids=gen_ids,
        token_confidences=conf
    ),
    indent=2
))
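The per-token probabilities collected above must be collapsed into the single `confidence` value of the output schema. The geometric mean below is one plausible heuristic; the exact aggregation inside `build_structured_result` is not specified here:

```python
import math

def aggregate_confidence(token_confidences):
    """Collapse per-token probabilities into one score via geometric mean.

    A heuristic: one low-probability token drags the score down sharply,
    matching the intuition that a single uncertain token weakens the output.
    """
    if not token_confidences:
        return 0.0
    log_sum = sum(math.log(max(p, 1e-12)) for p in token_confidences)
    return math.exp(log_sum / len(token_confidences))
```

Compared with an arithmetic mean, the geometric mean penalizes outputs where most tokens are confident but a few are near-guesses.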

Training Details

Dataset

  • Size: ~5K–10K samples
  • Instruction-based coding dataset

Training Procedure

  • Method: LoRA fine-tuning
  • Framework: Transformers + PEFT
  • Precision: FP16 / Mixed

Training Hyperparameters

  Parameter              Value
  Epochs                 1–3
  Batch Size             2
  Learning Rate          2e-4
  Max Sequence Length    512
  LoRA Rank (r)          8
  LoRA Alpha             16
  LoRA Dropout           0.05
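The LoRA rows of the table map directly onto a PEFT configuration. A sketch, with the caveat that `target_modules` is an assumption (typical attention projections for Qwen2.5-style models); the adapter's own `adapter_config.json` is the authoritative source:

```python
from peft import LoraConfig

# r, lora_alpha and lora_dropout mirror the hyperparameter table above.
# target_modules is an ASSUMPTION, not read from the released adapter.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

With rank 8 on a 0.5B base, the trainable adapter is a small fraction of total parameters, which is what keeps the fine-tuning lightweight.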

Inference Configuration

Recommended sampling settings (note that the usage example above decodes greedily with do_sample=False instead):

max_new_tokens = 200
temperature = 0.2
top_p = 0.9
do_sample = True
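For intuition about how temperature and top_p interact, here is a pure-Python sketch of nucleus (top-p) sampling over raw logits. This is an illustration, not the transformers implementation:

```python
import math
import random

def sample_top_p(logits, temperature=0.2, top_p=0.9, rng=None):
    """Sample a token index using temperature scaling + nucleus filtering."""
    rng = rng or random.Random(0)
    # Temperature-scaled softmax (low temperature sharpens the distribution)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of top tokens whose cumulative mass >= top_p
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the kept set and draw one index
    kept_mass = sum(probs[i] for i in kept)
    r = rng.random() * kept_mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

At temperature 0.2 the distribution is sharp enough that the nucleus often contains a single token, which is why the recommended settings behave nearly deterministically.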

Evaluation

  • Syntax validation
  • Prompt-based testing
  • Relevancy scoring
  • Hallucination detection
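The syntax-validation step can be as simple as attempting to parse generated code with the standard library. A sketch (the actual evaluation harness is not published here):

```python
import ast

def syntax_valid(code: str) -> bool:
    """Return True if the generated Python code at least parses."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False
```

Parsing catches malformed output cheaply, but note it says nothing about runtime correctness; that is what the prompt-based testing and relevancy scoring above are for.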

Strengths

  • Lightweight and efficient
  • Strong performance on structured prompts
  • Generates readable and correct Python code
  • Provides reasoning-aware outputs

Limitations

  • Sensitive to prompt structure
  • Confidence scores are heuristic
  • Limited generalization beyond training dataset

Risks & Considerations

  • Always validate generated code
  • Not suitable for critical production systems
  • May produce incorrect logic

Best Practices

Format prompts with the Instruction → Input → Output template:

Instruction:
Input:
Output:

  • Use clear and specific prompts
  • Keep temperature low for reliability
  • Apply post-processing for cleaner output

Technical Specifications

  • Transformer-based causal language model
  • LoRA adaptation on attention layers
  • Hugging Face Transformers + PEFT

Environmental Impact

  • Uses parameter-efficient fine-tuning
  • Lower compute compared to full fine-tuning
  • Suitable for local deployment

Intended Use

Direct Use

  • Coding assistant
  • Debugging support
  • Learning programming

Out-of-Scope Use

  • Security-critical systems
  • Autonomous production deployment

Author

GIRISH KUMAR DEWANGAN


License

Apache License 2.0

