Green Patent Detection: Advanced Agentic Workflow with QLoRA

Project Summary

This final assignment synthesizes Assignments 2 and 3 into a single data-labelling pipeline. A generative LLM is fine-tuned via QLoRA to understand patent language, then integrated as the "Judge" of a Multi-Agent System (MAS) that debates and labels complex patent claims. Finally, a targeted Human-in-the-Loop (HITL) review step produces a gold dataset for a final round of PatentSBERTa fine-tuning.


Pipeline Architecture

patents_50k_green.parquet

[Part A & B] Baseline PatentSBERTa + Uncertainty Sampling

  • Top 100 high-risk claims (u ≈ 1.0)

[Part C – Step 1] QLoRA Fine-tuning on Colab (Qwen3-8B, 4-bit, 3 epochs)

  • qlora_green_patent_adapter (LoRA weights)
  • Qwen3-8B.Q4_K_M.gguf (served via LM Studio)

[Part C – Step 2] Multi-Agent System (CrewAI)

  • Agent 1 – Advocate (Qwen3-4B, argues for green: Advocator)
  • Agent 2 – Skeptic (Qwen3-4B, argues against green: Skeptic)
  • Agent 3 – Judge (QLoRA Qwen3-8B, final verdict: Judge)

[Part D] Exception-Based HITL (only deadlocks / low-confidence)

  • 26 claims escalated for human review (deadlock or low confidence), 3 human overrides
  • hitl_green_100_final.csv (gold labels)

[Part D] Final PatentSBERTa Fine-tuning on gold dataset

  • patentsberta_finetuned_final/
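
The uncertainty-sampling step at the top of the pipeline can be illustrated with a minimal sketch. The card does not define the score u, so the formula below (u = 1 − 2·|p − 0.5|, which peaks at 1.0 when the baseline classifier is undecided) is an assumption, and `uncertainty` and `top_k_uncertain` are hypothetical helper names.

```python
from typing import List, Tuple


def uncertainty(p_green: float) -> float:
    """Assumed score: 1.0 when p = 0.5, falling to 0.0 at p = 0 or p = 1."""
    return 1.0 - 2.0 * abs(p_green - 0.5)


def top_k_uncertain(probs: List[float], k: int) -> List[Tuple[int, float]]:
    """Return (index, score) for the k most uncertain predictions."""
    scored = [(i, uncertainty(p)) for i, p in enumerate(probs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy probabilities (illustrative values, not taken from the project).
probs = [0.02, 0.51, 0.97, 0.45, 0.60]
picked = top_k_uncertain(probs, k=2)
picked_indices = [index for index, _ in picked]
print(picked_indices)
```

With `k=100` on the full probability vector, the same selection would yield a pool of claims whose scores sit close to 1.0.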

Part C – Step 1: QLoRA Domain Adaptation

The generative LLM fine-tuning was performed on Google Colab (T4, 15 GB VRAM) using Unsloth's QLoRA implementation.

| Parameter | Value |
|---|---|
| Base model | unsloth/Qwen3-8B-bnb-4bit |
| LoRA rank (r) | 16 |
| LoRA alpha | 16 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training examples | 2,000 (train_silver, Alpaca format) |
| Epochs | 3 (375 total steps) |
| Batch size | 4 × 4 gradient accumulation = effective 16 |
| Learning rate | 2e-4 (AdamW 8-bit, linear schedule) |
| Max sequence length | 2,048 tokens |
| Training loss | 0.8899 |
| Training time | ~105 minutes on T4 |
| VRAM usage | ~5 GB (4-bit quantization) |
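
The step count in the table follows from the batch settings. The sketch below collects the reported hyperparameters in a plain dataclass and recomputes the effective batch size and total optimizer steps. `QLoRARunConfig` is a hypothetical container used only for illustration and is not part of Unsloth. The calculation assumes a single GPU.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class QLoRARunConfig:
    """Hypothetical holder for the hyperparameters reported in the table."""
    base_model: str = "unsloth/Qwen3-8B-bnb-4bit"
    lora_r: int = 16
    lora_alpha: int = 16
    train_examples: int = 2000
    epochs: int = 3
    per_device_batch_size: int = 4
    gradient_accumulation_steps: int = 4
    learning_rate: float = 2e-4
    max_seq_length: int = 2048

    @property
    def effective_batch_size(self) -> int:
        # Single-GPU assumption: batch size times accumulation steps.
        return self.per_device_batch_size * self.gradient_accumulation_steps

    @property
    def total_steps(self) -> int:
        # One optimizer step per effective batch, repeated for each epoch.
        steps_per_epoch = math.ceil(self.train_examples / self.effective_batch_size)
        return steps_per_epoch * self.epochs


cfg = QLoRARunConfig()
print(cfg.effective_batch_size, cfg.total_steps)
```

With 2,000 examples and an effective batch of 16, each epoch takes 125 steps, which gives the 375 steps listed for 3 epochs.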

The fine-tuned adapter was exported to GGUF Q4_K_M format (4.682 GB) and served locally via LM Studio for use in the MAS.

Part C – Step 2: Multi-Agent System

Three agents collaborate to label each of the 100 high-risk patent claims using CrewAI as the orchestration framework. The QLoRA fine-tuned model serves as the Judge's brain.

| Agent | Model | Temperature | Role |
|---|---|---|---|
| Advocate | Qwen3-4B (LM Studio) | 0.1 | Argues FOR Y02 green classification |
| Skeptic | Qwen3-4B (LM Studio) | 0.1 | Argues AGAINST (identifies greenwashing) |
| Judge | QLoRA Qwen3-8B (LM Studio) | 0.1 | Weighs debate and produces final JSON label |

Each claim produces a structured JSON output: classification (0/1), confidence (Low/Medium/High), and rationale.
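
A minimal sketch of how such an output could be parsed and validated before it is written to the dataset. The key names `classification`, `confidence`, and `rationale` mirror the field names above but are assumed to be the exact JSON keys. `Verdict` and `parse_verdict` are hypothetical helpers, not part of CrewAI.

```python
import json
from dataclasses import dataclass

ALLOWED_CONFIDENCE = {"Low", "Medium", "High"}


@dataclass(frozen=True)
class Verdict:
    classification: int  # 0 = not green, 1 = green
    confidence: str      # "Low", "Medium", or "High"
    rationale: str


def parse_verdict(raw: str) -> Verdict:
    """Parse the Judge's raw JSON string and reject malformed labels."""
    data = json.loads(raw)
    classification = data["classification"]
    confidence = data["confidence"]
    rationale = data["rationale"]
    # type(...) is int excludes booleans and floats such as True or 1.0.
    if type(classification) is not int or classification not in (0, 1):
        raise ValueError(f"classification must be 0 or 1, got {classification!r}")
    if confidence not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unexpected confidence level: {confidence!r}")
    if not isinstance(rationale, str) or not rationale.strip():
        raise ValueError("rationale must be a non-empty string")
    return Verdict(classification, confidence, rationale)


# Illustrative payload, not an actual transcript from the project.
example = '{"classification": 1, "confidence": "High", "rationale": "Claims a heat-recovery process."}'
verdict = parse_verdict(example)
print(verdict.classification, verdict.confidence)
```

Rejecting malformed outputs at this point keeps free-text answers from the Judge out of the label column.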

Part D: Targeted HITL & Final Fine-tuning

Exception-based HITL was applied: a human reviewer intervened only when the agents reached a deadlock or produced low-confidence outputs.
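
The escalation rule can be sketched as a simple router. The card does not specify how a deadlock is detected, so the `deadlock` flag below is a hypothetical field supplied by the caller, and "low-confidence" is assumed to mean a confidence value of "Low".

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class JudgeResult:
    claim_id: str
    classification: int
    confidence: str  # "Low", "Medium", or "High"
    deadlock: bool   # hypothetical flag: the debate did not converge


def needs_human_review(result: JudgeResult) -> bool:
    """Escalate only the exceptions: deadlocks and low-confidence verdicts."""
    return result.deadlock or result.confidence == "Low"


def route(results: Iterable[JudgeResult]) -> Tuple[List[JudgeResult], List[JudgeResult]]:
    """Split results into (auto_accepted, escalated)."""
    auto_accepted: List[JudgeResult] = []
    escalated: List[JudgeResult] = []
    for result in results:
        (escalated if needs_human_review(result) else auto_accepted).append(result)
    return auto_accepted, escalated


# Illustrative results, not real project data.
sample = [
    JudgeResult("c1", 1, "High", deadlock=False),
    JudgeResult("c2", 0, "Low", deadlock=False),
    JudgeResult("c3", 1, "Medium", deadlock=True),
]
auto_accepted, escalated = route(sample)
print([r.claim_id for r in auto_accepted], [r.claim_id for r in escalated])
```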

| Metric | Value |
|---|---|
| Total claims reviewed by MAS | 100 |
| Auto-accepted (high confidence) | 74 |
| Escalated to human review | 26 |
| Human overrides | 3 |
| Human agreement rate with Judge | 88.5% |
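
The derived figures can be reproduced from the raw counts. The sketch assumes the agreement rate is computed over the 26 escalated claims only, which is the reading that reproduces the reported 88.5%.

```python
total_claims = 100
escalated = 26
overrides = 3

# Claims the human never had to look at.
auto_accepted = total_claims - escalated

# Assumption: agreement is measured on escalated claims only.
agreement_rate = (escalated - overrides) / escalated

# Share of claims that needed no human review.
effort_reduction = auto_accepted / total_claims

print(f"auto-accepted: {auto_accepted}")
print(f"agreement rate: {agreement_rate:.1%}")
print(f"effort reduction: {effort_reduction:.0%}")
```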

The gold-labelled dataset (hitl_green_100_final.csv) was used to fine-tune PatentSBERTa for 3 epochs using CosineSimilarityLoss. Training targeted an AMD Radeon RX 9070 XT via DirectML but fell back to CPU, where it completed in ~31 minutes.
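
CosineSimilarityLoss trains on sentence pairs with a target similarity score, so the labelled claims must first be turned into pairs. The card does not say how this was done. The sketch below shows one common scheme, offered purely as an assumption: every pair of claims gets a target of 1.0 if both share a label and 0.0 otherwise. `build_pairs` is a hypothetical helper.

```python
from itertools import combinations
from typing import List, Tuple


def build_pairs(claims: List[str], labels: List[int]) -> List[Tuple[str, str, float]]:
    """Pair every two claims; target 1.0 for matching labels, else 0.0."""
    if len(claims) != len(labels):
        raise ValueError("claims and labels must have the same length")
    pairs = []
    for (text_a, y_a), (text_b, y_b) in combinations(zip(claims, labels), 2):
        pairs.append((text_a, text_b, 1.0 if y_a == y_b else 0.0))
    return pairs


# Toy claims (illustrative text, not rows from hitl_green_100_final.csv).
claims = ["solar cell coating", "wind turbine blade", "smartphone case"]
labels = [1, 1, 0]
pairs = build_pairs(claims, labels)
print(len(pairs), pairs[0])
```

Each tuple could then be wrapped in a sentence-transformers `InputExample(texts=[a, b], label=score)` before training. Under this scheme, 100 claims yield 4,950 pairs, so subsampling may be worthwhile on CPU.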

Performance Results

| Model Version | Training Data Source | F1 Score (Test Set) |
|---|---|---|
| 1. Baseline | Frozen Embeddings (No Fine-tuning) | 0.7494 |
| 2. Assignment 2 Model | Silver + Gold (Simple Generic LLM) | 0.7465 |
| 3. Assignment 3 Model | Silver + Gold (Advanced Techniques / MAS) | 0.7467 |
| 4. Final Model | Silver + Gold (QLoRA-Powered MAS + Targeted HITL) | 0.7530 |

The Final Model achieves the highest F1 score across all iterations (0.7530, which is 0.0036 above the frozen-embedding baseline). The gain is small, but it suggests that QLoRA domain adaptation combined with structured agent debate and targeted human review produces a measurable improvement.
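
For reference, the differences relative to the baseline can be computed directly from the reported scores. The short labels used as dictionary keys are chosen here for illustration.

```python
f1_scores = {
    "Baseline": 0.7494,
    "Assignment 2 Model": 0.7465,
    "Assignment 3 Model": 0.7467,
    "Final Model": 0.7530,
}

baseline = f1_scores["Baseline"]

# Difference of each model's F1 from the frozen-embedding baseline.
deltas = {name: round(score - baseline, 4) for name, score in f1_scores.items()}

# Model with the highest reported F1.
best = max(f1_scores, key=f1_scores.get)

for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f}")
print("best:", best)
```

Only the Final Model improves on the baseline. The Assignment 2 and Assignment 3 models score slightly below it.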

Key Findings

QLoRA advantages:

  • Adapts a generative LLM to patent language with only 0.53% of parameters trained
  • Enables a domain-aware Judge that understands Y02 classification logic
  • 4-bit quantization fits 8B model on a free 15 GB T4 GPU
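
The 0.53% figure can be sanity-checked with a back-of-the-envelope count. Each adapted linear layer with input size `d_in` and output size `d_out` adds `r * (d_in + d_out)` LoRA parameters. The layer shapes and total parameter count below are assumptions based on the commonly published Qwen3-8B configuration (hidden size 4096, MLP size 12288, 36 layers, key/value projection width 1024, roughly 8.2 billion parameters) and should be checked against the model's own config.

```python
RANK = 16            # LoRA rank reported for this run
NUM_LAYERS = 36      # assumed Qwen3-8B depth
HIDDEN = 4096        # assumed hidden size
KV_DIM = 1024        # assumed: 8 key/value heads x 128 head dim
INTERMEDIATE = 12288  # assumed MLP width
TOTAL_PARAMS = 8.2e9  # approximate base model size (assumption)

# (in_features, out_features) for each targeted module in one layer.
module_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank matrices A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in module_shapes.values())
trainable = per_layer * NUM_LAYERS
trainable_pct = 100.0 * trainable / TOTAL_PARAMS

print(f"trainable LoRA parameters: {trainable:,}")
print(f"share of base model: {trainable_pct:.2f}%")
```

Under these assumed shapes the count comes to about 43.6 million trainable parameters, consistent with the reported share.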

MAS + HITL advantages:

  • Debate structure surfaces disagreements that single-model approaches miss
  • Exception-based HITL reduces human effort by 74% (26 vs 100 reviews)
  • Gold labels are higher-quality than silver LLM labels alone

Limitations:

  • DirectML (AMD GPU) is not fully supported by sentence-transformers training, so the final fine-tuning fell back to CPU
  • torchao 0.16.0 conflicts with transformers lazy loader in certain environments

Repository Contents

| File | Description |
|---|---|
| Final_Assignment.ipynb | Main notebook (Parts A–D) |
| patentsberta_finetuned_final/ | Final fine-tuned PatentSBERTa model |
| hitl_green_100_final.csv | Gold dataset: 100 claims with HITL labels and debate rationales |
| final_classifier.joblib | Serialised final Logistic Regression classifier |
| qlora_outputs.zip | QLoRA adapter weights (qlora_green_patent_adapter/) |
| Part C Step 1.ipynb | Colab notebook for QLoRA fine-tuning |
| Debate transcripts | All debate transcripts from the MAS |

Related Repositories
