AI & ML interests
NLP, OCR, AI
Recent Activity
reacted to scthornton's post with 👍 about 16 hours ago
SecureCode: security-aware code models (3B–20B), trained for review + remediation
I’ve been frustrated by how often code assistants recommend patterns that pass tests but fail security review (e.g., string-built SQL, brittle auth logic, unsafe parsing, insecure defaults, etc.). So I built **SecureCode**: a collection of **8 code models (3B → 20B)** trained to behave more like a security reviewer.
What you should expect from SecureCode:
- identify likely vuln patterns and explain *why* they’re risky
- outline plausible abuse paths (defensive framing)
- propose a secure rewrite (drop-in where possible)
- include defense-in-depth guidance + regression tests/checks
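To make the "string-built SQL" case concrete, here is a minimal sketch of the kind of finding, rewrite, and regression check listed above. It is not taken from the SecureCode models or dataset; it only uses Python's standard `sqlite3` module, and the table, function names, and payload are hypothetical.

```python
import sqlite3


def find_user_unsafe(conn, username):
    # Vulnerable pattern: user input is concatenated into the SQL text,
    # so quotes in the input can change the meaning of the query.
    query = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Rewrite: the value is bound as a parameter and never parsed as SQL.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()


def make_demo_db():
    # In-memory database with two hypothetical users.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


conn = make_demo_db()
payload = "nobody' OR '1'='1"  # classic tautology payload

leaked = find_user_unsafe(conn, payload)  # matches every row
blocked = find_user_safe(conn, payload)   # matches nothing

# Regression check: the payload must not return rows from the safe path.
assert blocked == []
```

Both versions pass a happy-path test with a normal username, which is why this pattern tends to survive test suites and only shows up in security review.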
Links:
- **Models:** https://huggingface.co/collections/scthornton/securecode
- **Dataset:** https://huggingface.co/datasets/scthornton/securecode-v2
- **Paper:** https://arxiv.org/html/2512.18542v1 (Hugging Face page: https://huggingface.co/papers/2512.18542)
**How to test it (copy/paste prompt):**
```
You are a senior application security engineer. Review the code below.
Output: (1) findings with severity, (2) likely exploit scenarios (high level), (3) secure rewrite,
(4) defense-in-depth recommendations, (5) regression tests/checks.
Code: `...`
```
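If you want to run the prompt over several snippets, a small helper can assemble it. This is only a sketch: `build_review_prompt` and the sample `login` function are hypothetical, and the optional `Language/framework:` line is an addition that is not part of the prompt above. Model loading is left out because individual model IDs are not listed here.

```python
FENCE = "`" * 3  # built at runtime so this example contains no nested fence


def build_review_prompt(code: str, language: str = "") -> str:
    """Wrap a code snippet in the review prompt from the post."""
    header = (
        "You are a senior application security engineer. Review the code below.\n"
        "Output: (1) findings with severity, (2) likely exploit scenarios (high level), "
        "(3) secure rewrite,\n"
        "(4) defense-in-depth recommendations, (5) regression tests/checks.\n"
    )
    # Optional context line (an assumption, not in the original prompt).
    context = f"Language/framework: {language}\n" if language else ""
    return f"{header}{context}Code:\n{FENCE}\n{code.strip()}\n{FENCE}"


# Hypothetical snippet to review: string-formatted SQL in a login check.
snippet = '''
def login(db, user, pw):
    return db.execute("SELECT * FROM users WHERE name='%s' AND pw='%s'" % (user, pw))
'''

prompt = build_review_prompt(snippet, language="Python / sqlite3")
print(prompt)
```

The resulting string can be sent as the user message to whichever SecureCode model you are testing.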
**I’m looking for real-world feedback**
- Your “this slipped through review once” snippets (sanitized is fine)
- False positives / false negatives you observe
- Contributions of new CVE-grounded examples
If you drop a snippet, please include the language/framework and what the *correct* remediation looks like in your environment. If you have any contributions or suggestions for the dataset, I'd be happy to hear them. Some new features and enhancements for v3 are already underway, but for now I'm focused on testing as many use cases as possible. Appreciate you all!
johnlockejrr/LightOnOCR-2-1B-base-samaritan-pre • Image-Text-to-Text • Updated
johnlockejrr/Nordic_Multicentury • Object Detection • Updated • 4 • 1
johnlockejrr/marianmt_syr_voc_eastern • Translation • 0.2B • Updated
johnlockejrr/marianmt_heb_voc • Translation • 61.4M • Updated • 1
johnlockejrr/marianmt_syr_voc_western • Translation • 0.2B • Updated
johnlockejrr/marianmt-smp-sam-onnx • Translation • Updated • 5
johnlockejrr/marianmt-smp-sam • Translation • 61.4M • Updated • 1
johnlockejrr/aramaic-diacritization-model • 61.4M • Updated • 1
johnlockejrr/opus-arc-targum-vocalization • 61.4M • Updated
johnlockejrr/marianmt-he2arc-targum-voc-shva • Translation • 61.4M • Updated • 2
johnlockejrr/marianmt-he2arc-targum-voc • Translation • 61.4M • Updated • 1
johnlockejrr/marianmt-he2yid-tanakh • Translation • 77.1M • Updated • 1
johnlockejrr/marianmt-he2arc-targum • Translation • 61.4M • Updated • 1 • 1
johnlockejrr/marianmt-he2arc-sam • Translation • 61.4M • Updated • 1
johnlockejrr/marianmt-en2he-nwt • 77.9M • Updated • 2 • 1
johnlockejrr/marianmt-he2en-nwt • 77.1M • Updated • 3 • 1
johnlockejrr/opus-mt-arc-heb • 77.1M • Updated • 2
johnlockejrr/eynollah-sam_40_mss-patches
johnlockejrr/eynollah-sam_40_mss-no_patches • Updated
johnlockejrr/pylaia-heb_synth_lines_pytorch_2 • Text Generation • Updated
johnlockejrr/pylaia-yiddish_synth_pytorch_2 • Image-to-Text • Updated • 1
johnlockejrr/hebrew_sefaria_tokenizer • Updated
johnlockejrr/hebrew_sefaria_preprocessor • Updated
johnlockejrr/yolo11_syr_synth • Updated
johnlockejrr/pylaya_syr_synth • Updated
johnlockejrr/norhand-regions-lines • Updated
johnlockejrr/yolo_samaritan • Updated
johnlockejrr/medieval-manuscript-yolov11-seg • Object Detection • Updated
johnlockejrr/pylaia_catmus_medieval • Image-to-Text • Updated • 1
johnlockejrr/pylaia-heb_sam_v1 • Image-to-Text • Updated