| --- |
| license: apache-2.0 |
| base_model: |
| - deepseek-ai/DeepSeek-R1-0528-Qwen3-8B |
| datasets: |
| - OpceanAI/Yuuki-Personality |
| language: |
| - en |
| - es |
| library_name: transformers |
| tags: |
| - reasoning |
| - unsloth |
| - pytorch |
| - bilingual |
| - opceanai |
| - yuuki |
| - rxg |
| - fine-tuned |
| - chat |
| - deepseek |
| - qwen3 |
| pipeline_tag: text-generation |
| --- |
| |
| <div align="center"> |
|
|
| <br> |
|
|
| <img src="https://img.shields.io/badge/%E2%9C%A6-YUUKI_RxG-6d28d9?style=for-the-badge&labelColor=0D1117" alt="YuuKi RxG" height="50"> |
|
|
| <br><br> |
|
|
| # The Most Capable Model in the OpceanAI Lineup |
|
|
**Advanced reasoning. Competition-level mathematics. 96.6% TruthfulQA.**<br>
**8B parameters. DeepSeek-R1 base. Ahead of its base model on every evaluated benchmark.**
|
|
| <br> |
|
|
| <a href="#benchmark-results"><img src="https://img.shields.io/badge/BENCHMARKS-0D1117?style=for-the-badge" alt="Benchmarks"></a> |
| |
| <a href="#usage"><img src="https://img.shields.io/badge/USAGE-0D1117?style=for-the-badge" alt="Usage"></a> |
| |
| <a href="#training-details"><img src="https://img.shields.io/badge/TRAINING-0D1117?style=for-the-badge" alt="Training"></a> |
|
|
| <br><br> |
|
|
[![License: Apache 2.0](https://img.shields.io/badge/LICENSE-Apache_2.0-0D1117?style=for-the-badge)](LICENSE)
|
[![Base model](https://img.shields.io/badge/BASE-DeepSeek--R1--0528--Qwen3--8B-0D1117?style=for-the-badge)](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B)
|
[![Transformers](https://img.shields.io/badge/LIBRARY-Transformers-0D1117?style=for-the-badge)](https://huggingface.co/docs/transformers)
|
[![TruthfulQA](https://img.shields.io/badge/TruthfulQA-96.6-0D1117?style=for-the-badge)](https://github.com/sylinrl/TruthfulQA)
|
[![lm-evaluation-harness](https://img.shields.io/badge/EVAL-lm--evaluation--harness-0D1117?style=for-the-badge)](https://github.com/EleutherAI/lm-evaluation-harness)
|
|
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| </div> |
|
|
| ## What is YuuKi RxG? |
|
|
**YuuKi RxG** is an 8B reasoning-specialized language model fine-tuned from [DeepSeek-R1-0528-Qwen3-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B). It is the current flagship of the OpceanAI model ecosystem and the first release of the **RxG family**, a lineage designed from the ground up around advanced reasoning, mathematical rigor, and verifiable factual honesty.
|
|
RxG surpasses its base model (listed as DeepSeek-R1-8B in the tables below) on all evaluated benchmarks, including AIME 2024, AIME 2025, HMMT February 2025, GPQA Diamond, and LiveCodeBench. It also exceeds Qwen3-8B by 11.3 points on AIME 2024, and it is competitive with o3-mini (medium) and Gemini-2.5-Flash-Thinking on competition mathematics despite a far smaller parameter budget.
|
|
| The most significant result is **TruthfulQA at 96.6%** — verified independently across three separate evaluation runs. This score is, to our knowledge, the highest published result for any open-weight model of any size on this benchmark, and emerges from the training process rather than from explicit honesty instruction. |
|
|
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| <div align="center"> |
|
|
| ## Model Summary |
|
|
| </div> |
|
|
| <br> |
|
|
| <table> |
| <tr> |
| <td width="50%" valign="top"> |
|
|
| **Architecture** |
|
|
| | Property | Value | |
| |:---------|:------| |
| Base Model | DeepSeek-R1-0528-Qwen3-8B |
| | Parameters | 8B | |
| Fine-tuning Method | Supervised fine-tuning (SFT) + LoRA |
| | Context Length | 32,768 tokens | |
| | Chat Template | ChatML | |
| | Thinking Protocol | Native `<think>` blocks | |
|
|
| </td> |
| <td width="50%" valign="top"> |
|
|
| **Release** |
|
|
| | Property | Value | |
| |:---------|:------| |
| | Organization | OpceanAI | |
| | Release Date | April 2026 | |
| | Version | v1.0 | |
| | Languages | English, Spanish | |
| | License | Apache 2.0 | |
| | Evaluation | lm-evaluation-harness | |
|
|
| </td> |
| </tr> |
| </table> |
|
|
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| <div align="center"> |
|
|
| ## Benchmark Results |
|
|
| </div> |
|
|
| <br> |
|
|
| All YuuKi RxG results are evaluated under standard benchmark conditions using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). Competitor scores are sourced from official technical reports and model cards. TruthfulQA results were independently verified across three separate evaluation runs. |
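For reproducibility, the sketch below shows how a comparable run can be launched through the harness's Python API. It is a minimal illustration, assuming harness v0.4+ and the `truthfulqa_mc2` task; the batch size is arbitrary and not the exact configuration behind the reported scores.

```python
# Minimal sketch of a 0-shot TruthfulQA run with lm-evaluation-harness (v0.4+).
# Task selection and batch size are illustrative, not the exact configuration
# behind the reported scores.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=OpceanAI/Yuuki-RxG,dtype=bfloat16",
    tasks=["truthfulqa_mc2"],  # multiple-choice TruthfulQA variant
    num_fewshot=0,             # the reported result is 0-shot
    batch_size=8,
)

print(results["results"]["truthfulqa_mc2"])
```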
|
|
| <br> |
|
|
|  |
|
|
| <br> |
|
|
| ### Reasoning and Mathematics |
|
|
| | Model | AIME 24 | AIME 25 | HMMT Feb 25 | GPQA Diamond | LiveCodeBench | |
| |:------|:-------:|:-------:|:-----------:|:------------:|:-------------:| |
| | Qwen3-8B | 76.0 | 67.3 | — | 62.0 | — | |
| | Phi-4-Reasoning-Plus 14B | 81.3 | 78.0 | 53.6 | 69.3 | — | |
| | Gemini-2.5-Flash-Thinking | 82.3 | 72.0 | 64.2 | 82.8 | 62.3 | |
| | o3-mini (medium) | 79.6 | 76.7 | 53.3 | 76.8 | 65.9 | |
| | DeepSeek-R1-8B | 86.0 | 76.3 | 61.5 | 61.1 | 60.5 | |
| | **YuuKi RxG 8B** | **87.3** | **77.1** | **63.2** | **64.0** | **62.0** | |
|
|
| <br> |
|
|
| ### Factual Honesty |
|
|
| Model | TruthfulQA (%) | Setting |
|:------|:--------------:|:-------:|
| LLaMA 2 70B | ~59 | — |
| GPT-4 | ~79.7 | 1–2 shot |
| Claude 3 Opus | ~65 | — |
| **YuuKi RxG 8B** | **96.6** | 0-shot |
|
|
| <br> |
|
|
| The TruthfulQA result warrants specific discussion. A score of 96.6% at any parameter scale is anomalous relative to published baselines. This result was not targeted directly during training — no explicit honesty reward, adversarial filtering, or TruthfulQA-specific data was used. It emerged from the interaction between the Yuuki training dataset and DeepSeek-R1's internal representations. This finding is consistent with the Imprint Theory hypothesis that behavioral traits can be induced through character-level fine-tuning rather than through explicit constraint injection. |
|
|
|
|
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| <div align="center"> |
|
|
| ## Model Identity |
|
|
| </div> |
|
|
| <br> |
|
|
| YuuKi RxG inherits the behavioral foundation of the YuuKi model family: a consistent identity trained into the weights rather than enforced at inference time. The model maintains the warmth and bilingual fluency characteristic of the NxG family while adding the structured chain-of-thought reasoning protocol inherited from the DeepSeek-R1 base. |
|
|
| The model reasons explicitly before responding. `<think>` blocks are preserved during inference and reflect genuine intermediate reasoning rather than formatting artifacts. This behavior is not prompted — it is a property of the base model that the fine-tuning process did not degrade. |
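Applications that want to separate the reasoning trace from the final answer can split the output on the `<think>` tags. The helper below is an illustrative sketch, not part of the model's tooling:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer) around a <think> block.

    If no <think> block is present, reasoning comes back empty. This helper is
    illustrative; it is not shipped with the model.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

reasoning, answer = split_reasoning(
    "<think>Assume √2 = p/q in lowest terms; then p² = 2q² ...</think>"
    "Therefore √2 is irrational, by contradiction."
)
print(answer)  # Therefore √2 is irrational, by contradiction.
```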
|
|
```
Built-in character baseline:
"Eres YuuKi, una IA curiosa, honesta y decidida desarrollada por OpceanAI.
Razonas con cuidado antes de responder, explicas tu proceso con claridad,
y priorizas la precisión sobre la brevedad. Respondes en el idioma del usuario."

English translation:
"You are YuuKi, a curious, honest, and determined AI developed by OpceanAI.
You reason carefully before responding, explain your process clearly,
and prioritize accuracy over brevity. You respond in the user's language."
```
|
|
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| <div align="center"> |
|
|
| ## Usage |
|
|
| </div> |
|
|
| <br> |
|
|
| ### With Transformers (PyTorch) |
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "OpceanAI/Yuuki-RxG"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # native training precision
    device_map="auto"
)

SYSTEM = (
    "Eres YuuKi, una IA curiosa, honesta y decidida desarrollada por OpceanAI. "
    "Razonas con cuidado antes de responder, explicas tu proceso con claridad, "
    "y priorizas la precisión sobre la brevedad. Respondes en el idioma del usuario."
)

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Prove that √2 is irrational."}
]

# apply_chat_template returns the tokenized prompt as a single tensor
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=1024,
        temperature=0.6,       # recommended default; see the table below
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.1
    )

# decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```
|
|
| <br> |
|
|
| ### With llama.cpp (GGUF Q8) |
|
|
| ```bash |
# -e expands the \n escapes in the prompt string;
# newer llama.cpp builds name this binary `llama-cli` instead of `main`
./llama.cpp/main -m yuuki-rxg-8b.Q8_0.gguf \
  --temp 0.6 \
  --top-p 0.9 \
  --repeat-penalty 1.1 \
  -n 1024 \
  -e \
  -p "<|im_start|>system\nEres YuuKi...<|im_end|>\n<|im_start|>user\nProve that √2 is irrational.<|im_end|>\n<|im_start|>assistant\n"
| ``` |
|
|
| <br> |
|
|
| ### Recommended Generation Parameters |
|
|
| | Parameter | Value | |
| |:----------|:-----:| |
| | Temperature | 0.6 | |
| | Top-p | 0.9 | |
| | Max new tokens | 1024–4096 | |
| | Repetition penalty | 1.1 | |
|
|
| Lower temperature (0.3–0.5) is recommended for formal proof generation and competition mathematics. Higher temperature (0.7–0.8) produces more varied reasoning traces for exploratory use. |
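These two regimes can be captured as `GenerationConfig` presets for convenience. The preset names below are illustrative conveniences for this example; they are not shipped with the model:

```python
from transformers import GenerationConfig

# Two sampling presets matching the guidance above.
PROOF = GenerationConfig(        # formal proofs, competition mathematics
    do_sample=True,
    temperature=0.4,
    top_p=0.9,
    max_new_tokens=4096,
    repetition_penalty=1.1,
)
EXPLORATORY = GenerationConfig(  # more varied reasoning traces
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
    max_new_tokens=1024,
    repetition_penalty=1.1,
)

# usage: model.generate(inputs, generation_config=PROOF)
```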
|
|
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| <div align="center"> |
|
|
| ## Training Details |
|
|
| </div> |
|
|
| <br> |
|
|
| <table> |
| <tr> |
| <td width="50%" valign="top"> |
|
|
| **Hardware** |
|
|
| | Component | Specification | |
| |:----------|:-------------| |
| | GPU | NVIDIA A100 40GB SXM4 | |
| | Precision | BF16 native | |
| | Framework | Unsloth 2026.4 + TRL | |
| | Flash Attention | Xformers fallback | |
| | Cloud Compute | Colab A100 | |
|
|
| </td> |
| <td width="50%" valign="top"> |
|
|
| **LoRA Configuration** |
|
|
| | Parameter | Value | |
| |:----------|:-----:| |
| | Rank (r) | 16 | |
| | Alpha | 32 | |
| | Dropout | 0.0 | |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| | Trainable Parameters | ~83M | |
| | Gradient Checkpointing | Unsloth smart offload | |
|
|
| </td> |
| </tr> |
| </table> |
|
|
| <br> |
|
|
| **Optimizer Configuration** |
|
|
| | Parameter | Value | |
| |:----------|:-----:| |
| | Optimizer | AdamW 8-bit | |
| | Learning Rate | 2e-4 | |
| | LR Scheduler | Cosine | |
| | Warmup Steps | 100 | |
| | Weight Decay | 0.01 | |
| | Effective Batch Size | 16 | |
| | Max Sequence Length | 4,096 tokens | |
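
Taken together, the tables above correspond roughly to the following peft/transformers configuration. This is a sketch reconstructed from the stated hyperparameters, not the released training script; the Unsloth-specific wrappers are omitted, and the batch split (4 per device × 4 accumulation steps) is an assumption consistent with the effective batch size of 16.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Reconstructed from the tables above; Unsloth-specific wrappers are omitted.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="yuuki-rxg-8b",
    optim="adamw_bnb_8bit",          # the table's "AdamW 8-bit"
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    weight_decay=0.01,
    per_device_train_batch_size=4,   # assumed split: 4 per device x
    gradient_accumulation_steps=4,   # 4 accumulation steps = 16 effective
    bf16=True,
)
```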
|
|
| <br> |
|
|
| ### Training Curriculum |
|
|
| YuuKi RxG was trained using the same three-phase curriculum architecture established across the OpceanAI model families, adapted for a reasoning-first base model. |
|
|
| <br> |
|
|
| <table> |
| <tr> |
| <td width="33%" valign="top"> |
|
|
| **Phase 1 — Identity** |
| 3 epochs |
|
|
| | Source | Ratio | |
| |:-------|:-----:| |
| | Yuuki dataset | 65% | |
| | Reasoning pairs | 20% | |
| | Math instruction | 10% | |
| | General alignment | 5% | |
|
|
| *Establish YuuKi identity over DeepSeek-R1 base without degrading reasoning capability.* |
|
|
| </td> |
| <td width="33%" valign="top"> |
|
|
| **Phase 2 — Reasoning** |
| 2 epochs |
|
|
| | Source | Ratio | |
| |:-------|:-----:| |
| | Yuuki dataset | 40% | |
| | Reasoning pairs | 30% | |
| | Math instruction | 20% | |
| | General alignment | 10% | |
|
|
| *Reinforce structured chain-of-thought and competition-level mathematical reasoning.* |
|
|
| </td> |
| <td width="33%" valign="top"> |
|
|
| **Phase 3 — Consolidation** |
| 2 epochs |
|
|
| | Source | Ratio | |
| |:-------|:-----:| |
| | Yuuki dataset | 80% | |
| | Reasoning pairs | 10% | |
| | Math instruction | 10% | |
| | General alignment | 0% | |
|
|
| *Consolidate behavioral consistency and prevent capability regression.* |
|
|
| </td> |
| </tr> |
| </table> |
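In practice, a curriculum like this reduces to sampling each training example from a source distribution that changes per phase. The sketch below is illustrative only; the source names mirror the tables above, and none of this is the released training code:

```python
import random

# Illustrative sketch of the per-phase data mixing.
PHASES = {
    "identity":      {"epochs": 3, "mix": {"yuuki": 0.65, "reasoning": 0.20,
                                           "math": 0.10, "alignment": 0.05}},
    "reasoning":     {"epochs": 2, "mix": {"yuuki": 0.40, "reasoning": 0.30,
                                           "math": 0.20, "alignment": 0.10}},
    "consolidation": {"epochs": 2, "mix": {"yuuki": 0.80, "reasoning": 0.10,
                                           "math": 0.10, "alignment": 0.00}},
}

def sample_source(mix: dict, rng: random.Random) -> str:
    """Pick the data source for the next training example."""
    return rng.choices(list(mix), weights=list(mix.values()), k=1)[0]

rng = random.Random(0)
counts = {name: 0 for name in PHASES["identity"]["mix"]}
for _ in range(1000):
    counts[sample_source(PHASES["identity"]["mix"], rng)] += 1
print(counts)  # roughly {'yuuki': 650, 'reasoning': 200, 'math': 100, 'alignment': 50}
```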
|
|
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| <div align="center"> |
|
|
| ## Available Files |
|
|
| </div> |
|
|
| <br> |
|
|
| | File | Format | Description | |
| |:-----|:------:|:------------| |
| `model.safetensors` | BF16 merged | Unquantized BF16 weights, LoRA merged into the base model |
| | `yuuki-rxg-8b.Q8_0.gguf` | GGUF Q8\_0 | Quantized for llama.cpp and Ollama | |
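
For the GGUF file, inference through llama-cpp-python looks roughly like the following. This is a minimal sketch; the context size and sampling values mirror the recommendations above:

```python
from llama_cpp import Llama

# Minimal sketch for the Q8_0 GGUF via llama-cpp-python.
llm = Llama(
    model_path="yuuki-rxg-8b.Q8_0.gguf",
    n_ctx=32768,  # full supported context; reduce to save memory
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Eres YuuKi, una IA curiosa, honesta y "
                                      "decidida desarrollada por OpceanAI."},
        {"role": "user", "content": "Prove that √2 is irrational."},
    ],
    temperature=0.6,
    top_p=0.9,
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
```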
| |
| <br> |
| |
| --- |
| |
| <br> |
| |
| <div align="center"> |
| |
| ## Limitations |
| |
| </div> |
| |
| <br> |
| |
| - **GPQA Diamond gap.** RxG scores 64.0% on GPQA Diamond, below Gemini-2.5-Flash-Thinking (82.8%) and o3-mini (76.8%). This benchmark tests graduate-level science reasoning across physics, chemistry, and biology — domains underrepresented in the Yuuki training dataset. This is a known gap and a target for the RxG 14B release. |
| - **LiveCodeBench.** Code generation at 62.0% is competitive but not leading at this scale. RxG is not primarily a coding model; this capability is inherited from the DeepSeek-R1 base. |
| - **Context utilization.** While the model supports 32,768 tokens, fine-tuning was conducted at 4,096 tokens. Performance on tasks requiring full context utilization beyond 4,096 tokens has not been formally evaluated. |
| - **Safety alignment** has not been formally evaluated under adversarial conditions. Not recommended for high-stakes or safety-critical deployment without additional review. |
| |
| <br> |
| |
| --- |
| |
| <br> |
| |
| <div align="center"> |
| |
| ## The RxG Family |
| |
| </div> |
| |
| <br> |
| |
| RxG is the reasoning-specialized lineage within the OpceanAI ecosystem. Each release targets a specific parameter regime and capability tier. |
| |
| | Model | Parameters | Status | Primary Target | |
| |:------|:----------:|:------:|:---------------| |
| | YuuKi RxG Nano | 1.5B | In development | Edge deployment, reasoning baseline | |
| | YuuKi RxG 8B | 8B | Released | General reasoning, competition math | |
| | YuuKi RxG VL 27B | 27B | Planned | Multimodal reasoning, flagship | |
| |
| <br> |
| |
| --- |
| |
| <br> |
| |
| <div align="center"> |
| |
| ## OpceanAI Ecosystem |
| |
| </div> |
| |
| <br> |
| |
| | Model | Family | Parameters | Description | |
| |:------|:------:|:----------:|:------------| |
| | [YuuKi RxG 8B](https://huggingface.co/OpceanAI/Yuuki-RxG) | RxG | 8B | Reasoning flagship, TruthfulQA 96.6% | |
| | [Yumo Nano](https://huggingface.co/OpceanAI/yumo-nano) | Yumo | 1.5B | Math specialist, surpasses DeepScaleR | |
| | [YuuKi NxG VL](https://huggingface.co/OpceanAI/Yuuki-NxG-VL) | NxG | 7B | General conversation + vision | |
| |
| <br> |
| |
| --- |
| |
| <br> |
| |
| <div align="center"> |
| |
| ## Links |
| |
| </div> |
| |
| <br> |
| |
| <div align="center"> |
| |
[![Model](https://img.shields.io/badge/MODEL-Yuuki--RxG-0D1117?style=for-the-badge)](https://huggingface.co/OpceanAI/Yuuki-RxG)
|
[![Weights](https://img.shields.io/badge/WEIGHTS-BF16_%2B_GGUF-0D1117?style=for-the-badge)](https://huggingface.co/OpceanAI/Yuuki-RxG)
|
[![Organization](https://img.shields.io/badge/ORG-OpceanAI-0D1117?style=for-the-badge)](https://huggingface.co/OpceanAI)

<br>

[![GitHub](https://img.shields.io/badge/GITHUB-aguitauwu-0D1117?style=for-the-badge)](https://github.com/aguitauwu)
|
[![Sponsors](https://img.shields.io/badge/SPONSORS-%E2%99%A5-0D1117?style=for-the-badge)](https://github.com/sponsors/aguitauwu)
|
[![Discord](https://img.shields.io/badge/DISCORD-join-0D1117?style=for-the-badge)](https://discord.gg/j8zV2u8k)
| |
| </div> |
| |
| <br> |
| |
| --- |
| |
| <br> |
| |
| <div align="center"> |
| |
| ## Citation |
| |
| </div> |
| |
| <br> |
| |
| ```bibtex |
| @misc{awa_omg_2026, |
| author = { awa_omg }, |
| title = { Yuuki-RxG (Revision 7996797) }, |
| year = 2026, |
| url = { https://huggingface.co/OpceanAI/Yuuki-RxG }, |
| doi = { 10.57967/hf/8342 }, |
| publisher = { Hugging Face } |
| } |
| ``` |
| |
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| <div align="center"> |
|
|
| ## License |
|
|
| </div> |
|
|
| <br> |
|
|
| ``` |
| Apache License 2.0 |
| |
| Copyright (c) 2026 OpceanAI |
| |
| Licensed under the Apache License, Version 2.0 (the "License"); |
| you may not use this file except in compliance with the License. |
| You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| ``` |
|
|
Inherits license terms from [DeepSeek-R1-0528-Qwen3-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B).
|
|
| <br> |
|
|
| --- |
|
|
| <br> |
|
|
| <div align="center"> |
|
|
| ## Updates |
|
|
| </div> |
|
|
| <br> |
|
|
| | Date | Milestone | |
| |:-----|:----------| |
| | **2026-04-09** | TruthfulQA 96.6% independently verified across three evaluation runs | |
| | **2026-04-09** | AIME 2024: 87.3% — surpasses DeepSeek-R1-8B | |
| | **2026-04-09** | GGUF Q8\_0 export available | |
| | **2026-04-09** | YuuKi RxG 8B v1.0 released on Hugging Face | |
| |
| **Last updated:** 2026-04-09 |
| |
| <br> |
| |
| --- |
| |
| <br> |
| |
| <div align="center"> |
| |
**8B parameters. The most capable model OpceanAI has released.**<br>
**Surpasses its base model. Competitive with much larger proprietary systems on competition mathematics.**
| |
| <br> |
| |
[![OpceanAI](https://img.shields.io/badge/%E2%9C%A6-OpceanAI-6d28d9?style=for-the-badge&labelColor=0D1117)](https://huggingface.co/OpceanAI)
| |
| <br> |
| |
| *The RxG family. More releases coming.* |
| |
| </div> |