wangbuer999
replied to their post about 2 months ago
Ran it locally with diffusers==0.26.3 + CUDA 12.1. The weights are in the link I posted; just click through and you can grab them.
Post
2647
HunyuanImage 3.0-Instruct just dropped
Fresh open-source Image 3.0 model! Spent 20 minutes testing it on a Messi + retro scrambler fusion case.
Ran on diffusers v0.26.3 + CUDA 12.1 | 80B MoE params (13B activated) | zero VRAM issues
strength=0.9: Messi #10 kit/tattoo sharp, but the moto's rusted-metal texture blurred (classic open-source pain)
strength=0.7: moto/cobblestone background crisp, but Messi's jersey details faded completely
strength=0.75 + prompt "Blend seamlessly, keep all original details": both subject & background sharp
No ControlNet, no manual masking: the model's chain-of-thought reasoning parses image + prompt first
Already outperforms Qwen-Image-Edit 2511 (GSB eval +25.7% on single-image edits) | 100% open-source
👉 Repo: https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct
👉 Technical report: https://arxiv.org/abs/2509.23951
Anyone else struggled with strength tweaks for fusion? This fixed it for my Messi + moto case; did it work as well for yours?
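The strength sweep above boils down to a single img2img call. Here is a minimal sketch using diffusers' generic `AutoPipelineForImage2Image`; note that the pipeline class, hub id, and every setting except the strength/prompt combo reported in the post are assumptions, not confirmed details of the official HunyuanImage 3.0 release.

```python
# Minimal sketch of the img2img strength sweep described above. The pipeline
# class and hub id are assumptions; swap in the actual HunyuanImage pipeline.

def fusion_settings(strength: float = 0.75) -> dict:
    """Settings that worked for the Messi + moto fusion case in the post."""
    return {
        "prompt": "Blend seamlessly, keep all original details",
        "strength": strength,  # 0.9 blurred the background, 0.7 faded the subject
    }

def run_edit(image_path: str, settings: dict):
    # Heavy dependencies imported lazily so fusion_settings stays standalone.
    import torch
    from diffusers import AutoPipelineForImage2Image
    from diffusers.utils import load_image

    pipe = AutoPipelineForImage2Image.from_pretrained(
        "tencent/HunyuanImage-3.0-Instruct",  # hypothetical hub id
        torch_dtype=torch.bfloat16,
    ).to("cuda")
    return pipe(image=load_image(image_path), **settings).images[0]
```

The point is that the whole fix lives in `fusion_settings`: one strength value plus one prompt line, with no ControlNet or mask in the loop.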
reacted to Benedictat's post with 🔥 2 months ago
Post
2762
Tencent HunyuanImage 3.0-Instruct is seriously impressive
It skyrocketed to 2nd place globally on the LMArena leaderboard, trailing only Google Nano-banana Pro.
What excites me most is its newly launched image editing and multi-image fusion capabilities.
Its semantic understanding is rock-solid; this instruction-following capability basically enables one-sentence end-to-end workflows, delivering a step-change in efficiency.
Frankly, it nails the pain points of frontline creators: old photo restoration, text modification, even extracting people from multiple images to create group shots. Previously, tweaking the fusion quality took tons of effort, but now the out-of-the-box realism and emotional expression are top-tier, with zero cheap AI artifacts.
👉 Repo: https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct
👉 Technical report: https://arxiv.org/abs/2509.23951
reacted to kelsend's post with 🔥 2 months ago
Post
2642
I'm absolutely stunned by the aesthetics of HunyuanImage-3.0
The visual effects of this model are simply beyond imagination; it's every bit as good as NanoBanana, no compromise at all.
I fine-tuned my micro-scene prompts by adding text overlays and background effects, and its adaptability is truly breathtaking. With just one prompt, you can generate scene posters for any movie or novel.
Every detail, from scene design to text style and atmospheric effects, perfectly aligns with the tone of the original material.
No forced elements, just seamless, film-grade visual effects that exactly match what I envisioned.
👉 Repo: https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct
posted an update 2 months ago
Post
2647
HunyuanImage 3.0-Instruct just dropped
reacted to zc277584121's post with 🔥 2 months ago
Post
3166
We've open-sourced a bilingual Semantic Highlighting model that can power multiple production scenarios:
1) RAG Answer Highlighting — Automatically highlight the exact sentences that answer user queries, improving interpretability and helping users quickly locate relevant information.
2) RAG Noise Filtering — Prune irrelevant context before sending to LLMs, achieving 70-80% token cost reduction while improving answer quality by letting the model focus on what matters.
3) Search System Highlighting — Add semantic highlighting features to recommendation systems, e-commerce search, or any retrieval system where users need to see why a result is relevant.
Try it out: zilliz/semantic-highlight-bilingual-v1
Read our article: https://huggingface.co/blog/zilliz/zilliz-semantic-highlight-model
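Scenario 2 above is essentially "score each context sentence against the query, drop the low scorers before building the LLM prompt." A toy sketch of that shape, using a crude keyword-overlap scorer as a stand-in for the actual highlight model (the sentences, threshold, and scorer are all illustrative):

```python
# Toy sketch of RAG noise filtering: score each context sentence against the
# query and keep only the relevant ones before prompting the LLM. The
# keyword-overlap scorer is a crude stand-in for the real highlight model.

def relevance(query: str, sentence: str) -> float:
    """Fraction of query words appearing in the sentence (placeholder score)."""
    strip = ".,?!"
    q = {w.strip(strip) for w in query.lower().split()}
    s = {w.strip(strip) for w in sentence.lower().split()}
    return len(q & s) / len(q) if q else 0.0

def filter_context(query: str, sentences: list[str], threshold: float = 0.2) -> list[str]:
    """Drop sentences scoring below the threshold to cut prompt tokens."""
    return [x for x in sentences if relevance(query, x) >= threshold]

context = [
    "Milvus is an open-source vector database.",
    "The weather in Paris was mild last week.",
    "Vector databases power semantic search and RAG.",
]
kept = filter_context("What is a vector database?", context)
print(kept)  # the off-topic weather sentence is filtered out
```

Swapping the placeholder scorer for the semantic-highlight model is what turns this from keyword matching into the token-saving filter described in the post.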
Post
3223
HY-MT1.5-1.8B Lightweight Translation Model: an Open-Source Game-Changer
Tencent raised the bar for lightweight translation!
Supports bidirectional translation across 36 languages total—33 mainstream languages + 5 ethnic/minority dialects
With only 1.8B parameters (less than 1/3 the size of HY-MT1.5-7B), it delivers performance on par with the 7B counterpart and outperforms most commercial translation APIs.
✅ Quantized versions (FP8/GPTQ-Int4) available for edge device deployment, perfect for real-time translation
✅ Full support for terminology intervention, context-aware translation, and formatted output
✅ Ready-to-use prompt templates + seamless integration with Hugging Face Transformers
✅ Recommended transformers ≥ 4.56.0 (FP8 model requires compressed-tensors 0.11.0)
10+ Hugging Face Spaces have already integrated this model!
👉 Model Repo: tencent/HY-MT1.5-1.8B
👉 Technical Report: https://arxiv.org/abs/2512.24092
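For the "seamless integration with Hugging Face Transformers" bullet, a minimal loading sketch might look like the following. The repo id comes from the post; the prompt wording is purely illustrative and stands in for the ready-made templates on the model card, so treat everything except the repo id as an assumption.

```python
# Sketch of running the model with Hugging Face Transformers (>= 4.56.0 per
# the post). The prompt wording is illustrative only; use the ready-made
# templates from the model card in practice.

def make_prompt(text: str, src: str, tgt: str) -> str:
    """Illustrative translation prompt (stand-in for the official template)."""
    return f"Translate the following {src} text into {tgt}:\n\n{text}"

def translate(text: str, src: str = "Chinese", tgt: str = "English") -> str:
    # Heavy dependencies imported lazily so make_prompt stays standalone.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "tencent/HY-MT1.5-1.8B"  # repo id from the post
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    messages = [{"role": "user", "content": make_prompt(text, src, tgt)}]
    inputs = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    return tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
```

At 1.8B parameters (or the FP8/GPTQ-Int4 variants), this fits comfortably on a single consumer GPU, which is the whole edge-deployment pitch.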
posted an update 2 months ago
Post
3223
HY-MT1.5-1.8B Lightweight Translation Model: an Open-Source Game-Changer
Post
3171
Qwen-Image-Edit LoRA: 96 Camera Angles for 3D-Consistent Image Tweaks
fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA levels up perspective editing
96 poses (4 elevations × 8 azimuths × 3 distances) – close-ups, wide shots, all angles covered
Trained on 3000+ Gaussian Splatting renders – 3D consistency holds even for -30° low-angle shots
Works with Qwen/Qwen-Image-Edit-2511 base models (LoRA strength 0.8-1.0) + ComfyUI workflow included
Tested it – plug-and-play, no fussy setup.
fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA
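The 96-pose grid is just the Cartesian product of the three axes (4 elevations × 8 azimuths × 3 distances). A quick sketch of enumerating it; the concrete angle values and distance names below are illustrative placeholders, not the LoRA's official trigger terms:

```python
from itertools import product

# 96 poses = 4 elevations × 8 azimuths × 3 distances. The concrete values
# below are illustrative placeholders, not the LoRA's official trigger terms.
elevations = [-30, 0, 30, 60]              # degrees; includes the -30° low angle
azimuths = [i * 45 for i in range(8)]      # 0°..315° in 45° steps
distances = ["close-up", "medium", "wide"]

poses = [
    {"elevation": e, "azimuth": a, "distance": d}
    for e, a, d in product(elevations, azimuths, distances)
]
print(len(poses))  # → 96
```

Iterating this list against one input image is an easy way to batch-render all 96 views and eyeball the 3D consistency for yourself.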
posted an update 2 months ago
Post
3171
Qwen-Image-Edit LoRA: 96 Camera Angles for 3D-Consistent Image Tweaks
reacted to sergiopaniego's post with 🔥 3 months ago
Post
517
As the year comes to an end, it’s a good moment to catch up on some of the best long-form pieces published by the @huggingface team.
I’ve gathered them all here if you want to read or save them for later:
https://huggingface.co/collections/sergiopaniego/research-and-long-form-blog-posts