wangbuer999
replied to their post about 2 months ago
Ran it locally with diffusers==0.26.3 + CUDA 12.1. The weights are in the link I posted; just click through and you can grab them.
Post
2647
HunyuanImage 3.0-Instruct just dropped
Fresh open-source Image 3.0 model! Spent 20 minutes testing it on a Messi + retro scrambler fusion case.
Ran on diffusers v0.26.3 + CUDA 12.1 | 80B MoE params (13B activated) | zero VRAM issues
strength=0.9: Messi #10 kit/tattoo sharp, but the moto's rusted-metal texture blurred (classic open-source pain)
strength=0.7: moto/cobblestone background crisp, but Messi's jersey details faded completely
strength=0.75 + prompt "Blend seamlessly, keep all original details": both subject & background sharp
No ControlNet, no manual masking: the model's chain-of-thought reasoning parses image + prompt first
Already outperforms Qwen-Image-Edit 2511 (GSB eval +25.7% on single-image edits) | 100% open-source
👉 Repo: https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct
👉 Technical report: https://arxiv.org/abs/2509.23951
Anyone else struggled with strength tweaks for fusion? This fixed it for my Messi + moto case; did it work as well for yours?
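The strength sweep above boils down to a single img2img call. Here is a minimal sketch using diffusers' generic `AutoPipelineForImage2Image`; note that the pipeline class, hub id, and every setting except the strength/prompt combo reported in the post are assumptions, not confirmed details of the official HunyuanImage 3.0 release.

```python
# Minimal sketch of the img2img strength sweep described above. The pipeline
# class and hub id are assumptions; swap in the actual HunyuanImage pipeline.

def fusion_settings(strength: float = 0.75) -> dict:
    """Settings that worked for the Messi + moto fusion case in the post."""
    return {
        "prompt": "Blend seamlessly, keep all original details",
        "strength": strength,  # 0.9 blurred the background, 0.7 faded the subject
    }

def run_edit(image_path: str, settings: dict):
    # Heavy dependencies imported lazily so fusion_settings stays standalone.
    import torch
    from diffusers import AutoPipelineForImage2Image
    from diffusers.utils import load_image

    pipe = AutoPipelineForImage2Image.from_pretrained(
        "tencent/HunyuanImage-3.0-Instruct",  # hypothetical hub id
        torch_dtype=torch.bfloat16,
    ).to("cuda")
    return pipe(image=load_image(image_path), **settings).images[0]
```

The point is that the whole fix lives in `fusion_settings`: one strength value plus one prompt line, with no ControlNet or mask in the loop.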
reacted to Benedictat's post with 🔥 2 months ago
Post
2762
Tencent HunyuanImage 3.0-Instruct is seriously impressive
It skyrocketed to 2nd place globally on the LMArena leaderboard, trailing only Google Nano-banana Pro.
What excites me most is its newly launched image editing and multi-image fusion capabilities.
Its semantic understanding is rock-solid; this instruction-following capability basically enables one-sentence end-to-end workflows, delivering a step-change in efficiency.
Frankly, it nails the pain points of frontline creators: old photo restoration, text modification, even extracting people from multiple images to create group shots. Previously, tweaking the fusion quality took tons of effort, but now the out-of-the-box realism and emotional expression are top-tier, with zero cheap AI artifacts.
👉 Repo: https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct
👉 Technical report: https://arxiv.org/abs/2509.23951
reacted to kelsend's post with 🔥 2 months ago
Post
2642
I'm absolutely stunned by the aesthetics of HunyuanImage-3.0
The visual effects of this model are simply beyond imagination; it's every bit as good as NanoBanana, no compromise at all.
I fine-tuned my micro-scene prompts by adding text overlays and background effects, and its adaptability is truly breathtaking. With just one prompt, you can generate scene posters for any movie or novel.
Every detail, from scene design to text style and atmospheric effects, perfectly aligns with the tone of the original material.
No forced elements, just seamless, film-grade visual effects that exactly match what I envisioned.
👉 Repo: https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=Hunyuan-Image-3.0-Instruct
posted an update 2 months ago
Post
2647
HunyuanImage 3.0-Instruct just dropped
reacted to zc277584121's post with 🔥 2 months ago
Post
3166
We've open-sourced a bilingual Semantic Highlighting model that can power multiple production scenarios:
1) RAG Answer Highlighting — Automatically highlight the exact sentences that answer user queries, improving interpretability and helping users quickly locate relevant information.
2) RAG Noise Filtering — Prune irrelevant context before sending to LLMs, achieving 70-80% token cost reduction while improving answer quality by letting the model focus on what matters.
3) Search System Highlighting — Add semantic highlighting features to recommendation systems, e-commerce search, or any retrieval system where users need to see why a result is relevant.
Try it out: zilliz/semantic-highlight-bilingual-v1
Read our article: https://huggingface.co/blog/zilliz/zilliz-semantic-highlight-model
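Scenario 2 above is essentially "score each context sentence against the query, drop the low scorers before building the LLM prompt." A toy sketch of that shape, using a crude keyword-overlap scorer as a stand-in for the actual highlight model (the sentences, threshold, and scorer are all illustrative):

```python
# Toy sketch of RAG noise filtering: score each context sentence against the
# query and keep only the relevant ones before prompting the LLM. The
# keyword-overlap scorer is a crude stand-in for the real highlight model.

def relevance(query: str, sentence: str) -> float:
    """Fraction of query words appearing in the sentence (placeholder score)."""
    strip = ".,?!"
    q = {w.strip(strip) for w in query.lower().split()}
    s = {w.strip(strip) for w in sentence.lower().split()}
    return len(q & s) / len(q) if q else 0.0

def filter_context(query: str, sentences: list[str], threshold: float = 0.2) -> list[str]:
    """Drop sentences scoring below the threshold to cut prompt tokens."""
    return [x for x in sentences if relevance(query, x) >= threshold]

context = [
    "Milvus is an open-source vector database.",
    "The weather in Paris was mild last week.",
    "Vector databases power semantic search and RAG.",
]
kept = filter_context("What is a vector database?", context)
print(kept)  # the off-topic weather sentence is filtered out
```

Swapping the placeholder scorer for the semantic-highlight model is what turns this from keyword matching into the token-saving filter described in the post.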
Post
3223
HY-MT1.5-1.8B Lightweight Translation Model: an Open-Source Game-Changer
Tencent raised the bar for lightweight translation!
Supports bidirectional translation across 36 languages total—33 mainstream languages + 5 ethnic/minority dialects
With only 1.8B parameters (less than 1/3 the size of HY-MT1.5-7B), it delivers performance on par with the 7B counterpart and outperforms most commercial translation APIs.
✅ Quantized versions (FP8/GPTQ-Int4) available for edge device deployment, perfect for real-time translation
✅ Full support for terminology intervention, context-aware translation, and formatted output
✅ Ready-to-use prompt templates + seamless integration with Hugging Face Transformers
✅ Recommended transformers ≥ 4.56.0 (FP8 model requires compressed-tensors 0.11.0)
10+ Hugging Face Spaces have already integrated this model!
👉 Model Repo: tencent/HY-MT1.5-1.8B
👉 Technical Report: https://arxiv.org/abs/2512.24092
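For the "seamless integration with Hugging Face Transformers" bullet, a minimal loading sketch might look like the following. The repo id comes from the post; the prompt wording is purely illustrative and stands in for the ready-made templates on the model card, so treat everything except the repo id as an assumption.

```python
# Sketch of running the model with Hugging Face Transformers (>= 4.56.0 per
# the post). The prompt wording is illustrative only; use the ready-made
# templates from the model card in practice.

def make_prompt(text: str, src: str, tgt: str) -> str:
    """Illustrative translation prompt (stand-in for the official template)."""
    return f"Translate the following {src} text into {tgt}:\n\n{text}"

def translate(text: str, src: str = "Chinese", tgt: str = "English") -> str:
    # Heavy dependencies imported lazily so make_prompt stays standalone.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "tencent/HY-MT1.5-1.8B"  # repo id from the post
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    messages = [{"role": "user", "content": make_prompt(text, src, tgt)}]
    inputs = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    return tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
```

At 1.8B parameters (or the FP8/GPTQ-Int4 variants), this fits comfortably on a single consumer GPU, which is the whole edge-deployment pitch.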
posted an update 2 months ago
Post
3223
HY-MT1.5-1.8B Lightweight Translation Model: an Open-Source Game-Changer
Post
3171
Qwen-Image-Edit LoRA: 96 Camera Angles for 3D-Consistent Image Tweaks
fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA levels up perspective editing
96 poses (4 elevations × 8 azimuths × 3 distances) – close-ups, wide shots, all angles covered
Trained on 3000+ Gaussian Splatting renders – 3D consistency holds even for -30° low-angle shots
Works with Qwen/Qwen-Image-Edit-2511 base models (LoRA strength 0.8-1.0) + ComfyUI workflow included
Tested it – plug-and-play, no fussy setup.
fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA
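The 96-pose grid is just the Cartesian product of the three axes (4 elevations × 8 azimuths × 3 distances). A quick sketch of enumerating it; the concrete angle values and distance names below are illustrative placeholders, not the LoRA's official trigger terms:

```python
from itertools import product

# 96 poses = 4 elevations × 8 azimuths × 3 distances. The concrete values
# below are illustrative placeholders, not the LoRA's official trigger terms.
elevations = [-30, 0, 30, 60]              # degrees; includes the -30° low angle
azimuths = [i * 45 for i in range(8)]      # 0°..315° in 45° steps
distances = ["close-up", "medium", "wide"]

poses = [
    {"elevation": e, "azimuth": a, "distance": d}
    for e, a, d in product(elevations, azimuths, distances)
]
print(len(poses))  # → 96
```

Iterating this list against one input image is an easy way to batch-render all 96 views and eyeball the 3D consistency for yourself.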
posted an update 2 months ago
Post
3171
Qwen-Image-Edit LoRA: 96 Camera Angles for 3D-Consistent Image Tweaks
reacted to sergiopaniego's post with 🔥 3 months ago
Post
517
As the year comes to an end, it’s a good moment to catch up on some of the best long-form pieces published by the @huggingface team.
I’ve gathered them all here if you want to read or save them for later:
https://huggingface.co/collections/sergiopaniego/research-and-long-form-blog-posts