Micko
Chovus
AI & ML interests
None yet
Recent Activity
View all activity
Organizations
None yet
reacted to BlueNipples's post with π 3 days ago
upvoted an article 10 days ago
Article
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
β’
45
upvoted a changelog 19 days ago
Hugging Face Changelog
Spaces agents.md for your coding agents
β’ 247
reacted to prithivMLmods's post with β€οΈ 25 days ago
Post
6193
A new comparator on Spaces showcases Standard FLUX.2 Decoder vs. FLUX.2 Small Decoder. The Small Decoder is ~1.4Γ faster, uses ~1.4Γ less VRAM, and maintains near-identical image quality. It has ~28M parameters with narrower channels [96, 192, 384, 384] vs. [128, 256, 512, 512], and the demo supports sequence generation by running both decoders simultaneously and comparing the results side by side.
π€ Comparator: prithivMLmods/Flux.2-4B-Decoder-Comparator
π FLUX.2-small-decoder: black-forest-labs/FLUX.2-small-decoder
π GitHub: https://github.com/PRITHIVSAKTHIUR/Flux.2-4B-Encoder-Comparator
π Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
π€ > App built on the Gradio SDK. To learn more, visit the app page or the respective model pages.
π€ Comparator: prithivMLmods/Flux.2-4B-Decoder-Comparator
π FLUX.2-small-decoder: black-forest-labs/FLUX.2-small-decoder
π GitHub: https://github.com/PRITHIVSAKTHIUR/Flux.2-4B-Encoder-Comparator
π Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
π€ > App built on the Gradio SDK. To learn more, visit the app page or the respective model pages.
Workflow : FLF2V - First-Last Frame & First-Middle-Last Frame
β€οΈπ 9
31
#17 opened 2 months ago
by
RuneXX
reacted to NJX-njx's post with π₯ 2 months ago
Post
7520
Recently, I have open-sourced an AI emotional companion product based on openclaw, called opensoul.
On this platform, you can create a "soulmate" that matches your personality, and configure it with the skills, tools you want it to have, as well as the platforms it can integrate with (such as Telegram, Discord, etc.).
You can even create group chats, invite multiple agents and your friends to chat about recent events, discuss projects together, and so on.
On the one hand, I hope it can better accompany you in daily life by virtue of its unique memory mechanism, self-feedback and iteration mechanism, and the modeling of users' emotions. On the other hand, I also hope it can help you better handle your work with its unique skills, tools and ability to deal with complex task scenarios.
Although the entire product has taken shape, I think there are still many areas that need adjustment and optimization. I also hope to rely on the strength of the community to do a good job in AI emotional companionship.
This is the project introduction URL: https://opensoul-web.vercel.app
This is the GitHub project URL: https://github.com/NJX-njx/opensoul
@AdinaY @lilianweng@burtenshaw@clem
let's just do it
On this platform, you can create a "soulmate" that matches your personality, and configure it with the skills, tools you want it to have, as well as the platforms it can integrate with (such as Telegram, Discord, etc.).
You can even create group chats, invite multiple agents and your friends to chat about recent events, discuss projects together, and so on.
On the one hand, I hope it can better accompany you in daily life by virtue of its unique memory mechanism, self-feedback and iteration mechanism, and the modeling of users' emotions. On the other hand, I also hope it can help you better handle your work with its unique skills, tools and ability to deal with complex task scenarios.
Although the entire product has taken shape, I think there are still many areas that need adjustment and optimization. I also hope to rely on the strength of the community to do a good job in AI emotional companionship.
This is the project introduction URL: https://opensoul-web.vercel.app
This is the GitHub project URL: https://github.com/NJX-njx/opensoul
@AdinaY @lilianweng@burtenshaw@clem
let's just do it
upvoted an article 3 months ago
Article
GGML and llama.cpp join HF to ensure the long-term progress of Local AI


- +4
β’
505
reacted to marksverdhei's post with π 3 months ago
Post
4595
Poll: Will 2026 be the year of subquadratic attention?
The transformer architecture is cursed by its computational complexity.
It is why you run out of tokens and have to compact. But some would argue that this is a feature not a bug and that this is also why these models are so good. We've been doing a lot of research on trying to make equally good models that are computationally cheaper, But so far, none of the approaches have stood the test of time. Or so it seems.
Please vote, don't be shy. Remember that the Dunning-Kruger effect is very real, so the person who knows less about transformers than you is going to vote. We want everyone's opinion, no matter confidence.
π if you think at least one frontier model* will have no O(n^2) attention by the end of 2026
π₯ If you disagree
* Frontier models - models that match / outperform the flagship claude, gemini or chatgpt at the time on multiple popular benchmarks
The transformer architecture is cursed by its computational complexity.
It is why you run out of tokens and have to compact. But some would argue that this is a feature not a bug and that this is also why these models are so good. We've been doing a lot of research on trying to make equally good models that are computationally cheaper, But so far, none of the approaches have stood the test of time. Or so it seems.
Please vote, don't be shy. Remember that the Dunning-Kruger effect is very real, so the person who knows less about transformers than you is going to vote. We want everyone's opinion, no matter confidence.
π if you think at least one frontier model* will have no O(n^2) attention by the end of 2026
π₯ If you disagree
* Frontier models - models that match / outperform the flagship claude, gemini or chatgpt at the time on multiple popular benchmarks
reacted to mitkox's post with π 3 months ago
Post
4819
I just pushed Claude Code Agent Swarm with 20 coding agents on my desktop GPU workstation.
With local AI, I donβt have /fast CC switch, but I have /absurdlyfast:
- 100β499 tokens/second read, yeah 100k, not a typo | 811 tok/sec generation
- KV cache: 707β200 tokens
- Hardware: 5+ year old GPUs 4xA6K gen1; Itβs not the car. Itβs the driver.
Qwen3 Coder Next AWQ with cache at BF16. Scores 82.1% in C# on 29-years-in-dev codebase vs Opus 4.5 at only 57.5%. When your codebase predates Stack Overflow, you don't need the biggest model; you need the one that actually remembers Windows 95.
My current bottleneck is my 27" monitor. Can't fit all 20 Theos on screen without squinting.
With local AI, I donβt have /fast CC switch, but I have /absurdlyfast:
- 100β499 tokens/second read, yeah 100k, not a typo | 811 tok/sec generation
- KV cache: 707β200 tokens
- Hardware: 5+ year old GPUs 4xA6K gen1; Itβs not the car. Itβs the driver.
Qwen3 Coder Next AWQ with cache at BF16. Scores 82.1% in C# on 29-years-in-dev codebase vs Opus 4.5 at only 57.5%. When your codebase predates Stack Overflow, you don't need the biggest model; you need the one that actually remembers Windows 95.
My current bottleneck is my 27" monitor. Can't fit all 20 Theos on screen without squinting.
reacted to DavidAU's post with π₯ 4 months ago
Post
9783
SAVANT COMMANDER: 48B-A4B , 256k Context, GATED MOE.
I am going to showcase some other people's tuning work, that I have put into a GATED Distill MOE (Qwen3) ; 256 K context. Special thanks to all the tuners (listed in the model tree and repo page with special shoutout to "TeichAI" - using Unsloth for a lot of the Distills in this model):
Savant Commander is a specialized MOE model that allows you to control which expert(s) (of 12) are assigned to your use case(s) / prompt(s) ... directly (by name(s)), as opposed to having the "choices" made for you.
The model is composed of 12 DISTILLS (compressed 12x4B MOE) of top closed (GPT5.1, OpenAI 120 GPT Oss, Gemini (3), Claude (2) ) and open source models (Kimi, GLM, Deepseek, Command-A, JanV1 ) all in one.
256k Context, 2 experts activated.
PS: There is also a "heretic" / "decensored" version too ; listed on this model page.
DavidAU/Qwen3-48B-A4B-Savant-Commander-GATED-12x-Closed-Open-Source-Distill-GGUF
I am going to showcase some other people's tuning work, that I have put into a GATED Distill MOE (Qwen3) ; 256 K context. Special thanks to all the tuners (listed in the model tree and repo page with special shoutout to "TeichAI" - using Unsloth for a lot of the Distills in this model):
Savant Commander is a specialized MOE model that allows you to control which expert(s) (of 12) are assigned to your use case(s) / prompt(s) ... directly (by name(s)), as opposed to having the "choices" made for you.
The model is composed of 12 DISTILLS (compressed 12x4B MOE) of top closed (GPT5.1, OpenAI 120 GPT Oss, Gemini (3), Claude (2) ) and open source models (Kimi, GLM, Deepseek, Command-A, JanV1 ) all in one.
256k Context, 2 experts activated.
PS: There is also a "heretic" / "decensored" version too ; listed on this model page.
DavidAU/Qwen3-48B-A4B-Savant-Commander-GATED-12x-Closed-Open-Source-Distill-GGUF
upvoted an article 5 months ago
Article
Transformers v5: Simple model definitions powering the AI ecosystem


- +2
β’
310