Combine models to improve reasoning
under review
Endre
Here's a tweet about DeepClaude: https://x.com/Saboo_Shubham_/status/1885167873615945893
It combines Claude Sonnet 3.5 with DeepSeek R1's chain-of-thought (CoT) reasoning and reportedly outperforms OpenAI o1, DeepSeek R1, and Claude Sonnet 3.5 on their own. (You need to set your own Anthropic and DeepSeek API keys first.)
You could do practically the same once the R1 model arrives in Merlin. And I bet you could improve on it. 😏
S K
How the heck does that even work? Does it bounce responses between the bots in some kind of coordinated conversation before responding to the user?
Endre
S K I guess it uses these models as agents, so yes, one model's output gets processed by the other. I have no other idea. :)
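For what it's worth, here is a minimal Python sketch of the flow guessed at above: one model produces the reasoning, and the other turns it into the final reply. This is only an assumption about how such a combination could work, not DeepClaude's actual implementation. The names `two_stage_answer`, `fake_reasoner`, and `fake_responder` are hypothetical, and the two stand-in functions replace real API calls so the sketch runs offline.

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns text.
ModelFn = Callable[[str], str]


def two_stage_answer(question: str, reasoner: ModelFn, responder: ModelFn) -> str:
    """Get a chain of thought from one model, then let another write the answer."""
    # Stage 1: the reasoning model thinks about the question.
    reasoning = reasoner(f"Think step by step about: {question}")
    # Stage 2: the second model sees the question plus that reasoning.
    final_prompt = (
        f"Question: {question}\n"
        f"Reasoning from another model:\n{reasoning}\n"
        "Write the final answer for the user."
    )
    return responder(final_prompt)


# Stand-in models (assumptions): real code would call the two providers here.
def fake_reasoner(prompt: str) -> str:
    return "2 + 2 is 4."


def fake_responder(prompt: str) -> str:
    # Echo the reasoning line to show that it reached the second model.
    return "ANSWER based on -> " + prompt.splitlines()[2]


print(two_stage_answer("What is 2 + 2?", fake_reasoner, fake_responder))
```

Passing the models in as plain functions keeps the sketch independent of any particular provider, so the same flow would work with R1, o3-mini, or anything else.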
alan
I saw this on Aider's website, with a benchmark showing that the combination beats both Claude and DeepSeek R1 on their own, making it a dream team for coding.
endu
Thanks a ton for sharing this. We’ve also been thinking about using multiple models for different purposes, either simultaneously or in steps. This approach looks super interesting, and we’ll definitely check it out.
Endre
endu Though I'm not sure it should literally be done with DeepSeek. Maybe o3-mini instead?
Siddhartha
Endre: Our DeepSeek is served via Fireworks AI, so it's hosted in the US.
Endre
Siddhartha Ohh, that's cool!