Voice function like ChatGPT
in progress
endu
Merged in a post:
Real-time conversation
CS
The ability to start a conversation through the Merlin app or the desktop site, as ChatGPT does.
Vijay Bharadwaj
Merged in a post:
Ability for voice input on Web
Philippos Christoforou
So we can talk instead of typing.
Mowd Chen
It seems real-time voice conversation has been removed in the latest iOS update, version 5.4.3.
Marlo
Voice input and output are a must these days on both mobile and desktop. Since ChatGPT and Gemini have it, I tend to use them instead of Merlin, and I am thinking about cancelling my Merlin subscription.
It is much better and quicker to interact with AI through voice.
endu
Merged in a post:
Voice and music generation - e.g., ElevenLabs (voice)
Guy
Any chance of getting voice generation and music generation features, like image generation?
The best one I know of for voice generation is https://elevenlabs.io/
Merlin
in progress
cc: Ethan Cohen
Work is underway currently for the Merlin mobile app only.
The Realtime API is expensive, which makes supporting prolonged usage difficult. We're looking to keep it mobile-only FOR NOW, since the voice-talking experience is by design much better suited to hand-held devices (the way we envision it), and anything at the scale of our desktop apps would be unsustainable.
We're also figuring out a way to let users chat by voice with all the text LLMs we offer on Merlin Chat. The quality of the experience is the bottleneck so far. Thanks!
YM
Merlin It makes me wonder if a lightweight open-source local LLM is the answer for such a use case.
ELVETH
Merlin the ability to ask Merlin things (by voice) while walking/driving is a must today. What's the release date/roadmap expectation?
Ethan Cohen
Hey team - do we have a release timeline for this one on desktop? Feels like we're getting a lot of likes but haven't seen any movement. endu
Guy
endu Another thing for me: I would really like to be able to talk out e-mails and have it take that audio and turn it into a clearer e-mail.
Or the same thing for notes/ideas: I speak the note/idea, and it then makes it clearer and more organised.
Similar to what Letterly does - https://letterly.app/
Kira Kenjiro
I think Endu is right. ElevenLabs is quite an expensive platform, so it would 100% have to be behind premium or limited to a number of tokens per month.
It's a cool thing to have, 100%, and I'd love to have ElevenLabs integration in Merlin, but as things currently stand I don't think it's the right time for something like this. The best way to do it would be to use an alternative TTS API, like the TTS-1 model by OpenAI, and to use Suno for music generation. Other than that, it's just outside Merlin's ability at the moment.
endu
Kira Kenjiro: hi & thanks for echoing(seconding.. :P) my views :)
endu
Guy no, we are not going in that direction as of now. Also, the ElevenLabs APIs are quite expensive, so it can be a Pro-only feature!
Guy
endu So can it be left open for feedback and made a Pro feature?
endu
Guy OK, cool, that can be done. Keeping the ticket open then!
If we get more Pro buy-ins, maybe we can build it as well :P
Guy
endu Great!