This feature could be groundbreaking (and could have been already; I reported this issue to you before and you simply did nothing, so I assume you did not understand how serious the problem is), but a single logical flaw renders it completely unusable, and the flaw is hard to detect.
On the left side of the image, you can see our custom prompt being entered. The goal is to have the LLM analyze the video's content (i.e., the transcript the plugin extracts for processing) and process it according to my prompt.
Instead (see the right side of the image), THE LLM RECEIVES A SHORT, TIME-STAMPED SUMMARY OF THE VIDEO, an EXTREMELY SIMPLIFIED INPUT that has lost a significant amount of the original content to compression.
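To make the flawed data flow concrete, here is a minimal sketch in Python. All names (`extract_transcript`, `summarize`, `run_llm`) are hypothetical stand-ins, not the plugin's real API; the point is only the difference between what the LLM should receive and what it actually receives.

```python
def extract_transcript(video_id: str) -> str:
    """Stand-in for the plugin's transcript extraction step."""
    return "full transcript with every spoken sentence " * 50

def summarize(transcript: str) -> str:
    """Stand-in for the lossy time-stamped summary step."""
    return transcript[:80] + "..."  # most of the content is discarded here

def run_llm(prompt: str, context: str) -> str:
    """Stand-in for the LLM call; shows what the model is given to work with."""
    return f"{prompt}\n---\n{context}"

prompt = "Analyze the video's content according to my instructions."
transcript = extract_transcript("example-video")

# What the plugin currently does: the LLM only ever sees the summary.
actual_input = run_llm(prompt, summarize(transcript))

# What the feature should do: the LLM works from the full transcript.
expected_input = run_llm(prompt, transcript)
```

In this sketch the summary is a fraction of the transcript's length, so any prompt applied to `actual_input` is answered from a drastically reduced view of the video.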
To use a university analogy, IT'S LIKE WRITING A THESIS ON A TOPIC BY ASKING A CLASSMATE WHAT THEY KNOW ABOUT IT AND BASING THE THESIS ON THEIR ANSWER, instead of working from the original source in its full scope.
To use an image processing analogy, it's like feeding a heavily compressed JPEG image to an image recognition function instead of providing the original, uncompressed image.