I've always wondered why the summaries (based on Captions) are such low quality. They read like old Google Translate output, full of wrong word choices and grammatical errors: cases don't agree, and articles are wrong (in languages that have more than one article).
May we know which model is used for this? Could it be improved? Maybe we could get a model selector for this feature? :)