Discussion about this post

Xinran Ma

Didn't realize that there's a mindmap feature!

Daniel Kearby

Thanks Wyndo, this is eye-opening... I have a question though: when you talk about multimodal inputs, for example a YouTube input, it's only extracting the transcript as an input, right, not using computer vision on the video? How far away are we from the model actually being able to use its computer vision to interpret video as an input, rather than just extracting the verbal aspect of a video? Are there already tools for that, or are we still a long way off?

