Machinelearning (@ai_machinelearning_big_data)

📹 Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

The Video-LLaMA project works on giving large language models video and audio understanding capabilities. It is a multimodal system that extends large language models (LLMs) so they can understand both the visual and the audio content of a video.

🖥 Github: https://github.com/damo-nlp-sg/video-llama
📕 Paper: https://arxiv.org/abs/2306.02858
⏩ Demo: huggingface.co/spaces/…eo-LLaMA
📌 Model: modelscope.cn/studios…/summary

ai_machinelearning_big_data