mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM

MM '23: Proceedings of the 31st ACM International Conference on Multimedia (2023)

Inspired by recent developments in large language models (LLMs), we propose mPLUG-Octopus, a versatile conversational assistant designed to provide users with coherent, engaging, and helpful interaction experiences in both text-only and multimodal scenarios. Unlike traditional pipeline chat systems, mPLUG-Octopus offers a diverse range of creative capabilities, including open-domain QA, multi-turn chatting, and multimodal creation, all built on a unified multimodal LLM without relying on any external API. With its modularized end-to-end multimodal LLM technology, mPLUG-Octopus efficiently supports an engaging, open-domain conversation experience. It exhibits a wide range of elemental unimodal and multimodal capabilities, enabling it to communicate seamlessly with users on open-domain topics and to engage in multi-turn conversations. It also assists users in accomplishing various content creation and application tasks. Our conversational assistant can also be deployed on smart hardware to drive advanced AIGC applications.