Visual ChatGPT connects ChatGPT to several Visual Foundation Models so that users can exchange images as well as text during a conversation. ChatGPT acts as the general interface and contributes broad understanding across many topics, while the Visual Foundation Models contribute deep expertise in their specific domains. By combining these two forms of intelligence, we aim to build an AI that can handle a wide range of tasks across different fields, improving the user experience and extending what conversational AI can do.
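
The division of labor described above, with a general interface in front and specialist models behind it, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Tool` class, the two placeholder tools, and the keyword-based `route` function are hypothetical names, and the keyword matching only stands in for the language model's own decision about which specialist to call. It is not how Visual ChatGPT itself is implemented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    """Stand-in for one Visual Foundation Model (hypothetical)."""
    name: str
    keywords: List[str]          # crude trigger words, an assumption of this sketch
    run: Callable[[str], str]    # takes an image path, returns a result string


def caption_image(image_path: str) -> str:
    # Placeholder: a real system would call an image-captioning model here.
    return f"[caption for {image_path}]"


def detect_edges(image_path: str) -> str:
    # Placeholder: a real system would call an edge-detection model here.
    return f"[edge map of {image_path}]"


TOOLS = [
    Tool("captioner", ["describe", "caption"], caption_image),
    Tool("edge_detector", ["edge", "outline"], detect_edges),
]


def route(message: str, image_path: str) -> str:
    """Send the request to the first tool whose keyword appears in the message.

    Keyword matching is only a stand-in for the general interface deciding
    which specialist model should handle the request.
    """
    text = message.lower()
    for tool in TOOLS:
        if any(word in text for word in tool.keywords):
            return f"{tool.name}: {tool.run(image_path)}"
    # No specialist matched, so treat it as an ordinary text-only turn.
    return "chat: (text-only reply)"


if __name__ == "__main__":
    print(route("Please describe this picture", "cat.png"))
    print(route("Show me the edges", "cat.png"))
    print(route("Hello there", "cat.png"))
```

The point of the sketch is the shape of the design: the user talks to one interface, and the specialist models stay behind it, so adding a new capability means registering another tool rather than changing the conversation loop.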