{"id":51899,"date":"2024-12-12T10:00:18","date_gmt":"2024-12-12T18:00:18","guid":{"rendered":"https:\/\/www.edge-ai-vision.com\/?page_id=51899"},"modified":"2024-12-12T12:43:33","modified_gmt":"2024-12-12T20:43:33","slug":"multimodal-large-language-models","status":"publish","type":"page","link":"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/","title":{"rendered":"Multimodal Large Language Models"},"content":{"rendered":"<p>LLMs and MLLMs The past decade-plus has seen incredible progress in practical computer vision. Thanks to deep learning, computer vision is dramatically more robust and accessible, and has enabled compelling capabilities in thousands of applications, from automotive safety to healthcare. But today\u2019s widely used deep learning techniques suffer from serious&hellip;<\/p>\n<p><span class=\"wppb-frontend-restriction-message wppb-content-restriction-message\">&#8220;A View From the 2025 Embedded Vision Summit (Part 2),&#8221; a Presentation from the Edge AI and Vision Alliance<\/p>\n<p><strong><a href=\"\/register\/\">Register<\/a> or sign in to access this content.<\/strong><\/p>\n<p>Registration is free and takes less than one minute. 
<a href=\"\/register\/\">Click here<\/a> to register and get full access to the Edge AI and Vision Alliance's valuable content.<\/p>\n<\/span>\n","protected":false},"excerpt":{"rendered":"<p>LLMs and MLLMs The past decade-plus has seen incredible progress in practical computer vision. Thanks to deep learning, computer vision is dramatically more robust and accessible, and has enabled compelling capabilities in thousands of applications, from automotive safety to healthcare. 
But today\u2019s widely used deep learning techniques suffer from serious&hellip; &#8220;A View From the 2025 [&hellip;]<\/p>\n","protected":false},"author":9,"featured_media":0,"parent":23948,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"content-type":"","_uag_custom_page_level_css":"","site-sidebar-layout":"right-sidebar","site-content-layout":"plain-container","ast-site-content-layout":"normal-width-container","site-content-style":"unboxed","site-sidebar-style":"unboxed","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"disabled","ast-breadcrumbs-content":"","ast-featured-img":"disabled","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"default","adv-header-id-meta":"","stick-header-meta":"default","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"summityear":[],"class_list":["post-51899","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Multimodal Large Language Models - Edge AI and Vision Alliance<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Multimodal Large Language Models - Edge AI and Vision Alliance\" \/>\n<meta 
property=\"og:description\" content=\"LLMs and MLLMs The past decade-plus has seen incredible progress in practical computer vision. Thanks to deep learning, computer vision is dramatically more robust and accessible, and has enabled compelling capabilities in thousands of applications, from automotive safety to healthcare. But today\u2019s widely used deep learning techniques suffer from serious&hellip; &#8220;A View From the 2025 [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/\" \/>\n<meta property=\"og:site_name\" content=\"Edge AI and Vision Alliance\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/EdgeAIVision\/\" \/>\n<meta property=\"article:modified_time\" content=\"2024-12-12T20:43:33+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@edgeaivision\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"14 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/\",\"url\":\"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/\",\"name\":\"Multimodal Large Language Models - Edge AI and Vision Alliance\",\"isPartOf\":{\"@id\":\"https:\/\/www.edge-ai-vision.com\/#website\"},\"datePublished\":\"2024-12-12T18:00:18+00:00\",\"dateModified\":\"2024-12-12T20:43:33+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.edge-ai-vision.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Resources\",\"item\":\"https:\/\/www.edge-ai-vision.com\/resources\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Multimodal Large Language Models\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.edge-ai-vision.com\/#website\",\"url\":\"https:\/\/www.edge-ai-vision.com\/\",\"name\":\"Edge AI and Vision Alliance\",\"description\":\"Designing machines that perceive and 
understand.\",\"publisher\":{\"@id\":\"https:\/\/www.edge-ai-vision.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.edge-ai-vision.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.edge-ai-vision.com\/#organization\",\"name\":\"Edge AI and Vision Alliance\",\"url\":\"https:\/\/www.edge-ai-vision.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.edge-ai-vision.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.edge-ai-vision.com\/wp-content\/uploads\/2020\/01\/1200x675header_edgeai_vision.jpg\",\"contentUrl\":\"https:\/\/www.edge-ai-vision.com\/wp-content\/uploads\/2020\/01\/1200x675header_edgeai_vision.jpg\",\"width\":1200,\"height\":675,\"caption\":\"Edge AI and Vision Alliance\"},\"image\":{\"@id\":\"https:\/\/www.edge-ai-vision.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/EdgeAIVision\/\",\"https:\/\/x.com\/edgeaivision\",\"https:\/\/www.linkedin.com\/company\/edgeaivision\/\",\"http:\/\/www.youtube.com\/embeddedvision\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Multimodal Large Language Models - Edge AI and Vision Alliance","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/","og_locale":"en_US","og_type":"article","og_title":"Multimodal Large Language Models - Edge AI and Vision Alliance","og_description":"LLMs and MLLMs The past decade-plus has seen incredible progress in practical computer vision. 
Thanks to deep learning, computer vision is dramatically more robust and accessible, and has enabled compelling capabilities in thousands of applications, from automotive safety to healthcare. But today\u2019s widely used deep learning techniques suffer from serious&hellip; &#8220;A View From the 2025 [&hellip;]","og_url":"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/","og_site_name":"Edge AI and Vision Alliance","article_publisher":"https:\/\/www.facebook.com\/EdgeAIVision\/","article_modified_time":"2024-12-12T20:43:33+00:00","twitter_card":"summary_large_image","twitter_site":"@edgeaivision","twitter_misc":{"Est. reading time":"14 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/","url":"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/","name":"Multimodal Large Language Models - Edge AI and Vision Alliance","isPartOf":{"@id":"https:\/\/www.edge-ai-vision.com\/#website"},"datePublished":"2024-12-12T18:00:18+00:00","dateModified":"2024-12-12T20:43:33+00:00","breadcrumb":{"@id":"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.edge-ai-vision.com\/resources\/multimodal-large-language-models\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.edge-ai-vision.com\/"},{"@type":"ListItem","position":2,"name":"Resources","item":"https:\/\/www.edge-ai-vision.com\/resources\/"},{"@type":"ListItem","position":3,"name":"Multimodal Large Language Models"}]},{"@type":"WebSite","@id":"https:\/\/www.edge-ai-vision.com\/#website","url":"https:\/\/www.edge-ai-vision.com\/","name":"Edge AI and Vision 
Alliance","description":"Designing machines that perceive and understand.","publisher":{"@id":"https:\/\/www.edge-ai-vision.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.edge-ai-vision.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.edge-ai-vision.com\/#organization","name":"Edge AI and Vision Alliance","url":"https:\/\/www.edge-ai-vision.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.edge-ai-vision.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.edge-ai-vision.com\/wp-content\/uploads\/2020\/01\/1200x675header_edgeai_vision.jpg","contentUrl":"https:\/\/www.edge-ai-vision.com\/wp-content\/uploads\/2020\/01\/1200x675header_edgeai_vision.jpg","width":1200,"height":675,"caption":"Edge AI and Vision Alliance"},"image":{"@id":"https:\/\/www.edge-ai-vision.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/EdgeAIVision\/","https:\/\/x.com\/edgeaivision","https:\/\/www.linkedin.com\/company\/edgeaivision\/","http:\/\/www.youtube.com\/embeddedvision"]}]}},"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false},"uagb_author_info":{"display_name":"Brian Dipert","author_link":"https:\/\/www.edge-ai-vision.com\/author\/brian-dipert\/"},"uagb_comment_info":0,"uagb_excerpt":"LLMs and MLLMs The past decade-plus has seen incredible progress in practical computer vision. Thanks to deep learning, computer vision is dramatically more robust and accessible, and has enabled compelling capabilities in thousands of applications, from automotive safety to healthcare. 
But today\u2019s widely used deep learning techniques suffer from serious&hellip; &#8220;A View From the 2025&hellip;","_links":{"self":[{"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/pages\/51899","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/users\/9"}],"replies":[{"embeddable":true,"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/comments?post=51899"}],"version-history":[{"count":22,"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/pages\/51899\/revisions"}],"predecessor-version":[{"id":51927,"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/pages\/51899\/revisions\/51927"}],"up":[{"embeddable":true,"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/pages\/23948"}],"wp:attachment":[{"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/media?parent=51899"}],"wp:term":[{"taxonomy":"summityear","embeddable":true,"href":"https:\/\/www.edge-ai-vision.com\/wp-json\/wp\/v2\/summityear?post=51899"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}