
Vishal Misra: Transformers learn correlations, not causations, the significance of in-context learning, and the role of Bayesian updating in AI

2026/04/11 10:00
11 min read


Understanding transformers’ limitations reveals the crucial shift needed from correlation to causation for true AI advancement.

Key Takeaways

  • Transformers primarily learn correlations, not causations, which limits their path to true intelligence.
  • Achieving AGI requires models that can move from learning correlations to understanding causation, a significant hurdle in AI development.
  • Large language models generate text by predicting the next token from a probability distribution.
  • The context supplied in a prompt strongly shapes a model’s output, which makes prompt selection important.
  • Language models operate over a sparse matrix of token combinations: most combinations are nonsensical, and that sparsity improves efficiency.
  • In-context learning lets LLMs solve problems in real time from examples, and closely resembles Bayesian updating of probabilities as new evidence arrives.
  • Domain-specific languages (DSLs) can turn complex database queries into natural-language interactions.
  • The debate between Bayesian and frequentist approaches shapes how new machine learning models are received.
  • The Bayesian wind tunnel concept offers a controlled environment for testing machine learning architectures.
  • Understanding the mechanics of LLMs is crucial for applying them effectively.

Guest intro

Vishal Misra is Professor of Computer Science and Electrical Engineering and Vice Dean of Computing and AI at Columbia University’s School of Engineering. He returns to the a16z Podcast to discuss his latest research revealing how transformers in LLMs update predictions in a precise, mathematically predictable manner as they process new information. His work highlights the gap to AGI, emphasizing the need for continuous post-training learning and causal understanding over pattern matching.

Understanding transformers and LLMs

  • “LLMs primarily learn correlations rather than causations, which limits their intelligence.” — Vishal Misra
  • “Achieving AGI requires models that can learn causations, not just correlations.” — Vishal Misra
  • “LLMs generate text by constructing a probability distribution for the next token.” — Vishal Misra
  • Understanding the mechanics of LLMs is crucial for leveraging their applications effectively.
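Next-token generation can be made concrete with a toy example. The sketch below (all tokens and probabilities hypothetical, not from the episode) shows the core loop Misra describes: given the current token, the model holds a probability distribution over candidate next tokens and samples from it, with a temperature parameter reshaping the distribution.

```python
import random

# Toy next-token model: for each context token, a probability
# distribution over possible next tokens (all values hypothetical).
next_token_probs = {
    "the": {"cat": 0.5, "dog": 0.3, "idea": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
}

def predict_next(token, temperature=1.0):
    """Sample the next token from the model's distribution."""
    dist = next_token_probs[token]
    # Temperature reshapes the distribution: low T -> greedier sampling.
    weights = {t: p ** (1.0 / temperature) for t, p in dist.items()}
    total = sum(weights.values())
    r = random.random() * total
    for t, w in weights.items():
        r -= w
        if r <= 0:
            return t
    return t  # fallback for floating-point edge cases

random.seed(0)
print(predict_next("the"))
```

A real LLM computes the distribution with a transformer over the whole context rather than a lookup table, but the sampling step is essentially this.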

The role of context in language models

  • “The behavior of language models is influenced by the prior context provided in prompts.” — Vishal Misra
  • Contextual relevance in LLMs highlights the importance of prompt selection.
  • “Language models operate on a sparse matrix where many combinations of tokens are nonsensical.” — Vishal Misra
  • Sparsity improves efficiency by filtering out irrelevant token combinations.
  • The supplied context can drastically change a model’s output, so understanding how models generate text from input prompts is essential.
In-context learning and real-time problem solving

  • “In-context learning allows LLMs to learn and solve problems in real time.” — Vishal Misra
  • “In-context learning resembles Bayesian updating, adjusting probabilities as new evidence arrives.” — Vishal Misra
  • LLMs process and learn from new information through examples; this adaptability is what enables real-time problem solving.
  • This mechanism is crucial for understanding the capabilities of LLMs.
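One way to make the Bayesian view of in-context learning concrete is a hypothesis-reweighting sketch. This is an illustrative toy, not Misra's actual formalism: each in-context example (x, y) acts as evidence, and the prior over candidate "tasks" the prompt might be demonstrating is updated toward tasks that explain the examples.

```python
# Hedged sketch: in-context examples as evidence that reweights a prior
# over candidate tasks (a Bayesian reading of in-context learning).
# The task set, prior, and noise floor are all hypothetical.
tasks = {
    "negate":   lambda x: -x,
    "double":   lambda x: 2 * x,
    "identity": lambda x: x,
}
posterior = {name: 1 / len(tasks) for name in tasks}  # uniform prior

def update(x, y, noise=1e-3):
    """Reweight tasks by how well each explains the example (x, y)."""
    for name, f in tasks.items():
        likelihood = 1.0 if f(x) == y else noise
        posterior[name] *= likelihood
    z = sum(posterior.values())
    for name in posterior:
        posterior[name] /= z

update(3, 6)    # consistent only with "double"
update(5, 10)   # more evidence sharpens the posterior
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

After two examples the posterior concentrates almost entirely on the task the examples demonstrate, which mirrors how a few-shot prompt steers an LLM toward one behavior in real time.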

Domain-specific languages and data accessibility

  • “Domain-specific languages (DSLs) convert natural language queries into a processable format.” — Vishal Misra
  • DSLs simplify interactions with complex databases, letting users query them in natural language.
  • Building a DSL requires understanding the challenges of querying complex databases, and illustrates how AI can be applied to a specific domain.
  • The development of DSLs highlights the role of AI in data accessibility, providing a technical solution to a common problem.
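The DSL idea can be sketched in a few lines. The grammar, field names, and records below are hypothetical and far simpler than any production system: a constrained English query is parsed into a structured plan, which is then executed against the data, so the user never writes the underlying query language.

```python
import re

# Minimal illustrative DSL front end (hypothetical grammar): maps a
# constrained English query onto a structured filter over records.
records = [
    {"name": "Alice", "dept": "eng", "salary": 120},
    {"name": "Bob",   "dept": "ops", "salary": 90},
]

def parse(query):
    """Parse 'show <field> where <field> is <value>' into a query plan."""
    m = re.fullmatch(r"show (\w+) where (\w+) is (\w+)", query)
    if not m:
        raise ValueError("query not in the DSL")
    return {"select": m.group(1), "field": m.group(2), "value": m.group(3)}

def run(plan):
    """Execute the plan against the records."""
    return [r[plan["select"]] for r in records
            if str(r[plan["field"]]) == plan["value"]]

print(run(parse("show name where dept is eng")))  # ['Alice']
```

In the LLM setting, the model performs the natural-language-to-plan step, and the constrained DSL keeps its output checkable before anything touches the database.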

Bayesian updating and statistical approaches in AI

  • “In-context learning in language models resembles Bayesian updating.” — Vishal Misra
  • “The distinction between Bayesian and frequentist approaches affects how AI models are perceived.” — Vishal Misra
  • Understanding Bayesian inference is crucial for grasping how LLMs process information.
  • The debate between the two statistical camps shapes the reception of new research.
  • Bayesian updating provides a clear mechanism for in-context learning, linking a well-established methodology with modern AI.
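For readers less familiar with the statistical machinery, here is Bayesian updating itself in a minimal numeric instance (the coin hypotheses and priors are illustrative, not from the episode): each observation multiplies the prior by a likelihood and renormalizes, exactly the "adjust probabilities with new evidence" step referenced above.

```python
# A minimal numeric instance of Bayesian updating (numbers illustrative):
# posterior over two coin hypotheses after observing repeated heads.
prior = {"fair": 0.5, "biased": 0.5}            # equal prior belief
likelihood_heads = {"fair": 0.5, "biased": 0.8}  # P(heads | hypothesis)

def update_on_heads(belief):
    """One step of Bayes' rule: multiply by likelihood, renormalize."""
    unnorm = {h: belief[h] * likelihood_heads[h] for h in belief}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

belief = prior
for _ in range(3):                               # observe three heads
    belief = update_on_heads(belief)
print({h: round(p, 3) for h, p in belief.items()})
```

Three heads in a row shift the posterior substantially toward the biased coin, while a frequentist analysis would instead reason about long-run frequencies without a prior, which is the divide the section above refers to.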

The Bayesian wind tunnel and model testing

  • “The Bayesian wind tunnel concept allows for testing machine learning architectures.” — Vishal Misra
  • Like a wind tunnel in aerospace, it provides a controlled environment for evaluating models, which improves the reliability of assessments.
  • The framework can test architectures such as transformers, Mamba, LSTMs, and MLPs, offering a novel way to evaluate and improve machine learning models.
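The wind-tunnel idea can be sketched as follows (the setup is a simplified illustration, not Misra's actual benchmark): draw sequences from a generative process with a known prior, so the exact Bayes-optimal prediction is computable in closed form, then score any architecture by its gap from that optimum.

```python
import random

# Sketch of the "Bayesian wind tunnel" idea (setup hypothetical): data
# comes from a known process, so the Bayes-optimal next-token probability
# is computable exactly, and any model can be scored against it.
random.seed(1)

def sample_sequence(n=10):
    """Draw a binary sequence with a hidden bias from a uniform prior."""
    theta = random.random()
    return [1 if random.random() < theta else 0 for _ in range(n)]

def bayes_optimal_next(seq):
    """Exact posterior predictive P(next=1 | seq) under a uniform prior
    (Laplace's rule of succession)."""
    return (sum(seq) + 1) / (len(seq) + 2)

def score(model, trials=200):
    """Mean absolute gap between a model's prediction and the optimum."""
    gap = 0.0
    for _ in range(trials):
        seq = sample_sequence()
        gap += abs(model(seq) - bayes_optimal_next(seq))
    return gap / trials

frequency_model = lambda seq: sum(seq) / len(seq)   # a naive baseline
print(round(score(frequency_model), 3))
```

Because the ground-truth posterior is known, the same score can be computed for a trained transformer, Mamba, LSTM, or MLP, which is what makes the controlled-environment analogy to an aerospace wind tunnel apt.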
Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.



© Decentral Media and Crypto Briefing® 2026.

Source: https://cryptobriefing.com/vishal-misra-transformers-learn-correlations-not-causations-the-significance-of-in-context-learning-and-the-role-of-bayesian-updating-in-ai-ai-a16z/
