Popper - Conjectures and Refutations

1. Brief critique and context
The conjecture posits that large language models (LLMs) lack the capability to generate complex proofs in Zermelo–Fraenkel set theory without explicit formal logic training. This challenges the inherent reasoning abilities of LLMs, which are generally trained on massive datasets without specific logical structuring. The complexity of formal proofs, especially those exceeding fifty inference steps, requires a deep understanding of logical consistency and structure, which may not be fully captured by LLMs trained primarily on natural language data.

2. Recent research
Recent studies have shown advancements in LLMs' capabilities in mathematical reasoning, but significant limitations remain. For instance, the paper "Mathematics with Transformers" (https://arxiv.org/abs/2012.14465) explores LLMs' abilities in solving mathematical problems but acknowledges challenges in complex proof generation. Additionally, "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge from Text" (https://arxiv.org/abs/2101.06286) highlights that while LLMs can perform tasks resembling logical reasoning, they struggle with extended logical consistency and novel proof generation without targeted training.

3. Bayesian likelihood of falsification (with reasoning)
20% likelihood of falsification within 5 years.
While LLMs have shown improvements in handling structured problems, the specific challenge of generating novel, valid proofs in Zermelo–Fraenkel set theory involves sophisticated logical reasoning typically beyond their current capabilities. Unless there are significant advancements in integrating formal logic training into LLM architectures or breakthroughs in reasoning algorithms, the conjecture is unlikely to be falsified within the specified timeframe.

LLM Proofs

Description

Falsification Criteria

AI Feedback

Bounty

Refutations

Discussion