LLM Proofs
Description
Large language models without explicit training on formal logic cannot construct a valid novel proof in Zermelo–Fraenkel set theory containing more than fifty inference steps.
Falsification Criteria
Provide a reproducible example of such a model generating and verifying a new valid proof meeting the criterion.
AI Feedback
1. Brief critique and context
The conjecture posits that large language models (LLMs) lack the capability to generate complex proofs in Zermelo–Fraenkel set theory without explicit formal logic training. This challenges the inherent reasoning abilities of LLMs, which are generally trained on massive datasets without specific logical structuring. The complexity of formal proofs, especially those exceeding fifty inference steps, requires a deep understanding of logical consistency and structure, which may not be fully captured by LLMs trained primarily on natural language data.
2. Recent research
Recent studies have shown advancements in LLMs' capabilities in mathematical reasoning, but significant limitations remain. For instance, the paper "Mathematics with Transformers" (https://arxiv.org/abs/2012.14465) explores LLMs' abilities in solving mathematical problems but acknowledges challenges in complex proof generation. Additionally, "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge from Text" (https://arxiv.org/abs/2101.06286) highlights that while LLMs can perform tasks resembling logical reasoning, they struggle with extended logical consistency and novel proof generation without targeted training.
3. Bayesian likelihood of falsification (with reasoning)
20% likelihood of falsification within 5 years.
While LLMs have shown improvements in handling structured problems, the specific challenge of generating novel, valid proofs in Zermelo–Fraenkel set theory involves sophisticated logical reasoning typically beyond their current capabilities. Unless there are significant advancements in integrating formal logic training into LLM architectures or breakthroughs in reasoning algorithms, the conjecture is unlikely to be falsified within the specified timeframe.
Bounty
Contribute to the bounty for anyone who can successfully refute this conjecture
You must be signed in to contribute to the bounty.
Sign inRefutations
Rational criticism and counterarguments to this conjecture
No refutations have been submitted yet.
Be the first to provide rational criticism for this conjecture.
You must be signed in to submit a refutation.
Sign in
Sign in to join the discussion.