Why Python Devs Should Learn Lean 4 for Code Correctness

Python’s dynamic nature allows developers to write concise, expressive code quickly. Whether automating workflows, crunching datasets, or building APIs, the language’s flexibility often accelerates progress. Yet, many Python programmers eventually face a persistent question: How can I be sure my code actually works as intended?

You can write unit tests, add type hints, or run static analyzers. But tests only check specific scenarios, and type hints in Python are suggestions, not guarantees. That’s where Lean 4 enters the picture: a language in which you can not only run code but also state properties about it and have the compiler check your proofs of them.

Lean 4 is more than a programming language. It combines four powerful roles into one system:

A theorem prover to verify mathematical claims
A proof assistant to guide logical reasoning
A functional programming language for building reliable software
A framework for mathematically verifying systems

For Python developers, Lean’s strict approach may feel unfamiliar at first. But once the mental model clicks, it reshapes how you think about programming entirely. This guide introduces Lean 4 specifically for those coming from Python—no prior theorem-proving experience required.

Lean Prioritizes Precision Over Flexibility

Python thrives on its permissive design. Consider a simple function:

def add(a, b):
    return a + b

This works in Python, but critical questions remain unanswered:

Are a and b integers, strings, or something else?
What happens if one is a list and the other is a float?
When does the operation fail?

Lean eliminates ambiguity by forcing developers to declare types and constraints upfront. For example, the equivalent function in Lean looks like this:

def add (a : Nat) (b : Nat) : Nat :=
  a + b

This explicitly states:

a and b must be natural numbers (Nat)
The return type is also a natural number
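
To see what the declared types buy you, here is a minimal sketch that evaluates the function and shows a call the compiler is expected to reject. The namespace name AddSketch is only an illustrative wrapper to avoid name clashes, not part of the original example.

```lean
namespace AddSketch

def add (a : Nat) (b : Nat) : Nat :=
  a + b

#eval add 2 3   -- 5

-- The call below is expected to be rejected at compile time, because a
-- String literal is not a Nat. It is commented out so the file still compiles.
-- #eval add "2" 3

end AddSketch
```

The mismatch is reported before anything runs, which is the practical difference from the Python version.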

At first glance, this may seem verbose compared to Python. But Lean’s philosophy is clear: ambiguity is often the root of hidden complexity. By requiring explicit declarations, Lean catches whole classes of bugs at compile time, before the code ever runs. It shifts part of the burden of correctness from testing to design.

Writing Your First Lean Functions

Let’s compare Python and Lean by converting familiar functions. Both languages can compute squares, but their approaches highlight key differences.

In Python:

def square(x):
    return x * x

The Lean equivalent is strikingly similar:

def square (x : Int) : Int :=
  x * x

Now let’s tackle something more complex: calculating factorials. Python’s version uses recursion with a conditional:

def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n - 1)

Lean’s factorial looks different but accomplishes the same task:

def factorial : Nat → Nat
  | 0 => 1
  | n + 1 => (n + 1) * factorial n

The syntax may feel foreign at first. Here’s what’s happening:

The function uses pattern matching to handle two cases: when the input is 0 or a successor of another natural number (n + 1)
For 0, it returns 1 (base case)
For n + 1, it multiplies (n + 1) by the factorial of n (recursive case)

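This style is common in functional programming. More importantly, Lean checks that a recursive definition terminates before accepting it: here each call recurses on a strictly smaller number, and because the input is a Nat, negative arguments cannot occur. The Python version, by contrast, keeps recursing on factorial(-1) until it hits a RecursionError at runtime.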
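
The two definitions from this section can be tried out directly. The sketch below evaluates them and adds one small machine-checked fact; the FirstFunctions namespace is just an illustrative wrapper.

```lean
namespace FirstFunctions

def square (x : Int) : Int :=
  x * x

def factorial : Nat → Nat
  | 0 => 1
  | n + 1 => (n + 1) * factorial n

#eval square (-3)    -- 9
#eval factorial 5    -- 120

-- A tiny checked fact: both sides reduce to the same number,
-- so reflexivity is enough to prove it.
example : factorial 5 = 120 := rfl

end FirstFunctions
```

Unlike a unit test, the final line is checked every time the file compiles, and the file does not build if the claim is false.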

Lean’s Strictness Prevents Common Pitfalls

Python’s leniency can lead to runtime errors that feel unavoidable. Consider a simple division function:

def divide(a, b):
    return a / b

This function works fine—until someone calls divide(10, 0). Python only raises an exception when the code executes, leaving room for bugs to slip through testing. Lean addresses this by categorizing numeric types deliberately:

Nat for non-negative integers
Int for all integers
Rat for rational numbers
Real for real numbers (provided by the Mathlib library rather than core Lean)

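Each type has distinct rules. Lean’s built-in division is a total function: on Nat, dividing by zero is defined to return 0 instead of raising an error. When a zero divisor must be ruled out, you can write a function that takes a proof that the denominator isn’t zero as an extra argument, so a call cannot compile without one. This isn’t just syntax; it’s a way to encode logical guarantees directly into your code. Lean assumes that if something matters logically, it should be represented explicitly.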
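
As a rough sketch of how a nonzero-divisor requirement could be encoded, the block below first shows the behavior of the built-in operator and then a function that demands a proof. The name safeDiv and the DivSketch namespace are hypothetical and chosen only for illustration.

```lean
namespace DivSketch

-- Built-in division on Nat is total: dividing by zero yields 0
-- rather than raising an error.
#eval (10 : Nat) / 0   -- 0

-- `safeDiv` is a hypothetical helper. Its third argument is a proof that
-- the divisor is nonzero, so a call cannot compile without one.
def safeDiv (a b : Nat) (_h : b ≠ 0) : Nat :=
  a / b

#eval safeDiv 10 2 (by decide)   -- 5

-- No proof of `0 ≠ 0` exists, so this call is expected to be rejected:
-- #eval safeDiv 10 0 (by decide)

end DivSketch
```

The design choice is to move the check from the function body to the call site: whoever calls safeDiv has to justify the divisor.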

Immutability in Lean Simplifies Reasoning

Python developers often rely on mutable state to track changes over time. For example:

count = 0

def increment():
    global count
    count += 1

This pattern works in small scripts, but proving correctness in larger systems becomes much harder. Lean discourages mutable state by design, favoring immutable transformations instead:

def increment (n : Nat) : Nat :=
  n + 1

Instead of modifying an existing variable, this function takes an input and returns a new value. Why does this matter? Because mutable state introduces complexity that’s hard to model mathematically. Imagine trying to prove a large system correct where:

Variables change unpredictably
Functions have hidden side effects
Execution order affects outcomes

Functional programming, as seen in Lean, reduces this chaos by treating state as a series of transformations. Each function call becomes a predictable step in a logical proof, making verification feasible.
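
One way to picture state as a series of transformations is to thread a counter through a fold instead of storing it in a global variable. In this sketch, countEvents and the StateSketch namespace are hypothetical names introduced for illustration.

```lean
namespace StateSketch

def increment (n : Nat) : Nat :=
  n + 1

-- "Updating" a counter twice is just applying the function twice.
#eval increment (increment 0)   -- 2

-- `countEvents` is a hypothetical helper: the running count is passed
-- from step to step by the fold, never stored in a shared variable.
def countEvents (events : List String) : Nat :=
  events.foldl (fun count _ => increment count) 0

#eval countEvents ["login", "click", "logout"]   -- 3

-- Same input, same output, so the fact can be stated and checked once.
example : increment (increment 0) = 2 := rfl

end StateSketch
```

Because increment depends only on its argument, the final example holds no matter when or where it is evaluated.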

Lean’s Types Go Beyond Python’s Hints

Python’s type hints improve code readability and IDE support, but the interpreter doesn’t enforce them at runtime; only external checkers such as mypy act on them. Lean’s type system, however, serves a deeper purpose: it enables mathematical verification. Python types are helpful suggestions; Lean types can encode logical truths.

For example, Python allows you to hint that a parameter should be a string:

def greet(name: str) -> str:
    return "Hello " + name

But Lean can express much stronger guarantees. Consider a function that requires a non-empty list:

def safe_head (xs : List α) (h : xs ≠ []) : α :=
  xs.head h

Here, the extra parameter h : xs ≠ [] is a proof that the list isn’t empty. Every caller must supply it, so an empty list is ruled out at compile time. Python would only catch this error at runtime, if at all. Lean’s type system can also express:

Numbers that are strictly positive
Functions that always terminate
Data structures that meet specific invariants

This is where Lean stops feeling like traditional programming and starts resembling formal reasoning.
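
A short sketch of what calling such functions looks like in practice. The PosNat type, the value three, and the TypesSketch namespace are hypothetical names used only for illustration; safe_head is the definition shown above, with the type parameter written out explicitly.

```lean
namespace TypesSketch

def safe_head {α : Type} (xs : List α) (h : xs ≠ []) : α :=
  xs.head h

-- The caller supplies the proof; `simp` can discharge it for a literal list.
#eval safe_head ([1, 2, 3] : List Nat) (by simp)   -- 1

-- A hypothetical type of strictly positive numbers:
-- a Nat bundled with a proof that it is greater than zero.
abbrev PosNat := { n : Nat // 0 < n }

def three : PosNat := ⟨3, by decide⟩

#eval three.val   -- 3

-- Expected to be rejected, since `0 < 0` has no proof:
-- def zero : PosNat := ⟨0, by decide⟩

end TypesSketch
```

In both cases the invariant travels with the value, so downstream code never has to re-check it.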

Proofs in Lean Feel Like Interactive Debugging

One of Lean’s most surprising aspects is how writing proofs resembles debugging. Suppose you define a function to compute the length of a list:

def length : List α → Nat
  | [] => 0
  | _ :: xs => 1 + length xs

This function handles two cases:

An empty list returns 0
A non-empty list (denoted by _ :: xs) increments the length by 1 and recurses
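
The definition can be checked on a few inputs. The sketch wraps it in an illustrative LengthSketch namespace so it doesn’t clash with the standard library’s List.length, and writes the type parameter out explicitly.

```lean
namespace LengthSketch

def length {α : Type} : List α → Nat
  | [] => 0
  | _ :: xs => 1 + length xs

#eval length ["a", "b", "c"]   -- 3
#eval length ([] : List Nat)   -- 0

-- The definition unfolds to 1 + (1 + (1 + 0)), which reduces to 3.
example : length ["a", "b", "c"] = 3 := rfl

end LengthSketch
```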

Now, imagine proving a property about lists, such as the fact that reversing a list leaves its length unchanged. Using the standard library’s List.length, you’d write:

theorem reverse_length (xs : List α) : (xs.reverse).length = xs.length := by simp

Lean doesn’t just run the code—it interactively guides you through the proof process. You’ll see:

The current assumptions
The proof goals remaining
Suggestions for next steps, if you ask for them with tactics such as exact? or apply?

This creates a workflow akin to Python’s REPL-driven development, but instead of debugging runtime behavior, you’re debugging logical reasoning. It’s a shift from “does this work?” to “can I prove this works?”
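
To give a feel for that interaction, the sketch below proves a small variation of the reversed-length property in two steps. The comments show roughly what the goal looks like at each point; the exact rendering depends on your editor and Lean version.

```lean
example (xs ys : List Nat) (h : xs = ys) :
    xs.reverse.length = ys.length := by
  -- goal: xs.reverse.length = ys.length
  rw [List.length_reverse]
  -- goal: xs.length = ys.length
  rw [h]
  -- the goal is now ys.length = ys.length, which `rw` closes by reflexivity
```

Placing the cursor after each line in an editor with Lean support shows the remaining goal, much like inspecting variables at a breakpoint.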

Tactics: Building Proofs Step by Step

Lean proofs are constructed using tactics—small, reusable steps that guide the proof assistant. For beginners, several tactics are particularly useful:

simp simplifies expressions using known rules
rw rewrites terms using equalities
intro introduces assumptions
exact provides a complete proof directly
apply uses a theorem to solve the current goal
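
Here is a small sketch that uses each of these tactics once. The propositions and hypothesis names (p, q, hp, hpq, h) are arbitrary placeholders chosen for illustration.

```lean
-- intro and exact: move hypotheses into the context, then use one of them.
example (p q : Prop) : p → q → p := by
  intro hp    -- hp : p is now an assumption; goal: q → p
  intro _     -- goal: p
  exact hp

-- apply: reduce the goal q to the premise of hpq.
example (p q : Prop) (hp : p) (hpq : p → q) : q := by
  apply hpq   -- goal becomes p
  exact hp

-- rw: rewrite with a known equation.
example (a b : Nat) (h : a = b) : a + 0 = b := by
  rw [Nat.add_zero]   -- goal: a = b
  exact h

-- simp: let the simplifier find the relevant library lemmas.
example (xs : List Nat) : (xs ++ []).length = xs.length := by
  simp
```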

Consider a related property: reversing a list twice gives back the original list.

theorem reverse_reverse (xs : List Nat) : xs.reverse.reverse = xs := by simp

Here, simp tells Lean to simplify the expression using standard library rules. Under the hood, Lean applies known theorems about list reversal, proving the statement automatically. More complex proofs chain several tactics together, while some goals need even less than simp:

theorem add_zero (n : Nat) : n + 0 = n := by rfl

This theorem states that adding zero to any natural number leaves it unchanged. The proof uses rfl, which stands for reflexivity: a tactic that proves goals where both sides are definitionally equal. It works here because addition on Nat is defined by recursion on its second argument, so n + 0 reduces to n by definition. While this example is trivial, the same style of machine-checked proof has been used in proof assistants to verify compilers, operating system kernels, and large bodies of mathematics.
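
Definitional equality is narrower than it may look. As a sketch, swapping the operands gives a statement that is still true but no longer holds by mere unfolding, so it needs a library lemma. The RflSketch namespace is an illustrative wrapper.

```lean
namespace RflSketch

-- Holds by definition: addition on Nat recurses on its second argument.
theorem add_zero (n : Nat) : n + 0 = n := by rfl

-- With the operands swapped, the definition cannot unfold for an unknown n,
-- so a plain `rfl` is expected to fail here. `simp` finds the library
-- lemma Nat.zero_add instead.
theorem zero_add (n : Nat) : 0 + n = n := by simp

end RflSketch
```

Noticing which facts hold by computation and which need an argument is a large part of learning to write proofs in Lean.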

A New Perspective on Programming

Lean 4 isn’t a replacement for Python. It’s a complementary tool for situations where correctness matters more than flexibility. Python remains unmatched for rapid prototyping, data analysis, and scripting. But when you need to prove your software works as intended—whether for mission-critical systems, mathematical research, or educational tools—Lean offers a rigorous alternative.

For Python developers, the learning curve may feel steep at first. But the payoff is a deeper understanding of programming as a discipline rooted in logic and precision. As formal methods gain traction in industry, skills in theorem proving and functional programming become increasingly valuable. The next time you find yourself writing a test that feels incomplete, consider Lean. It might just change how you think about code—and correctness—forever.
