DetectEquivalentPrograms Decision Problem

Ainslee Plesko, Coltin Colucci, Alexander Goddard, and Preston Smith

September 8, 2025

Introduction

  • Goal: Build an Equivalent Program Detection System
  • Input: Program 1 and Program 2
  • Output: “Yes” or “No”
  • Motivation: Program runtime efficiency

Theory of computation

  • What is the theory of computation?
    • Understanding what can be computed
    • Analyzing computational complexity
    • Proving limits of computation
    • “Proofgrammers” combine proofs and programming

Equivalence Detector

How does it work?

  • detect_equivalence(P1: str, P2: str)
    • get_syntax_tree(program_one, program_two)
    • standardize_naming(source: ast.AST)
    • same_string(str_one: str, str_two: str)
    • same_error(error_one: Exception, error_two: Exception)

Takes two programs as strings and returns “Yes” if it finds them equivalent, “No” otherwise.
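
One way these pieces might fit together is sketched below. It is an assumption about the pipeline, not the presenters' exact code: _rename is a compact, hypothetical stand-in for standardize_naming, the comparison is done on the unparsed source text, and ast.unparse requires Python 3.9 or later.

```python
import ast


def _rename(tree: ast.AST) -> ast.AST:
    """Hypothetical stand-in for standardize_naming: number names in ast.walk order."""
    mapping = {}
    for node in ast.walk(tree):
        # Each node type stores its identifier under a different attribute.
        if isinstance(node, ast.Name):
            attr = "id"
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            attr = "name"
        elif isinstance(node, ast.arg):
            attr = "arg"
        else:
            continue
        original = getattr(node, attr)
        # The first name seen becomes "2", the next "3", and so on.
        mapping.setdefault(original, str(len(mapping) + 2))
        setattr(node, attr, mapping[original])
    return tree


def detect_equivalence(p1: str, p2: str) -> str:
    """Return "Yes" if the two programs match after renaming, "No" otherwise."""
    try:
        normalized = [ast.unparse(_rename(ast.parse(src))) for src in (p1, p2)]
    except SyntaxError:
        # Assumption: a program that does not parse is reported as not equivalent.
        return "No"
    return "Yes" if normalized[0] == normalized[1] else "No"


program_a = "def add(x, y):\n    return x + y\nprint(add(1, 2))"
program_b = "def plus(first, second):\n    return first + second\nprint(plus(1, 2))"
program_c = "def add(x, y):\n    return y + x\nprint(add(1, 2))"

print(detect_equivalence(program_a, program_b))  # Yes: only the names differ
print(detect_equivalence(program_a, program_c))  # No: the operands are swapped
```

The third program computes the same sum as the first, yet the sketch answers “No” because the comparison is purely textual after renaming.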

Standardize Naming Function

import ast

def standardize_naming(source: ast.AST) -> ast.AST:
    """
    Standardizes all variable, function, and class names in the AST to numbered placeholders.
    This helps ignore differences in naming conventions when comparing program structure.
    """
    used_name = {}  # Maps original names to standardized numbered names
    name_id = 1  # Last number handed out; incremented before use, so names start at "2"
    # Walk through all nodes in the AST
    for node in ast.walk(source):
        # Standardize variable names
        if isinstance(node, ast.Name):
            if node.id in used_name:
                node.id = used_name[node.id]
            else:
                name_id += 1
                used_name[node.id] = str(name_id)
                node.id = str(name_id)
        # Standardize function and class names
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            if node.name in used_name:
                node.name = used_name[node.name]
            else:
                name_id += 1
                used_name[node.name] = str(name_id)
                node.name = str(name_id)
        # Standardize function argument names
        if isinstance(node, ast.arg):
            if node.arg in used_name:
                node.arg = used_name[node.arg]
            else:
                name_id += 1
                used_name[node.arg] = str(name_id)
                node.arg = str(name_id)
    return source

Output of Standardize Naming

Input

def calculate_sum(numbers):
  toal = 0
  for n in numbers:
    toal += n
  if toal > 10:
    print("Large sum")
  else:
    print("Small sum")
  return toal

calculate_sum([1, 2, 3, 7])

Output

def 2(3):
    4 = 0
    for 5 in 3:
        4 += 5
    if 4 > 10:
        6('Large sum')
    else:
        6('Small sum')
    return 4
2([1, 2, 3, 7])

Detect Equivalence Program
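
The Conclusion describes the detector as running both programs and checking whether they produce the same output or fail in the same way. A minimal sketch of that run-and-compare step follows. The helpers run_program and same_behavior are hypothetical names, the bodies of same_string and same_error are guesses at what those functions do, and the sketch assumes both programs halt and are trusted enough to pass to exec.

```python
import contextlib
import io
from typing import Optional, Tuple


def run_program(source: str) -> Tuple[str, Optional[Exception]]:
    """Hypothetical helper: execute source, returning captured stdout and any exception."""
    buffer = io.StringIO()
    error = None
    with contextlib.redirect_stdout(buffer):
        try:
            # exec runs arbitrary code; only use it on trusted programs.
            exec(source, {"__name__": "__main__"})
        except Exception as exc:
            error = exc
    return buffer.getvalue(), error


def same_string(str_one: str, str_two: str) -> bool:
    return str_one == str_two


def same_error(error_one: Exception, error_two: Exception) -> bool:
    # Assumption: two failures match when exception type and message agree.
    return type(error_one) is type(error_two) and str(error_one) == str(error_two)


def same_behavior(p1: str, p2: str) -> str:
    """Return "Yes" if both programs print the same text or fail the same way."""
    out_one, err_one = run_program(p1)
    out_two, err_two = run_program(p2)
    if (err_one is None) != (err_two is None):
        return "No"  # exactly one program failed
    if err_one is not None and not same_error(err_one, err_two):
        return "No"  # both failed, but differently
    return "Yes" if same_string(out_one, out_two) else "No"


print(same_behavior("print(1 + 2)", "print(3)"))  # Yes
print(same_behavior("print(1)", "print(1 / 0)"))  # No
```

A check like this covers only the single run it performs; it says nothing about inputs that were not tried.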

Tractable vs Intractable vs Uncomputable

  • Tractable
    • “can be solved efficiently”
    • computable in theory AND in practice
  • Intractable
    • “method for solving exists, but is hopelessly time consuming”
    • computable in theory BUT (maybe) not in practice
  • Uncomputable
    • “cannot be solved by any computer program”
    • NOT computable in theory or practice

Tractable vs Intractable vs Uncomputable (Part 2)

  • Tractable (in restricted cases)
    • runs in polynomial time
      • “The complexity class Poly is the set of computational problems that can be solved by a Python program with running time in \(O(n^k)\), for some \(k \geq 0\).”
  • Uncomputable (in the general case)
    • no running-time bound applies, because no program decides every case
      • as the inputs grow in complexity, so does the work needed to compare the programs
      • with an infinite input space, exhaustive testing never finishes

Limitations

  • The major limitation of our program is that the two input programs must have the same structure
    • If the two programs have different structures but produce the same results for all inputs, we get a false negative
  • If you change the inequality in program P1 from \(toal > 10\) to \(10 < toal\), the detector now answers “No” even though the programs are still equivalent
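
The false negative above can be reproduced with the standard ast module alone. The snippet is an illustrative sketch, not part of the detector: it compares the two spellings of the condition structurally, then evaluates both on a sample of values.

```python
import ast

# Two spellings of the same condition.
left = ast.parse("toal > 10", mode="eval")
right = ast.parse("10 < toal", mode="eval")

# The trees differ in operand order and operator (Gt vs Lt), so a purely
# structural comparison reports a mismatch.
structurally_equal = ast.dump(left) == ast.dump(right)

# Yet the two expressions agree on every sampled value.
agree_on_samples = all(
    eval(compile(left, "<left>", "eval"), {"toal": v})
    == eval(compile(right, "<right>", "eval"), {"toal": v})
    for v in range(-5, 25)
)

print(structurally_equal, agree_on_samples)  # False True
```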

Why It’s Not Possible in the General Case

  • There are infinitely many ways to structure a program that produces the same results, so it is infeasible to add a workaround for every pair of structures that differ but are equivalent.
  • It is not feasible to run every possible input through both programs: with infinitely many inputs, the testing would never finish.
  • If one of the programs never halts on some input, its output can never be compared with the other program’s.
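
The second point can be illustrated with a small sketch: testing a finite sample of inputs can disprove equivalence, but it can never prove it. All names below are hypothetical and exist only for this illustration.

```python
from typing import Callable, Iterable, Optional


def find_counterexample(
    f: Callable[[int], int],
    g: Callable[[int], int],
    inputs: Iterable[int],
) -> Optional[int]:
    """Return the first tested input on which f and g disagree, or None."""
    for x in inputs:
        if f(x) != g(x):
            return x
    return None


def double(x: int) -> int:
    return 2 * x


def double_until_1000(x: int) -> int:
    # Hypothetical program that matches double only below 1000.
    return 2 * x if x < 1000 else 0


# No disagreement on the first 1000 inputs...
print(find_counterexample(double, double_until_1000, range(1000)))  # None
# ...but testing further exposes one.
print(find_counterexample(double, double_until_1000, range(2000)))  # 1000
```

However large the tested range, a disagreement could always sit just beyond it, which is why a result of None is not evidence of equivalence.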

Conclusion

  • detect_equivalence works by running both programs on the same input and checking if they both produce the same output or fail in the same way.
  • This approach only works for restricted cases; in the general case, determining program equivalence is uncomputable.
  • Our implementation works for simple programs where all possible inputs can be tested, but it cannot handle complex programs with infinite input spaces.
  • This challenge demonstrates that while tools can help with specific cases, fully determining program equivalence for all programs is uncomputable.