Text Error Correction System
Aidan Dyga, Ainslee Plesko, Grant Anderson, Issei Hasegawa
September 1, 2025
Introduction
Goal:
Build a Text Error Correction System
Input:
Original (un-mutated) text file
Output:
Encoded text in which errors can be detected after mutation/transmission
Motivation:
Ensure data integrity over unreliable channels
Problem
Problem Statement:
Data can be corrupted during transmission/storage
Need a way to detect (and possibly correct) errors in text
Challenges:
Random mutations (bit flips, character changes)
Efficient encoding/decoding
Example Solution
Encode each line
Every line is duplicated, and the copy is stored in an encoded form so that later mutations can be detected
Message Decoding
After the transfer, the encoded copy of each line is decoded for comparison
Line Verification
Each line is compared with its decoded copy and corrected where the two disagree
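The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which encoding is used, so a hex encoding of the UTF-8 bytes stands in for it, and the names encode_line, decode_copy, and verify_line are hypothetical. The sketch detects a mismatch between the two copies; deciding which copy to trust needs extra information, such as a checksum.

```python
from typing import Optional, Tuple


def encode_line(line: str) -> Tuple[str, str]:
    """Return the line and a hex-encoded duplicate (assumed encoding)."""
    return line, line.encode("utf-8").hex()


def decode_copy(encoded: str) -> Optional[str]:
    """Decode the hex duplicate; return None if it no longer decodes."""
    try:
        return bytes.fromhex(encoded).decode("utf-8")
    except ValueError:  # invalid hex digits or invalid UTF-8
        return None


def verify_line(plain: str, encoded: str) -> bool:
    """True when the plain line and its decoded duplicate agree."""
    return decode_copy(encoded) == plain


plain, encoded = encode_line("hello world")
print(verify_line(plain, encoded))          # copies agree
print(verify_line("hellp world", encoded))  # mutated plain copy is detected
```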
Python Implementation
Why is this a Tractable Problem?
This problem is tractable because it can be solved in time linear in the input size.
Linear Time Complexity
The system runs in O(n·L) time, where n is the number of lines and L is the maximum line length, so the running time is proportional to the input size and scales efficiently.
Local Verification
Errors are detected and corrected per line using checksums and redundancy, avoiding complex global computation.
Feasible Error Model
Under a simple error model, such as corruption of only one copy of a line, correction is straightforward and does not require intractable algorithms.
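A minimal sketch of per-line correction with a checksum and a redundant copy, assuming CRC32 (from Python's standard zlib module) as the checksum and assuming the checksum itself arrives intact. The slides do not name a checksum, and the function names protect and recover are hypothetical. Each line is handled in one pass over its bytes, so n lines of length at most L cost O(n·L) in total.

```python
import zlib
from typing import Optional, Tuple


def protect(line: str) -> Tuple[bytes, bytes, int]:
    """Return two copies of the line's bytes plus a CRC32 checksum."""
    data = line.encode("utf-8")
    return data, data, zlib.crc32(data)


def recover(copy_a: bytes, copy_b: bytes, checksum: int) -> Optional[str]:
    """Return the first copy whose CRC32 matches, or None if neither does."""
    for candidate in (copy_a, copy_b):
        if zlib.crc32(candidate) == checksum:
            return candidate.decode("utf-8")
    return None


copy_a, copy_b, checksum = protect("data integrity")
corrupted = b"dbta integrity"  # simulated single-copy corruption
print(recover(corrupted, copy_b, checksum))
```

If both copies are corrupted, recover returns None: the error is detected but cannot be corrected under this scheme.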
Limitations
Homophones
The program may not be able to correct words that are spelled correctly but used in the wrong grammatical context.
Because they are spelled correctly, they may not get flagged by the checker.
Ex: ‘their’ vs ‘there’
Numerical Errors
Because the corrector targets textual errors such as misspellings, a wrong number may not be caught.
Ex: ‘1889’ vs ‘1989’
Incorrect Proper Nouns
Unless a proper noun is included in the program’s dictionary, it might get flagged as an error.
A misspelled proper noun that happens to spell another valid word may also be missed.
Structural or Logical Errors
Although the program can check text at the sentence level, it cannot check the coherence or flow of an entire paragraph.
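The homophone, numerical, and proper-noun limitations can be illustrated with a toy dictionary lookup. This is a hypothetical sketch, not the program described in these slides: the word list DICTIONARY and the function flag_unknown_words are assumptions made for the example.

```python
import re

# Tiny stand-in word list; a real checker would load a full dictionary.
DICTIONARY = {"their", "there", "house", "is", "over", "the", "built", "in", "was"}


def flag_unknown_words(text, dictionary=DICTIONARY):
    """Return alphabetic tokens not found in the dictionary (case-insensitive)."""
    words = re.findall(r"[A-Za-z']+", text)
    return [w for w in words if w.lower() not in dictionary]


# Homophones swapped: every word is valid, so nothing is flagged.
print(flag_unknown_words("there house is over their"))
# Numbers are not words, so a wrong year passes unnoticed.
print(flag_unknown_words("the house was built in 1989"))
# A proper noun missing from the dictionary is flagged as an error.
print(flag_unknown_words("Hasegawa built the house"))
```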
Conclusion
Text error correction is essential for reliable communication
Simple codes can efficiently detect errors
Tractable due to low computational complexity
More advanced codes can also correct errors, not just detect them