Brock TR # CS-05-07 Abstract

Bounds on Optimal Edit Metric Codes    [PDF]
S.K. Houghten, D. Ashlock and J. Lennarz, July 2005.

The edit distance between two strings is the minimal number of substitutions, deletions, or insertions required to transform one string into another. An error correcting code over the edit metric includes features from deletion-correcting codes as well as the more traditional codes de- fined using Hamming distance. Applications of edit metric codes include the creation of robust tags over the DNA alphabet. While codes over the edit metric are analogous to similar codes over the Hamming metric, little of the beautiful theory survives. The block structure of a word is its partition into maximal subwords composed of a single character. The size of a sphere about a word in the edit metric is heavily dependent on the block structure of the word, creating a substantial divergence from the theory for the Hamming metric.

This paper explores the theory underlying edit metric codes for small alphabets. An optimal code is one with the maximum possible number of codewords for its length and minimum distance. We provide tables of bounds on code sizes for edit codes with short length and small alphabets. We present several heuristics for constructing codes.