Figure 2.
Information-theoretic model of OGM as a random codeword transmitted over a noisy communication channel. (a) Flowchart of the OGM communication channel. (b) Example model parameters, explained in Section 2.1: B is the bin size, L is the DNA fragment length, n is the number of bins in the DNA fragment, and G is the genome length (kept very short here for illustration). The number of messages (the codebook size) M is the number of possible position offsets of a codeword in the binned genome. Also shown are the per-bin counts of pattern occurrences in the genome sequence (x) and of labels in the DNA fragment image (y).
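As a rough illustration of how the quantities in panel (b) relate, the following minimal Python sketch computes n, M, and a vector of bin counts. It rests on assumptions not stated in the caption: that the genome is linear, that B divides L and G evenly, that a fragment lies entirely within the genome, and that offsets are counted in whole bins, so that M = G/B - n + 1. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
from typing import List


def num_bins(length: int, bin_size: int) -> int:
    """Number of whole bins of size `bin_size` covering `length` (assumes exact division)."""
    return length // bin_size


def codebook_size(genome_len: int, fragment_len: int, bin_size: int) -> int:
    """Number of whole-bin offsets at which a fragment fits inside a linear genome.

    Assumption (not from the source): M = G/B - n + 1.
    """
    genome_bins = num_bins(genome_len, bin_size)
    n = num_bins(fragment_len, bin_size)
    return genome_bins - n + 1


def bin_counts(positions: List[int], bin_size: int, n_bins: int) -> List[int]:
    """Count how many positions (pattern occurrences or labels) fall in each bin."""
    counts = [0] * n_bins
    for p in positions:
        idx = p // bin_size
        if 0 <= idx < n_bins:  # ignore positions outside the binned range
            counts[idx] += 1
    return counts


# Hypothetical toy values: G = 100, L = 40, B = 10
n = num_bins(40, 10)              # bins per fragment
M = codebook_size(100, 40, 10)    # candidate offsets (messages)
y = bin_counts([3, 12, 15, 38], bin_size=10, n_bins=n)
print(n, M, y)
```

Under these toy values, the fragment spans 4 bins and can sit at 7 distinct offsets in a 10-bin genome; the same `bin_counts` helper could produce x from pattern positions in the genome or y from label positions in a fragment image.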