Examples

DNA string
Reverse complement
Inexact searches (or distance searches)

The dna_string type uses basic_dna_string with the most efficient C++ type.

dna_string is a container class like std::string that efficiently stores raw DNA sequences without markup.

The DNA bases A, C, G and T are represented using two bits per base and are accessed 64 bits at a time, or more depending on the architecture.
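To make the two-bits-per-base idea concrete, here is a minimal sketch of packing up to 32 bases into a single 64-bit word. It is not the library's implementation: the helper names (base_code, pack_bases, unpack_base), the A=0, C=1, G=2, T=3 encoding and the choice to store the first base in the highest bits are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical 2-bit code for one base: A=0, C=1, G=2, T=3.
// (Assumed encoding for illustration; the library's own mapping may differ.)
inline std::uint64_t base_code(char base) {
    switch (base) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default: throw std::invalid_argument("only A, C, G or T allowed");
    }
}

// Pack up to 32 bases into one 64-bit word, first base in the highest bits.
// Unused trailing positions are left as zero, which reads back as 'A'.
inline std::uint64_t pack_bases(const std::string& bases) {
    assert(bases.size() <= 32 && "a 64-bit word holds at most 32 bases");
    std::uint64_t word = 0;
    for (std::size_t i = 0; i < bases.size(); ++i) {
        word |= base_code(bases[i]) << (62 - 2 * i);
    }
    return word;
}

// Read back the base stored at position `index` (0..31) of a packed word.
inline char unpack_base(std::uint64_t word, std::size_t index) {
    assert(index < 32);
    return "ACGT"[(word >> (62 - 2 * index)) & 3];
}
```

With this layout a 32-base read occupies 8 bytes instead of 32, so a linear scan touches a quarter of the memory that a one-byte-per-base string would.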

The packed representation makes it possible to use bit-twiddling tricks that speed up linear searches and reduce level 1 and level 2 cache loads.
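One example of such a trick, sketched here under assumptions rather than taken from the library's source, is comparing 32 bases in a handful of instructions. Given two 64-bit words that each hold 32 bases at two bits per base, an XOR leaves a non-zero two-bit field wherever the bases differ; folding each field down to a single flag bit and counting the flags gives the number of mismatches. The function name count_mismatches is hypothetical.

```cpp
#include <bitset>
#include <cstdint>

// Count the positions at which two packed 32-base words differ.
// Each base occupies one aligned 2-bit field of the 64-bit word.
inline int count_mismatches(std::uint64_t a, std::uint64_t b) {
    // A 2-bit field of `diff` is non-zero exactly where the bases differ.
    const std::uint64_t diff = a ^ b;
    // OR the high bit of each field into its low bit, then keep only the
    // low bits, leaving one flag bit per mismatching base.
    const std::uint64_t flags = (diff | (diff >> 1)) & 0x5555555555555555ULL;
    // Population count of the flags is the number of mismatches.
    return static_cast<int>(std::bitset<64>(flags).count());
}
```

A result of zero means all 32 positions match. Allowing a small non-zero count instead is one way a distance-based (inexact) comparison could be built on the same packed words.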

#include <iostream>
// The library header that declares boost::genetics::dna_string is also required.

void simple_substring_search()
{
    using namespace boost::genetics;

    // Make a DNA string. Only A, C, G or T allowed.
    dna_string str("ACAGTACGTAGGATACAGGTACA");

    // Search the string for a substring.
    dna_string GATACA = "GATACA";
    size_t gataca_pos = str.find(GATACA);

    // find() returns dna_string::npos when the substring is absent.
    if (gataca_pos != dna_string::npos) {
        std::cout << "'GATACA' found at location " << gataca_pos << "\n";
    }
} // void simple_substring_search()
