Brief Information about the March '08 CSIG Meeting

State Methods and Diagrams

C++ Version 7 and Visual Studio 2005

Written by B. Arnold

State Diagram

Welcome to the CSIG, a Special Interest Group of the ACGNJ. The subject for this month is the "State Method" of designing and coding programs in C++. Using this concept, we will create a utility that converts a general text file into a tab-separated data file suitable for importing into a MySQL database. The utility is built with the latest C++ compiler in Microsoft's Visual Studio 2005. There are a number of ways to refer to this compiler and its code. Here's what Wikipedia says:

The Common Language Infrastructure (CLI) is an open specification developed by Microsoft that describes the executable code and runtime environment that form the core of the Microsoft .NET Framework. The specification defines an environment that allows multiple high-level languages to be used on different computer platforms without being rewritten for specific architectures.

Microsoft .NET Framework 2.0
C++ 7.0
.NET 2.0
CLI
Common Language Infrastructure
Managed

Even the smallest of programs can benefit from the STATE METHOD. It is simply a way of looking at a problem and dividing it into smaller and smaller parts until each part is in its simplest form. Code can then be written for each part with little concern for the other parts. Finally all parts are combined.

As some of you may know, I am the webmaster for a Bible website that shows many Bibles. I was asked to add the Douday Rheims version to the database. I had a PDF file that was 1500 pages long. Adobe lets you save a PDF as a TXT file, which in this case came to about 100,000 lines. The source file consisted of multi-line paragraphs for each verse, with a page footer containing the Book and Chapter. In the target database, a record consists of four fields: Book, Chapter, Verse Number, and Verse Text. The MySQL database accepts as input a single tab-delimited line for each record. The problem was to convert the data from the one form to the other. Thus there would be over 100,000 conversions.
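To make the target format concrete, here is a minimal sketch of turning one record into a single tab-delimited line. It is written in standard C++ rather than the C++/CLI used in the actual program, and the names VerseRecord, SanitizeField, and ToTabSeparatedLine are hypothetical, chosen only for this illustration.

```cpp
#include <string>

// Hypothetical record type: one row of the target table.
struct VerseRecord {
    std::string book;
    int chapter;
    int verse;
    std::string text;
};

// A tab or line break inside a field would break the one-line-per-record
// layout, so each such character is replaced with a space.
std::string SanitizeField(std::string field) {
    for (char &c : field) {
        if (c == '\t' || c == '\n' || c == '\r') c = ' ';
    }
    return field;
}

// Joins the four fields with tabs, in the order Book, Chapter, Verse, Text.
std::string ToTabSeparatedLine(const VerseRecord &r) {
    return SanitizeField(r.book) + '\t' + std::to_string(r.chapter) + '\t' +
           std::to_string(r.verse) + '\t' + SanitizeField(r.text);
}
```

Each line produced this way holds exactly four fields, which is the shape a tab-delimited import expects.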

The problem was defined in terms of a STATE DIAGRAM. Using VS2005 in text mode I then mapped the State Diagram into C++ code to accomplish the conversion. By the way, the running time of the program is about 5 seconds.
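As a rough illustration of mapping a state diagram into code, the sketch below uses an enum for the states and a switch for the transitions. It is standard C++, not the C++/CLI of the real program, and it assumes a much simpler input than the real file: a line of digits starts a verse, the following non-blank lines are its text, and a blank line ends it. The names State, IsVerseNumber, and CollectVerses are hypothetical.

```cpp
#include <cctype>
#include <string>
#include <vector>

// The two bubbles of this simplified state diagram.
enum class State { Scanning, InVerse };

// True when the line is non-empty and consists of digits only.
bool IsVerseNumber(const std::string &line) {
    if (line.empty()) return false;
    for (unsigned char c : line) {
        if (!std::isdigit(c)) return false;
    }
    return true;
}

// Runs the state machine over the lines and returns one
// "verse<TAB>text" string per verse found.
std::vector<std::string> CollectVerses(const std::vector<std::string> &lines) {
    std::vector<std::string> out;
    State state = State::Scanning;
    std::string verse_no, verse_text;

    // Emits the verse collected so far, if it has any text.
    auto flush = [&]() {
        if (!verse_text.empty()) out.push_back(verse_no + '\t' + verse_text);
        verse_text.clear();
    };

    for (const std::string &line : lines) {
        switch (state) {
        case State::Scanning:               // Ignore everything until a verse number.
            if (IsVerseNumber(line)) {
                verse_no = line;
                state = State::InVerse;
            }
            break;
        case State::InVerse:
            if (line.empty()) {             // Blank line: the verse is complete.
                flush();
                state = State::Scanning;
            } else if (IsVerseNumber(line)) {   // Next verse starts immediately.
                flush();
                verse_no = line;
            } else {                        // Another line of verse text.
                if (!verse_text.empty()) verse_text += ' ';
                verse_text += line;
            }
            break;
        }
    }
    flush();                                // Emit a verse that runs to end of input.
    return out;
}
```

Each case handles only the lines that can arrive in that state, so a new line type or state can be added without disturbing the others.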

At the meeting, I will present a program that I have written using the State Method. Although the program is only a few pages long, it still benefits from the approach. The "bottom line" is that a program can usually be written faster and with fewer errors this way. Additionally, debugging becomes easier, since each state can be checked on its own.

Sample Code

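using namespace System;         // String, Console, Int32
using namespace System::IO;     // File, StreamReader

int Process(String ^infilename)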
{
    int verse_no=0, chapter=1, page=0, lookAheadPg=0, idx;
    String ^ verse_text = String::Empty, ^ book = String::Empty;
    enum {SCANNING, LINE_W_VERSE_NO, LINE_W_VERSE_TEXT,
        LINE_BLANK, LINE_FOOTER, LINE_PAGE, LINE_FF } state = SCANNING;

#define DOUDAY_RHEIMS   "The Douday Rheims Bible"
#define MYSQL_FORMAT    "{0,-16}\t{1,3}\t{2,3}\t{3}\t"

    if (!File::Exists(infilename)) 
    {
        Console::WriteLine("{0} does not exist.", infilename);
        return 1;
    }
    StreamReader ^sr = File::OpenText(infilename);      // Work file.
    StreamReader ^sr2 = File::OpenText(infilename);     // Look ahead file.
    String ^input, ^input2;
    int token1, line_count = 0, pages = 0;
    while ((input=sr->ReadLine())!=nullptr) 
    {
        ++line_count;
        // test code -- if (input->Contains("Douday Rheims")) Console::WriteLine(input);

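        // Classify the current line. If no test matches, the state left by the
        // previous line carries over.
        if (Int32::TryParse(input, token1))         state = LINE_W_VERSE_NO;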
        else if (input->Length == 0)                state = LINE_BLANK;
        else if (input->StartsWith("Page"))         state = LINE_PAGE;
        else if (input->Contains(DOUDAY_RHEIMS))    state = LINE_FOOTER;
        else if (input->Contains("\f"))             state = LINE_FF;

        switch (state)
        {
. . . .
        case LINE_W_VERSE_TEXT:
            verse_text += input;
            verse_text += " ";
            break;

        case LINE_BLANK:
            break;

        case LINE_FOOTER:
            idx = input->IndexOf(DOUDAY_RHEIMS);
            book = input->Substring(0, idx);
            chapter = GetLastInt32(book);
            book = StripLastToken(book);
            state = SCANNING;
            // Special case follows since some (!) footers have page nos.
            if (input->Contains(" Page "))
            {
                idx = input->IndexOf(" Page ") + 6;
                page = GetFirstInt32(input->Substring(idx));
            }
            break;

        case LINE_FF:
            if (verse_text->Length > 0)
            {
                Console::WriteLine(MYSQL_FORMAT, book, chapter, verse_no, verse_text );
                verse_text = String::Empty;
            }
            state = SCANNING;
            break;
        }
    }
    sr->Close();
    sr2->Close();       // The look-ahead reader was opened above and must be closed too.
    Console::WriteLine("{0} lines read.", line_count);
    return 0;
}
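The excerpt calls three helpers, GetLastInt32, StripLastToken, and GetFirstInt32, whose bodies are not shown. The sketch below is only a guess at what they might do, inferred from their names and from how the footer case uses them. It is written in standard C++ with std::string rather than C++/CLI, and the behavior on edge cases (no digits found returns 0, no overflow check) is an assumption.

```cpp
#include <cctype>
#include <string>

// Assumed behavior: value of the first run of digits in s, or 0 if none.
// No overflow check is made.
int GetFirstInt32(const std::string &s) {
    std::size_t i = 0;
    while (i < s.size() && !std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
    int value = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) {
        value = value * 10 + (s[i] - '0');
        ++i;
    }
    return value;
}

// Assumed behavior: value of the last run of digits in s, or 0 if none.
int GetLastInt32(const std::string &s) {
    std::size_t end = s.size();
    while (end > 0 && !std::isdigit(static_cast<unsigned char>(s[end - 1]))) --end;
    std::size_t begin = end;
    while (begin > 0 && std::isdigit(static_cast<unsigned char>(s[begin - 1]))) --begin;
    int value = 0;
    for (std::size_t i = begin; i < end; ++i) value = value * 10 + (s[i] - '0');
    return value;
}

// Assumed behavior: remove the last whitespace-delimited token together
// with the whitespace around it, e.g. "Genesis 12 " becomes "Genesis".
std::string StripLastToken(const std::string &s) {
    std::size_t end = s.size();
    while (end > 0 && std::isspace(static_cast<unsigned char>(s[end - 1]))) --end;
    while (end > 0 && !std::isspace(static_cast<unsigned char>(s[end - 1]))) --end;
    while (end > 0 && std::isspace(static_cast<unsigned char>(s[end - 1]))) --end;
    return s.substr(0, end);
}
```

Under these assumptions, a footer prefix such as "1 Kings 3 " would yield chapter 3 from GetLastInt32 and book "1 Kings" from StripLastToken, which matches the order of the calls in the LINE_FOOTER case.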

"Random Access" questions start at 7:30 Tuesday night.

SOURCE CODE

Source Code Files

For help, email me at b a r n o l d @ i e e e . o r g