1. Using Flavor with C++

1.1 The Basics: "Hello Bits"

In the long-standing tradition of providing "Hello World" programs in regular programming languages and programming environments, we will first show how to write an equivalently simple program using Flavor. Instead of text output, however, we will write a program that reads a file, one character at a time. Of course, this is something where Flavor isn't really necessary, but it will help you get started and you can create sample data on your own very easily using a text editor.

Let's first define a description for a text (ASCII) file using Flavor. Such a file is just a sequence of 8-bit characters. We begin by treating each character as a separate object. The following is a Flavor description for such an object.

HelloBits.fl
// both C and C++ comments are allowed
class HelloBits {
    char(8) c;
}; // <-- the trailing ';' is optional

This declares a class called HelloBits, containing a single variable c. The variable is of type char, and is represented in the bitstream using 8 bits. Flavor supports all C++ types, as well as an additional type named bit, used for bit strings. Bit strings are defined using '0b' notation, such as 0b001 (similar to '0x' for hexadecimal numbers). An optional period can be used every four bits to enhance readability (e.g., 0b00.0111). In contrast to regular programming languages, the length of a value is as important as the value itself. As a result, bit strings also convey their length in addition to their value. The same is true for hexadecimal or octal numbers. For the C++ programmer, this semantic distinction is irrelevant: variables of type bit can be considered equivalent to unsigned integers.
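
To make the point about length concrete, here is a small C++ sketch that parses Flavor-style bit-string literals into a value and a length. The BitString type and the parse_bit_string() function are hypothetical helpers written for this illustration only; they are not part of the translator or of the run-time library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper type: a bit string carries both a value and a length.
struct BitString {
    std::uint32_t value;  // numeric value of the bits
    unsigned length;      // number of bits in the literal
};

// Parse a literal such as "0b001" or "0b00.0111". Periods are readability
// separators and are skipped. Throws std::invalid_argument on bad input.
inline BitString parse_bit_string(const std::string& text) {
    if (text.size() < 3 || text[0] != '0' || text[1] != 'b')
        throw std::invalid_argument("bit string must start with 0b");
    BitString result{0, 0};
    for (std::size_t k = 2; k < text.size(); ++k) {
        const char ch = text[k];
        if (ch == '.') continue;  // separator, carries no bits
        if (ch != '0' && ch != '1')
            throw std::invalid_argument("unexpected character in bit string");
        if (result.length == 32)
            throw std::invalid_argument("bit string longer than 32 bits");
        result.value = (result.value << 1) | static_cast<std::uint32_t>(ch - '0');
        ++result.length;
    }
    if (result.length == 0)
        throw std::invalid_argument("bit string has no digits");
    return result;
}
```

With this helper, 0b001 and 0b1 have the same value (1) but different lengths (3 and 1), which is the distinction that matters in a bitstream.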

Save this in a file called HelloBits.fl and run the translator on it by typing: flavorc HelloBits.fl. The translator will create a HelloBits.h file which contains the declaration of a class called HelloBits. This class will have just a single member variable (char c;) and two methods: get() and put(). These methods are responsible for getting data from a file and placing it in the class's variables, and also for taking data from these variables and placing it in a file.

In order for these two methods to work, however, you need to provide them with information on where to write or read their data. A trivial way to do that would be to just pass along to the HelloBits class or its methods a handle to a file (e.g., a file descriptor or FILE pointer). While this would certainly work, it would be seriously limiting. Consider for example the case where data is to be drawn from (or written to) a network connection. Clearly, the semantics of a file (e.g., in terms of error conditions) as well as buffering issues would be different from those of a network connection (i.e., a socket). Similarly, what if you wanted to write a multithreaded program that would read from the same source concurrently? For these reasons, there is an extra layer between the translator and your code. This layer is the run-time library.

This library consists of the definition of a class called Bitstream. It provides elemental functions that the translator relies upon for implementing the get() and put() methods. A reference to a Bitstream is actually required in both methods. The benefit of using such a class is that a programmer can easily replace it (or derive from it) to implement the specific I/O architecture desired. As long as the replacement provides an identical interface, the translator will always generate correct code. More information on the run-time library is provided below.

Here is how the HelloBits class is declared in the generated code.

HelloBits C++ Class
class HelloBits {
public:
    char c;
    int get(Bitstream& bs);
    int put(Bitstream& bs);
};

If you look at the actual generated file, there is additional information that we will examine later on, but this is the basic interface between the Flavor-generated code and your own code: variables that have a parse size specification (called parsable variables) become class members, and you also have the translator-generated get() and put() methods. All of these members are declared public (we will see later on how this can be changed).
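
As a rough mental model of what the two methods do, the following sketch hand-writes them for HelloBits. ByteStream is a hypothetical in-memory stand-in for Bitstream, and getbits8() and putbits8() are assumed names; the real run-time library interface and the real generated code differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory stand-in for the run-time library's Bitstream.
// It only handles whole bytes, which is enough for char(8).
class ByteStream {
public:
    ByteStream() : pos_(0) {}
    explicit ByteStream(const std::vector<unsigned char>& data)
        : data_(data), pos_(0) {}

    // Read the next 8 bits; returns 0 if no data is left (assumed behaviour).
    unsigned getbits8() {
        return pos_ < data_.size() ? data_[pos_++] : 0u;
    }
    // Append 8 bits to the stream.
    void putbits8(unsigned value) {
        data_.push_back(static_cast<unsigned char>(value & 0xFFu));
    }
    const std::vector<unsigned char>& bytes() const { return data_; }

private:
    std::vector<unsigned char> data_;
    std::size_t pos_;
};

// Hand-written approximation of the class generated from HelloBits.fl.
class HelloBitsSketch {
public:
    char c;  // the parsable variable char(8) c

    // Fill the member from the stream.
    int get(ByteStream& bs) {
        c = static_cast<char>(bs.getbits8());
        return 0;  // the meaning of the return value is an assumption
    }
    // Write the member back to the stream using the same 8-bit size.
    int put(ByteStream& bs) {
        bs.putbits8(static_cast<unsigned char>(c));
        return 0;
    }
};
```

Because the class only talks to the stream through two calls, any replacement that offers the same calls could be substituted without touching HelloBitsSketch, which is the idea behind keeping I/O in the run-time library.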

Let's now return to our example, and write a simple C++ program that will use the newly defined Flavor object. Here is a minimal program.

Using HelloBits.fl in C++ Code
#include <stdio.h>
#include <stdlib.h> // for exit()

// always include this
#include <flavor.h>

// include flavorc-generated code
#include "HelloBits.h"

int main(int argc, char *argv[])
{
    // check that we have an argument
    if (argc!=2) {
        fprintf(stderr,"Usage: %s data\n", argv[0]);
        exit(1);
    }

    // our input bitstream
    Bitstream bs(argv[1], BS_INPUT);

    // our HelloBits object
    HelloBits h;

    // get the data
    h.get(bs);

    // print them
    printf("HelloBits.c: %c\n", h.c);

    // done
    exit(0);
}

Notice first that we included the run-time library header file flavor.h. This just includes the associated header file defining the Bitstream class. If you were to write your own Bitstream class, you could safely ignore it. Second, we included the flavorc-generated file HelloBits.h. Inside the main() program, we first check that we have an argument. If so, we create a Bitstream object using the argument as the file name. Note that an additional argument is passed to the Bitstream constructor to indicate that this will be an input bitstream. For an output bitstream we would use the identifier BS_OUTPUT. We are then ready to declare our HelloBits object, and call its get() method.

If you compile and run this program, it will print the very first character of the file you provided as an argument. Note that in order to properly compile it, you must specify to your compiler the paths to the Flavor include directory, the Flavor library directory, and also request that the Flavor library itself (libflavor.a for UNIX or flavor.lib for Win32) is linked to your executable.

1.2 Handling Input and Output: "Hello More Bits"

What you would probably want to do, however, is to read the entire contents of the file. There are several ways to approach this. One is to consider each character as a separate object; another is to consider the entire file as an object.

One Character-One Object

In the first approach, you would wrap the get() call in a while loop. This would read data continuously until the end of file is reached. But how do you know that the end has been reached? In other words, what should be the terminating condition for your loop?

Most multimedia representation formats include an end-of-data indicator. As we are dealing with text files here, we cannot rely on a particular end of file marker. This means that the terminating condition is really the end of file encountered by the Bitstream class. This condition needs to be communicated somehow to your program.

An important thing to note is that the translator only handles bitstream syntax-related errors (discussed later on). It does not know and does not need to know anything about the internals of the actual I/O operation. The latter is directly handled by the Bitstream class and hence can be fully customized if needed.

In the available run-time library, there are two modes for error reporting: C++ exceptions, and traditional error query. The mode is determined by a compile-time option in the library (USE_EXCEPTION, defined in include/port.h). The preferred mode is exceptions, as it is more elegant and efficient. Unfortunately, almost all UNIX C++ compilers (including GNU) do not yet support exceptions even though the C++ standard has included them for some time now. As a result, exceptions are disabled in all UNIX distributions. If your compiler happens to support them (e.g., Sun's on Solaris), you will need to rebuild the run-time library after modifying the relevant line in port.h. The traditional mode involves querying the Bitstream class for its error status after the get() (or put()) call returns.

Here is the modified code from our previous example, this time reading the entire file, using exceptions.

"HelloBits" Using Exceptions
    HelloBits h;

    // get the data
    try {
        while (1) {
            h.get(bs);
            printf("HelloBits.c: %c\n", h.c);
        }
    }
    catch (Bitstream::EndOfData e) {
        exit(0); // end of file reached
    }
    catch (Bitstream::Error e) {
        // print error message
        fprintf(stderr, "%s: Error: %s\n", argv[0], e.getmsg());
        exit(1);
    }

EndOfData and Error are classes defined in the run-time library as part of Bitstream (see Run-Time Library).

Here is the same example, this time using the traditional method.

"HelloBits" Using Standard Error Reporting
    HelloBits h;

    // get the data
    while (1) {
        h.get(bs);
        if (bs.geterror() == E_NONE) {
            printf("HelloBits.c: %c\n", h.c);
        }
        else if (bs.geterror() == E_END_OF_DATA) {
            exit(0); // end of file reached
        }
        else {
            // print error message
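            // no exception object here: ask the Bitstream for the message
            // (assumes it provides getmsg(); see Run-Time Library)
            fprintf(stderr, "%s: Error: %s\n", argv[0], bs.getmsg());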
            exit(1);
        }
    }

E_NONE and E_END_OF_DATA are defined in Bitstream.h (see Run-Time Library). The major difference that you should be aware of is that the traditional approach delays error reporting. As the translator itself does not check for errors when calling the relevant members of the Bitstream class, Flavor objects may continue trying to read (or write) even after an error occurs. This is not a serious problem, as in most cases the bitstream syntax will fail, and the translator's own error reporting capabilities (discussed below) will come into play.

The reason why the translator was designed not to use error reports from the Bitstream class is efficiency: there is no need to check return values, and there is one parameter less to pass to the I/O functions. Hopefully very soon UNIX compiler implementors will catch up with the C++ specification and will support exceptions on their platforms.

The code above may look a bit lengthy, but we should note that it is the same regardless of the complexity of the HelloBits object. Even for objects where the end of data can be detected by the syntax, it is always a good idea to check for end of data conditions so that broken files are easily identified as such.

You can combine the two approaches together using #ifdef USE_EXCEPTION, so that a single source file will work with both environments (all distributed samples are written this way).
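
The following sketch shows one way such a dual-mode source file could be organized. MockBitstream, its nested EndOfData class, getchar8(), and read_all() are local stand-ins written for this illustration; only the names USE_EXCEPTION, geterror(), E_NONE, and E_END_OF_DATA are taken from the text, and their real definitions live in the run-time library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Local stand-ins that mimic the names used in the text.
enum { E_NONE = 0, E_END_OF_DATA = 1 };

class MockBitstream {
public:
    class EndOfData {};  // thrown at end of input when exceptions are enabled

    explicit MockBitstream(const std::string& data)
        : data_(data), pos_(0), err_(E_NONE) {}

    // Return the next character, or signal end of data in the selected mode.
    char getchar8() {
        if (pos_ >= data_.size()) {
#ifdef USE_EXCEPTION
            throw EndOfData();
#else
            err_ = E_END_OF_DATA;
            return 0;
#endif
        }
        return data_[pos_++];
    }
    int geterror() const { return err_; }

private:
    std::string data_;
    std::size_t pos_;
    int err_;
};

// Read every character from the stream; the same source compiles with or
// without USE_EXCEPTION defined.
inline std::string read_all(MockBitstream& bs) {
    std::string out;
#ifdef USE_EXCEPTION
    try {
        while (true) out += bs.getchar8();
    } catch (const MockBitstream::EndOfData&) {
        // end of data reached; fall through and return what was read
    }
#else
    while (true) {
        const char c = bs.getchar8();
        if (bs.geterror() != E_NONE) break;  // query status after each read
        out += c;
    }
#endif
    return out;
}
```

The caller of read_all() does not need to know which mode was compiled in; the conditional code is confined to the loop that drives the stream.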

All Characters in One Object

As we mentioned earlier, an alternative approach is to consider the entire file as an object. This means that we must modify the syntax for our HelloBits object. Here is a possible modification.

HelloFile.fl
class HelloFile {
    while (1) {
        char(8) c;
    }
}

Here we wrapped the declaration of c in an endless while loop. Note that, in contrast with traditional C++ and Java, Flavor allows actual code to be written together with the class declaration (all your familiar C++/Java for, do, while, switch, etc. statements are supported). Remember that the whole purpose of Flavor is to properly declare the parsable variables, as this is what defines the representation format. As a result, Flavor also does not support class methods or functions.

The loop above will not, of course, really be endless; as the translator makes calls to the Bitstream class to obtain values for c, it will eventually trigger an end-of-file condition. This can be picked up by the user's C++ code as we showed earlier.

The problem here is that we have no way of printing the value for c. When get() returns, c will contain just the last character of the file as the while loop will be embedded in the get() method. There are several ways to solve this: verbatim code, tracing, and arrays.

1.3 Verbatim Code

First, we can use Flavor's verbatim code feature to insert our own C++ code in the flavorc-generated code. There are four different types of verbatim code:

  1. Class declaration code. This is introduced using the delimiters %{ and %} and will be inserted by the translator in the class declaration (or in the global scope, outside of any class).
  2. Code that should go to both the get() and the put() method. This is introduced using the delimiters %*{ and %*}.
  3. Code that should go only to the get() method. The delimiters are %g{ and %g}.
  4. Code that should go only to the put() method. The delimiters here are %p{ and %p}.

In our case the third type is the one needed. Here is what the modified Flavor source will look like.

HelloFile.fl
class HelloFile {
    while (1) {
        char(8) c;
        %g{ printf("HelloFile.c: %c\n", c); %g}
    }
}

This will cause the printf to be called whenever a new character c is read from the file. Note that, using verbatim code, we can define additional variables or methods for our class. Also, we can switch our variables from public to protected or private. Here is how.

HelloFile.fl with Private Variables
class HelloFile {
%{ private: %} // use private variables
    while (1) {
        char(8) c;
        %g{ printf("HelloFile.c: %c", c); %g}
    }
}

Note that the get() and put() methods are always declared public.

1.4 Automatic Tracing

Printing information about a bitstream's content turns out to be extremely useful, particularly during development and debugging. Also, when two or more separate organizations are developing a specification and pursuing independent implementations, it is very useful to have traces of the files generated by each tool for comparison and debugging purposes. One could of course insert verbatim printf statements for all parsable variables, but this can quickly get out of hand.

The translator can automatically generate tracing code and insert it in the get() method (using the command line option -t). This means that, without modifying your program or Flavor source in any way, you can automatically create detailed traces of your files. Here is a trace from the above HelloFile.fl specification, when run on this HTML file.

Automatically Generated Trace from HelloFile.fl
   At Bit  Size    Value    Description
        0:                  begin HelloFile
        0:    8          3C c (60)
        8:    8          68 c (104)
       16:    8          74 c (116)
       24:    8          6D c (109)
       32:    8          6C c (108)
       40:    8          3E c (62)
       48:    8          0D c (13)
       56:    8          0A c (10)
       ...

As you can see, the trace output includes in each line: the bit position (starting from the beginning of the bitstream), the size of the quantity read, its value in hexadecimal, and a description. The first line indicates 'begin HelloFile', signalling the entry in the get() method of HelloFile. All other lines include the parsed c variable. The trace also includes the decimal value of the variable. If your ASCII is in top shape, you may be able to decipher here the string '<html>\r\n'.

The tracing output is directed to the standard output. Note that the translator does not print out the trace directly. Instead, it calls a quite simple trace function which is part of the run-time library. This means that you can very easily use your own tracing functions to customize both the output destination as well as formatting (more information is provided in the Run-Time Library).
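
As an illustration of customizing the trace, the sketch below formats one trace line in the same column layout as the listing above and returns it as a string, so that the caller decides where it goes. The name format_trace_line() and its parameters are hypothetical; the signature that the run-time library actually expects of a tracing function is described in the Run-Time Library documentation and may differ.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical formatter: bit position, size in bits, hexadecimal value,
// variable name, and the decimal value in parentheses.
inline std::string format_trace_line(unsigned at_bit, unsigned size,
                                     unsigned value, const char* name) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "%9u: %4u %11X %s (%u)",
                  at_bit, size, value, name, value);
    return std::string(buf);
}
```

Returning a string rather than printing directly makes it easy to send the trace to a log file or to compare traces produced by two implementations.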

1.4 Arrays

While the preceding techniques completely solved our output problem, the fact that our HelloFile object can contain only a single character is certainly limiting. The best solution is of course to use an array. Since we don't know the size of the array in advance, we will have to use partial arrays and load one element at a time. Here is the relevant Flavor code.

HelloFile.fl with Arrays
class HelloFile {
    int i=0;
    while (1) {
        char(8) c[[i++]];
    }
}

The double bracket notation indicates that this is a declaration of just one element of the array. We use the variable i to load each element of the array in the right position. This, however, only partially solves the problem of the unknown file size. While Flavor deals with arrays of dynamically varying sizes, the translator generates arrays of a fixed size. This generates faster code, and also avoids problems with inconsistent handling of fundamental and derived types.

The default array size used by the translator is 64, and it can be changed via the -a command line option. Also, when the array size in a Flavor expression is a constant, the translator can automatically check the value and increase the array size if needed (it will also issue a warning).
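
The sketch below illustrates how a partial-array loop maps onto a fixed-size C++ array. HelloFileSketch, load(), and kMaxArray are hypothetical names; the constant mirrors the default size of 64 mentioned above. The explicit bounds check is something added for this sketch, since the text does not say whether the generated code performs one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mirrors the translator's default array size mentioned in the text.
const std::size_t kMaxArray = 64;

// Hand-written approximation of a class holding a partial array c[[i++]].
struct HelloFileSketch {
    char c[kMaxArray];  // fixed-size storage, as the translator generates
    std::size_t count;  // number of elements actually loaded

    HelloFileSketch() : count(0) {}

    // Load characters one element at a time. Returns false if the input
    // does not fit in the fixed-size array (a check added for this sketch).
    bool load(const std::string& input) {
        count = 0;
        for (std::size_t k = 0; k < input.size(); ++k) {
            if (count == kMaxArray) return false;  // would overflow c[]
            c[count++] = input[k];
        }
        return true;
    }
};
```

An input longer than the configured size is the situation that the -a option is meant to address.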

Essentially no media representation format of any practical interest uses such open-ended structures. In fact, even a text file itself can be considered as composed of a set of lines, rather than a large collection of characters. Here is a modification of HelloFile.fl to read only one line of information.

HelloLine.fl
class HelloLine {
    int i=0;
    do {
        char(8) c[[i]];
    }
    while (c[i++] != '\n');
}

Observe that the double bracket notation can only be used when declaring partial arrays; in all other places you can use the familiar single-bracket notation from C++ and Java. With this modification our individual objects can have a more manageable size. We can also redefine our HelloFile to use an array of lines, instead of characters.

HelloFile using HelloLine
class HelloFile {
    int i=0;
    while (1) {
        HelloLine line[[i++]];
    }
}

Here the file is considered as an array of line objects; as before, the size of the array is dynamic.

Let's now consider the case where the Flavor source contains more than one array, all of them small except one that is very large. For example, assume that you have files with very short lines but that contain a very large number of lines. Based on the above, we will have to use an array size suitable for the longest array contained in the media representation. This, however, will significantly increase the memory requirements for the C++ program as all arrays would have to use this size. The solution to this problem is to be able to individually specify the maximum array size when needed. This, among other things, is accomplished using pragma statements.

1.5 Pragma Statements

Pragma statements are similar to those found in C or C++ preprocessors. They contain statements that set options for the compiler/translator itself, rather than generating any actual code. Pragma statements are introduced in Flavor files using the %pragma directive. They become effective at the exact place where they appear. If they are inside a class, they affect both that class as well as all subsequent classes.

Almost all command-line options can be set using pragma statements. Here are a few examples.

Example Pragma Statements
// generate put, do not generate get, generate trace, and set array to 1024
%pragma put, noget, trace, array=1024

// generate put and get, and use custom tracing function
%pragma put, get, tfunc="Tracer.print"

// use our own Bitstream-compatible class
%pragma bitstream="RTPInput", noput

1.6 Bitstream Syntax Errors

We mentioned earlier that the translator only reports syntax errors. These errors are detected whenever a parsed variable does not have its expected value. Such 'marker' variables are very frequently used in practical representation formats. Here is the description of a file in which each line must begin with the character 'A'.

HelloALine.fl
class HelloALine {
    char(8) id = 'A'; // first char *must* be 'A'
    int i=0;
    do {
        char(8) c[[i]];
    }
    while (c[i++] != '\n');
}

Note that here we have more than one parsable variable; as you would expect, they are parsed (or output) in exactly the same order as they are declared.

The translator will generate code that checks if the value read for id is actually an 'A'. If it is not, it will have to report the error. This is done by calling a function called flerror. This function should accept a variable number of arguments so that the translator can generate rich error messages (i.e., it is declared as: void flerror(char *fmt, ...)). The run-time library includes a sample implementation that prints the message to stderr and exits. The implementation in the library will be ignored by your linker if you provide your own function. Note that when you output data, the translator will make sure that the id variable is loaded with the value 'A' before output, so you don't have to set it yourself.
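
As a minimal sketch of providing your own function, the version below uses the signature quoted above but records the message and keeps going instead of exiting. The two global variables are hypothetical additions for this illustration, and whether continuing after a syntax error is appropriate depends on your application.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// State kept by this sketch so that the caller can inspect failures.
static int g_flerror_count = 0;
static char g_flerror_last[256] = "";

// Replacement for the library's flerror, using the signature quoted in the
// text. Instead of exiting, it records the message and returns.
void flerror(char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(g_flerror_last, sizeof g_flerror_last, fmt, args);
    va_end(args);
    ++g_flerror_count;
    std::fprintf(stderr, "syntax error: %s\n", g_flerror_last);
}
```

After a call to get(), your code could check the counter to decide whether the parsed object can be trusted.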

1.7 Inheritance

Flavor fully supports inheritance. The keyword 'extends' is used to declare the base class (similarly to Java). Note that only single inheritance is supported; in general, only features common to both C++ and Java are supported by Flavor in order to allow translation to both. We have yet to find an example where multiple inheritance would be useful for media representation.

Inheritance is useful when we want to refine existing objects. For example, we can split our HelloALine class into two parts: our base part will be the initial character, and the derived class will include the remainder of the line. Here is the Flavor description for this structure.

HelloALine Using Inheritance
class CharA {
    char(8) id = 'A';
}

class HelloALine extends CharA {
    int i=0;
    do {
        char(8) c[[i]];
    }
    while (c[i++] != '\n');
}

The base class is parsed before any element of the derived class is parsed. This means that id will be parsed before the c array.
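
The parsing order can be pictured with the following hand-written C++ sketch, in which the derived get() invokes the base get() before reading its own members. ByteReader, CharASketch, and HelloALineSketch are hypothetical names, and error handling is reduced to returning a count of syntax errors; the code actually produced by the translator is organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical byte-oriented input used only for this sketch.
class ByteReader {
public:
    explicit ByteReader(const std::string& data) : data_(data), pos_(0) {}
    bool at_end() const { return pos_ >= data_.size(); }
    char next() { return at_end() ? '\0' : data_[pos_++]; }
private:
    std::string data_;
    std::size_t pos_;
};

// Base part: the marker character with expected value 'A'.
struct CharASketch {
    char id;
    CharASketch() : id('\0') {}
    // Returns the number of syntax errors found (0 or 1).
    int get(ByteReader& in) {
        id = in.next();
        return id == 'A' ? 0 : 1;
    }
};

// Derived part: the rest of the line, parsed after the base class.
struct HelloALineSketch : CharASketch {
    char c[64];
    int i;
    HelloALineSketch() : i(0) {}
    int get(ByteReader& in) {
        int errors = CharASketch::get(in);  // base class is parsed first
        i = 0;
        do {
            c[i] = in.next();
        } while (c[i++] != '\n' && i < 64 && !in.at_end());
        return errors;
    }
};
```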

1.8 Parameter Types

In the preceding example, the expected value for id in CharA was hard-coded. Depending on the specific representation format, this may not be desirable. Consider, for example, the case where each line of the file must start with the same character as the first line.

This is a trivial example of a more fundamental problem, i.e., passing contextual information to an object. As Flavor does not have methods, the mechanism to accomplish this is parameter types. These behave identically to method or function arguments found in C++ and Java, but they are specified as arguments to the class itself.

Here is a modified version of CharA, called FirstChar, that accepts the expected letter value as an argument.

Parameter Types
class FirstChar(char c) {
    char(8) id = c;
}

Whenever you instantiate in Flavor an object that uses parameter types, you must provide actual arguments for all formal arguments. For example:

Instantiating Objects with Parameter Types
class HelloALine {
    FirstChar first('A'); // 'char' is a type name, so it cannot name the variable
    // etc.
}   

Parameter types can be simple variables, classes, or arrays of these.

We can use the above definition with our inheritance-based HelloALine example, to define a class describing a line that also accepts the expected first letter as a parameter. A possible definition is as follows.

HelloALine Using Parameter Types and Inheritance
class FirstChar(char c) {
    char(8) id = c;
}

class HelloAnyLine(char c) extends FirstChar {
    int i=0;
    do {
        char(8) c[[i]];
    }
    while (c[i++] != '\n');
}

As we can see, both FirstChar and HelloAnyLine accept a single parameter as an argument. It would be incorrect to omit the declaration from HelloAnyLine, as it extends FirstChar. A derived class must use the same parameter types as its base class.

Naturally, you also need to provide values for these parameters from your C++ code. This is accomplished by providing additional arguments to the put() and get() methods. For classes that use parameter types, the translator will generate method declarations that include all parameter types as additional arguments to put() and get(), immediately following the Bitstream argument. Here is an example, where we read a line starting with the letter 'Z'.

Parameter Types and C++
    HelloAnyLine line;
    line.get(bs, 'Z');
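
To show the shape of such methods, here is a hand-written approximation of FirstChar in C++, with the parameter placed right after the stream argument. ByteReader and FirstCharSketch are hypothetical names, put() writes into a std::string for simplicity, and the boolean return value is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical byte-oriented input used only for this sketch.
class ByteReader {
public:
    explicit ByteReader(const std::string& data) : data_(data), pos_(0) {}
    char next() { return pos_ < data_.size() ? data_[pos_++] : '\0'; }
private:
    std::string data_;
    std::size_t pos_;
};

// Approximation of: class FirstChar(char c) { char(8) id = c; }
struct FirstCharSketch {
    char id;
    FirstCharSketch() : id('\0') {}

    // The parameter follows the stream argument, as described in the text.
    // Returns true when the parsed value matches the expected one.
    bool get(ByteReader& in, char expected) {
        id = in.next();
        return id == expected;
    }
    // On output the expected value is stored in id before it is written.
    void put(std::string& out, char expected) {
        id = expected;
        out += id;
    }
};
```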

1.9 Bitstream Polymorphism

One of the key benefits of object-oriented programming and inheritance is polymorphism: the capability of derived classes to be used in the place of base classes. For the purposes of illustration, let's assume that our media representation consists of lines beginning either with 'A' or with 'B'. We also want a single object HelloLine that is capable of representing either. A potential solution is to use simple containment.

HelloLine Using Containment
class Line {
    int i=0;
    do {
        char(8) c[[i]];
    }
    while (c[i++] != '\n');
}

class HelloALine {
    char(8) id = 'A';
    Line line;
}

class HelloBLine {
    char(8) id = 'B';
    Line line;
}

class HelloLine {
    char(8)* id; // read ahead
    if (id == 'A')
        HelloALine lineA;
    else 
        HelloBLine lineB;
}

Here we define our basic line as Line. We then define classes for lines that begin with an 'A' or 'B' (HelloALine and HelloBLine). Finally, we define a class HelloLine that can contain either type. Observe the '*' notation; this is not a pointer declaration (in fact, Flavor does not support pointers or references). The '*' after the parse size declaration means that the bitstream should be examined but not actually read. This is the way to implement look-ahead input in Flavor. Our class, then, looks ahead to see what the next character is; if it is an 'A', then an object of type HelloALine is read, and otherwise an object of type HelloBLine. Both of these objects (and id) will be member variables of the HelloLine class. When accessing such an object, it is up to you to determine which of the two subobjects is valid, by checking the contents of id. For example, if you need to access the third character of a line, you should write code like the following:

Accessing Contained Members
    // this is in your C++ code
    HelloLine line;
    line.get(bs);
    char c3;
    if (line.id == 'A' )
        c3 = line.lineA.c[2];
    else
        c3 = line.lineB.c[2];

Similarly, if you want to output data, you will have to set the correct value for id so that the put() method can figure out which object it should write:

Accessing Contained Members: Preparing for Output
    // prepare for output
    HelloLine line;
    strcpy(line.lineA.c, "test\n");
    line.id='A'; // you *must* set a value for id
    line.put(bs);

This will output the string 'Atest' followed by a newline.

Although this works, it can get very problematic and defeats the whole purpose of object-oriented programming. That's where polymorphism can provide significant value.

In a traditional programming context, polymorphism is implemented using vtables that handle dispatch of method calls to the right class. In a bitstream context, however, any such information must be present in the bitstream itself. Bitstream objects that can take the place of each other need a mechanism to distinguish which of them is the one actually provided in the bitstream. This gives rise to the concept of object identifiers, or IDs for short.

The identifier is a common variable shared by all classes in the same hierarchy. The value of the identifier uniquely determines the actual type of the object that is present in the bitstream (or should be output to a bitstream). This requires that ID values are unique for each class within a given hierarchy.

To signify the special characteristics of IDs, they are declared outside the braces of the class declaration, immediately after the name of the class. They are also the first element that is parsed. They must also be simple variables, and cannot be arrays. Let's convert our previous containment example to use object IDs.

HelloLine Using Bitstream Polymorphism
class Line : char(8) id = 0 {
    int i=0;
    do {
        char(8) c[[i]];
    }
    while (c[i++] != '\n');
}
class HelloALine extends Line
 : char(8) id = 'A' {
    // empty
}
class HelloBLine extends Line
  : char(8) id = 'B' {
    // empty
}
class HelloLine {
    Line line;
}

Observe how the ID is declared before the opening brace and after the ':' character. HelloALine and HelloBLine trivially extend Line to just use a different ID value. Here we called our ID 'id', but any name would do.

Let's follow step-by-step what the generated code will do when the get() method of HelloLine is called. The code will first do a look-ahead to check the value of the ID. Then, depending on its value, it will create an object of the correct type and assign it to line. This means that line will have to be implemented as a pointer. This is the only case where pointer member variables are used in the code generated by the translator. Since the generated code will create the new object, it is important that, if you provide your own constructors (e.g., using verbatim code), you also provide one that can work without any arguments.

After creating the object and assigning it to line, the code will call the get() method of the newly created object. In fact, the code will call the get() method of the line object itself, but both get() and put() are declared as virtual member functions; as a result, the correct method will be called.
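
The steps just described can be sketched in plain C++ as follows. All names here (ByteReader, peek(), the *Sketch classes, expected_id()) are hypothetical. The sketch uses std::unique_ptr and std::string for brevity where the generated code uses a plain pointer and fixed-size arrays, and falling back to the base class for an unknown ID is an assumption, since the text does not say what the generated code does in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical byte-oriented input with a one-byte look-ahead.
class ByteReader {
public:
    explicit ByteReader(const std::string& data) : data_(data), pos_(0) {}
    bool at_end() const { return pos_ >= data_.size(); }
    char peek() const { return at_end() ? '\0' : data_[pos_]; }  // no consume
    char next() { return at_end() ? '\0' : data_[pos_++]; }
private:
    std::string data_;
    std::size_t pos_;
};

// Base class of the hierarchy; get() is virtual so the right version runs.
struct LineSketch {
    char id;
    std::string c;
    LineSketch() : id(0) {}
    virtual ~LineSketch() {}
    virtual char expected_id() const { return 0; }
    virtual void get(ByteReader& in) {
        id = in.next();  // the ID is the first element parsed
        char ch;
        do {
            ch = in.next();
            c += ch;
        } while (ch != '\n' && !in.at_end());
    }
};
struct HelloALineSketch : LineSketch {
    char expected_id() const { return 'A'; }
};
struct HelloBLineSketch : LineSketch {
    char expected_id() const { return 'B'; }
};

// Container: looks ahead at the ID, creates the matching object, then
// calls get() through the base-class pointer.
struct HelloLineSketch {
    std::unique_ptr<LineSketch> line;
    void get(ByteReader& in) {
        const char id = in.peek();
        if (id == 'A')      line.reset(new HelloALineSketch);
        else if (id == 'B') line.reset(new HelloBLineSketch);
        else                line.reset(new LineSketch);  // assumed fallback
        line->get(in);
    }
};
```

Note that each class is created with a constructor that takes no arguments, which is why the text asks you to keep such a constructor available if you add your own.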

Note that the base class, Line, has an ID value of 0. It is important that the line variable corresponds to a polymorphic class, i.e., one with an ID.

Here are some examples to illustrate how to handle polymorphic parsable classes in your C++ code.

Bitstream Polymorphism and C++
    // Example 1
    // read a line from input bitstream 'bsIn',
    // change the 2nd char to 'X',
    // and output to bitstream 'bsOut'
    HelloLine hline;
    hline.get(bsIn);
    hline.line->c[0]='X';
    hline.put(bsOut);

    // Example 2
    // prepare a second line for output
    HelloLine hline2;
    hline2.line = new(HelloALine);
    strcpy(hline2.line->c, "test\n"); // the line syntax ends at '\n'
    hline2.put(bsOut);

    // Example 3
    // try to trick the code to think
    // it is dealing with a HelloBLine
    // object
    hline2.line->id = 'B'; // id is a member of the contained line object
    hline2.put(bsOut); // not what you may expect

Examples 1 and 2 should be self-explanatory. For Example 3, we try to trick the code by modifying the ID value. The put() method code that is called, however, is that of object HelloALine. Before sending the ID variable to the output (or any variable that has an expected value), the code will set it to its correct value. That not only saves you from having to set such variables yourself, but it also guarantees that the state of all variables after put() is called is the one specified by the Flavor code. The end result is that the correct output will be produced.

1.10 Comments

The above covers all issues relating to the interface between Flavor and C++. Note that a number of Flavor's features are not related at all to this interface, and have thus been omitted. This includes several important details, such as scoping rules and maps. Please refer to the Overview Documents or the Flavor Specification for more information.