Module Tables

Description (classes)

Review Status

Reviewed By:
jhorstko
Date Reviewed:
1994/08/30

Prerequisite

Etymology

"Table" is a formal term from relational database theory: 'The organizing principle in a relational database is the TABLE, a rectangular, row/column arrangement of data values.' AIPS++ tables are extensions to traditional tables, but are similar enough that we use the same name. There is also a strong resemblance between the uses of AIPS++ tables, and FITS binary tables, which provides another reason to use "Tables" to describe the AIPS++ data storage mechanism.

Synopsis

Tables are the fundamental storage mechanism for AIPS++. They are based upon the ideas of Allen Farris, as laid out in .

More detailed information can be obtained in the AIPS++ "Database" document, found here.

Traditional relational database tables have two features that decisively limit their applicability to scientific data. First, an item of data in a column of a table must be atomic -- it must have no internal structure. A consequence of this restriction is that relational databases are unable to deal with arrays of data items. Second, an item of data in a column of a table must not have any direct or implied linkages to other items of data or data aggregates. This restriction makes it difficult to model complex relationships between collections of data. While these restrictions may make it easy to define a mathematically complete set of data manipulation operations, they are simply intolerable in a scientific data-handling context. Multi-dimensional arrays are frequently the most natural modes in which to discuss and think about scientific data. In addition, scientific data often requires complex calibration operations that must draw on large bodies of data about equipment and its performance in various states. The restrictions imposed by the relational model make it very difficult to deal with complex problems of this nature.

In response to these limitations, and other needs, AIPS++ tables have the following features:

Table Keywords

The class TableKeywords represents the keywords in a table. It is (indirectly) derived from the standard keyword class KeywordSet. Keywords can be attached to the table and to each individual column, and they can be defined or removed freely.

Table Description

The table description defines the layout of the columns in the table. It can also define the initial sets of keywords for the table and its columns. A column can have a default value, which can be stored automatically in the cells of that column when a row is added to the table.

The public classes to set up a table description are:

  1. TableDesc holds the table description.
  2. ColumnDesc holds a generic column description.
  3. ScalarColumnDesc defines a column containing a scalar value.
  4. ArrayColumnDesc defines a column containing an (in)direct array.
  5. SubTableDesc defines a column containing a table.

The header files of the above-mentioned classes contain more information on how to use them.

The following example shows how a table description can be constructed.

    #include <aips/Tables/TableDesc.h>
    #include <aips/Tables/ScaColDesc.h>
    #include <aips/Tables/ArrColDesc.h>
    #include <aips/Tables/SubTabDesc.h>
    #include <aips/ScalarKeySet.h>
    #include <aips/IPosition.h>
    #include <aips/Vector.h>
    
    int main()
    {
        // First build the new description of a subtable.
        // Define keyword subkey (integer) having value 10.
        // Define columns ra and dec (double).
        TableDesc subTableDesc("tTableDesc_sub", "1", TableDesc::New);
        subTableDesc.keywordSet().keysInt()("subkey") = 10;
        subTableDesc.addColumn (ScalarColumnDesc<double> ("ra"));
        subTableDesc.addColumn (ScalarColumnDesc<double> ("dec"));
    
        // Now create a new table description
        // Define a comment for the table description.
        // Define some keywords.
        ColumnDesc colDesc1, colDesc2;
        TableDesc td("tTableDesc", "1", TableDesc::New);
        td.comment() = "A test of class TableDesc";
        td.keywordSet().keysfloat()("ra") = 3.14;
        td.keywordSet().keysdouble()("equinox") = 1950;
        td.keywordSet().keysInt()("aa") = 1;
    
        // Define an integer column ab.
        td.addColumn (ScalarColumnDesc<Int> ("ab", "Comment for column ab"));
    
        // Add a scalar integer column ac, define keywords for it
        // and define a default value 0.
        // Overwrite the value of keyword unit.
        ScalarColumnDesc<Int> acColumn("ac");
        acColumn.keywordSet().keysComplex()("scale") = 0;
        acColumn.keywordSet().keysString()("unit") = "";
        acColumn.setDefault (0);
        td.addColumn (acColumn);
        td["ac"].keywordSet().keysString()("unit") = "DEG";
    
        // Add a scalar string column ad and define its comment string.
        td.addColumn (ScalarColumnDesc<String> ("ad","comment for ad"));
    
        // Now define array columns.
        // This one is indirect and has no dimensionality mentioned yet.
        td.addColumn (ArrayColumnDesc<Complex> ("Arr1","comment for Arr1"));
        // This one is indirect and has 3-dim arrays.
        td.addColumn (ArrayColumnDesc<Int> ("A2r1","comment for Arr1",3));
        // This one is direct and has 2-dim arrays with axes length 4 and 7.
        td.addColumn (ArrayColumnDesc<uInt> ("Arr3","comment for Arr1",
                                             IPosition(2,4,7),
                                             ColumnDesc::Direct));
    
        // Add columns containing tables.
        // This is done in 3 slightly different ways, which all have
        // their own (dis)advantages.
        // See SubTabDesc.h for a description of the SubTableDesc constructors.
        td.addColumn (SubTableDesc("sub1", "subtable by name","tTableDesc_sub"));
        td.addColumn (SubTableDesc("sub2", "subtable copy",    subTableDesc));
        td.addColumn (SubTableDesc("sub3", "subtable pointer", &subTableDesc));
    }
    

This example shows only the basic table description functionality. More specialized features, such as defining a default data manager, are not shown.

Creating Tables

Once a table description has been created, it can be used to create a table. This is a multi-step process:
  1. Create an object SetupNewTable with the name of the new table.
  2. Define data managers as needed. If this is not done, data managers will be defined as needed using the default data manager name and group in the column descriptions.
  3. Bind columns to the appropriate data manager (i.e. a storage manager or a virtual column engine).
  4. Define the shape of direct columns (if not already defined in the column description).
  5. Create the Table object from the SetupNewTable object. This final step will perform a check and will create the files as needed.

Not all data managers support all table functionality. For example, deleting a column or row is not supported by the Karma storage manager. No virtual column engines are implemented yet. Two storage managers are supported:

  1. StManAipsIO uses AipsIO to store the data in the columns. It supports all table functionality, but its I/O is probably not as efficient as that of other storage managers. It also requires that a large part of the table fits in memory.
  2. StManKarma uses Karma to store the data in the columns (Karma is a software package developed by Richard Gooch at CSIRO). Currently Karma can only handle scalars and direct arrays. Addition and deletion of rows and columns is not supported. The table size (i.e. the number of rows) has to be defined when the Table object is created from the SetupNewTable object.

The choice of data manager greatly influences the operations that can be performed on a table. For example, if even one column uses the Karma storage manager, it is not possible to add or delete rows. However, adding a column is still possible by using an AipsIO storage manager for the column to be added.
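
The way one restrictive storage manager limits the whole table can be illustrated with a small stand-alone model. The sketch below is not AIPS++ code: ManagerCaps, Bindings and tableCanAddRow are hypothetical names, and the capability flags merely restate what the text says about the AipsIO and Karma storage managers.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model, not the AIPS++ API: what a storage manager supports.
struct ManagerCaps {
    bool canAddRow;
    bool canRemoveRow;
};

// Capabilities as described in the text for the two storage managers.
inline ManagerCaps aipsIOCaps() { return {true, true}; }
inline ManagerCaps karmaCaps()  { return {false, false}; }

// Column name -> capabilities of the storage manager bound to that column.
using Bindings = std::map<std::string, ManagerCaps>;

// A row can be added only if every bound storage manager supports it.
bool tableCanAddRow(const Bindings& bindings)
{
    for (const auto& entry : bindings) {
        if (!entry.second.canAddRow) {
            return false;
        }
    }
    return true;
}
```

In this model, binding a single column to the Karma-like manager is enough to make tableCanAddRow return false, which mirrors the restriction described above.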

The following example shows how a table can be created. The next section discusses how to access the table.

    #include <aips/Tables/TableDesc.h>
    #include <aips/Tables/SetupNewTab.h>
    #include <aips/Tables/Table.h>
    #include <aips/Tables/ScaColDesc.h>
    #include <aips/Tables/ArrColDesc.h>
    #include <aips/Tables/StManAipsIO.h>
    #include <aips/Tables/StManKarma.h>
    #include <aips/IPosition.h>
    
    int main()
    {
        // First build the table description.
        TableDesc td("tTableDesc", "1", TableDesc::Scratch);
        td.comment() = "A test of class SetupNewTable";
        td.addColumn (ScalarColumnDesc<Int> ("ab" ,"Comment for column ab"));
        td.addColumn (ScalarColumnDesc<Int> ("ac"));
        td.addColumn (ScalarColumnDesc<uInt> ("ad","comment for ad"));
        td.addColumn (ScalarColumnDesc<float> ("ae"));
        td.addColumn (ArrayColumnDesc<float> ("arr1",3,ColumnDesc::Direct));
        td.addColumn (ArrayColumnDesc<float> ("arr2",0));
        td.addColumn (ArrayColumnDesc<float> ("arr3",0,ColumnDesc::Direct));
    
        // Setup a new table from the description.
        SetupNewTable newtab("newtab.data", td, Table::New);
        // Create storage managers for it. They are default-constructed;
        // writing "StManAipsIO x ();" would declare a function instead.
        StManAipsIO stmanAipsIO_1;
        StManAipsIO stmanAipsIO_2;
        StManKarma  stmanKarma;
    
        // Start with binding all columns to the first storage manager.
        newtab.bindAll (stmanAipsIO_1);
        // Bind a few columns to another storage manager.
        newtab.bindColumn ("ab", stmanAipsIO_2);
        newtab.bindColumn ("ae", stmanKarma);
        newtab.bindColumn ("arr3", stmanKarma);
    
        // Define the shape of the direct columns.
        // (this could have been defined in the column description).
        newtab.setShapeColumn( "arr1", IPosition(3,2,3,4));
        newtab.setShapeColumn( "arr3", IPosition(3,3,4,5));
    
        // Finally create the table consisting of 10 rows.
        // Defining the number of rows is necessary, because the
        // Karma storage manager is used.
        Table tab(newtab, 10);
    
        // Now we can fill the table, which is shown in a later section.
        // The Table destructor will flush the table to the files.
    }
    

Opening an Existing Table

An existing table can be opened by creating a Table object for it. The constructor option determines whether the table is opened as readonly or as read/write. A readonly table file must be opened as readonly, otherwise an exception is thrown.

The function isWritable can be used to determine whether a table is writable.

When the table is opened, the data managers will be reinstantiated as defined when the table was created.

Opening an existing table is straightforward and looks like:

     Table table ("tableName");                     // readonly
or
     Table table ("tableName", Table::Update);      // read/write

Writing into a Table

Once a table has been created or opened for read/write, data can be written into it. Before that can be done, it may be necessary to add one or more rows to the table.

Tip: If a table is created with a given number of rows, there is no need to add rows.

When rows are added to the table (either via the Table constructor or via the addRow function), it is possible to specify whether the rows should be initialized. Initializing means that the default value defined in the description is written into all cells of the added rows.
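
The effect of initializing added rows can be sketched with a stand-alone model of a single integer column. This is not AIPS++ code: IntColumnModel and its members are hypothetical, and marking uninitialized cells as undefined is a modelling choice of this sketch rather than documented AIPS++ behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not the AIPS++ API: one integer column with a default.
class IntColumnModel {
public:
    explicit IntColumnModel(int defaultValue) : default_(defaultValue) {}

    // Append nrow cells. When initialize is true, each new cell receives the
    // default value; otherwise the cell is only marked as undefined.
    void addRow(std::size_t nrow, bool initialize)
    {
        for (std::size_t i = 0; i < nrow; ++i) {
            values_.push_back(initialize ? default_ : 0);
            defined_.push_back(initialize);
        }
    }

    std::size_t nrow() const { return values_.size(); }
    bool isDefined(std::size_t row) const { return defined_.at(row); }
    int get(std::size_t row) const { return values_.at(row); }

private:
    int default_;
    std::vector<int> values_;
    std::vector<bool> defined_;
};
```

A column built as IntColumnModel col(7) and extended with col.addRow(2, true) holds the value 7 in both new cells, whereas col.addRow(1, false) adds a cell that still has to be written before it carries a meaningful value.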

The actual writing of data must be done using the classes ScalarColumn and ArrayColumn. For each column one or more of these objects can be constructed. The put functions in these classes make it possible to write one value at a time or the entire column in one go. For arrays it is also possible to put subsections of the arrays.

A typical program could look like:

    #include <aips/Tables/TableDesc.h>
    #include <aips/Tables/SetupNewTab.h>
    #include <aips/Tables/Table.h>
    #include <aips/Tables/ScaColDesc.h>
    #include <aips/Tables/ArrColDesc.h>
    #include <aips/Tables/ScalarColumn.h>
    #include <aips/Tables/ArrayColumn.h>
    #include <aips/Vector.h>
    #include <aips/Lattice/Slicer.h>
    #include <aips/ArrayMath.h>
    #include <iostream.h>
    
    int main()
    {
        // First build the table description.
        TableDesc td("tTableDesc", "1", TableDesc::Scratch);
        td.comment() = "A test of class SetupNewTable";
        td.addColumn (ScalarColumnDesc<Int> ("ac"));
        td.addColumn (ArrayColumnDesc<float> ("arr2",0));
    
        // Setup a new table from the description.
        // Create the (still empty) table.
        // Note that since no explicit binding of columns to data managers
        // has been done, implicit binding to the default AipsIO storage
        // manager will be done.
        SetupNewTable newtab("newtab.data", td, Table::New);
        Table tab(newtab);
    
        // Now construct the various column objects.
        // Their data type has to match the data type as in the description.
        ScalarColumn<Int> ac (tab, "ac");
        ArrayColumn<float> arr2 (tab, "arr2");
        Vector<float> vec2(100);
    
        // Now write the data into the columns.
        // In each cell arr2 will be a vector of length 100.
        // Since its shape is not explicitly set, it is set implicitly.
        for (uInt i=0; i<10; i++) {
            tab.addRow();               // First add a row.
            ac.put (i, i+10);           // value is i+10 in row i
            indgen (vec2, float(i+20)); // vec2 gets i+20, i+21, ..., i+119
            arr2.put (i, vec2); 
        }
    
        // Now show the entire column ac.
        // Show element 10 (zero-based) of the array in each row of arr2.
        cout << ac.getColumn();
        cout << arr2.getColumn (Slicer(Slice(10)));
    
        // The Table destructor writes the table.
    }
    

In this example rows are added in the for loop. It would have also been possible to immediately create 10 rows by constructing the Table object as:

     Table tab(newtab, 10);

In that case the function call

     tab.addRow()

should be omitted, because the rows already exist.

The classes TableColumn, ScalarColumn and ArrayColumn provide several functions which can be used to put a value into a cell of the column or to put all values in the column. At first this may look overwhelming, but it is quite simple. The functions can be divided into two cases:

  1. Put the given value into the column cell(s).
    • The simplest put function in Scalar/ArrayColumn puts a value into the given column cell. For convenience, there is a putSlice function in ArrayColumn which puts only a part of the array.
    • The fillColumn function in Scalar/ArrayColumn fills an entire column by putting the given value into all column cells.
    • The simplest putColumn function in Scalar/ArrayColumn puts an array of values into the column. For convenience, there is a putColumn function in ArrayColumn which puts only a part of the arrays.
  2. The other put and putColumn functions copy values from another column to this column. These functions have the advantage that the data type of the input and/or output column can be unknown. The generic (RO)TableColumn objects can be used for this purpose. The put(Column) function takes care of checking and, if possible, converting the data types. If the conversion is not possible, an exception will be thrown.
    • The put functions copy the value in a cell of the input column to a cell in the output column. The row numbers of the cells in the columns can be different.
    • The putColumn functions copy the entire contents of the input column to the output column. The lengths of the columns must be equal.
    Each class has its own set of these functions.
    • TableColumn has the most generic put/putColumn functions. They can be used if the data types of both input and output column are unknown. Note that these functions are virtual.
    • The put/putColumn functions in ScalarColumn and ArrayColumn are less generic and therefore potentially more efficient. The most efficient variants are the ones taking a ROScalarColumn<T> or ROArrayColumn<T>, because they require no data type conversion.
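
The checking and converting done by the column-to-column putColumn functions can be sketched with a stand-alone model. This is not AIPS++ code: DataType, ColumnModel, canConvert and this putColumn are hypothetical, and the conversion rule (equal types or numeric to numeric) is an assumption of the sketch, not a statement of which conversions AIPS++ supports.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model, not the AIPS++ API.
enum class DataType { Int, Double, String };

struct ColumnModel {
    DataType type;
    std::vector<double> numbers;       // cells of an Int or Double column
    std::vector<std::string> strings;  // cells of a String column
};

inline bool isNumeric(DataType t) { return t != DataType::String; }

inline std::size_t length(const ColumnModel& c)
{
    return isNumeric(c.type) ? c.numbers.size() : c.strings.size();
}

// Assumption of this sketch: equal types and numeric-to-numeric are allowed.
inline bool canConvert(DataType from, DataType to)
{
    return from == to || (isNumeric(from) && isNumeric(to));
}

// Copy the entire input column to the output column, converting if possible.
void putColumn(ColumnModel& out, const ColumnModel& in)
{
    if (!canConvert(in.type, out.type)) {
        throw std::invalid_argument("data types cannot be converted");
    }
    if (length(in) != length(out)) {
        throw std::length_error("column lengths differ");
    }
    if (out.type == DataType::String) {
        out.strings = in.strings;
        return;
    }
    for (std::size_t i = 0; i < in.numbers.size(); ++i) {
        // An Int output column keeps only the integral part of each value.
        out.numbers[i] = (out.type == DataType::Int) ? std::trunc(in.numbers[i])
                                                     : in.numbers[i];
    }
}
```

The model throws in the two situations named in the list above: when the data types cannot be converted and when the column lengths differ.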

Reading from a Table

Reading from a column in a table can be done using the get functions in the classes ROScalarColumn<T> and ROArrayColumn<T>. Apart from these functions, ROTableColumn::getScalar and ROTableColumn::asXXX can be used to get scalars of a standard data type (i.e. Bool, uChar, Int, uInt, float, double, Complex, DComplex and String). These functions also offer data type promotion, for example getting a value in a float column as a double.

These functions are used in the same way as the simple put functions described in the previous section.

ScalarColumn is derived from ROScalarColumn, thus these get functions are also available in ScalarColumn. However, if a ScalarColumn object is constructed for a non-writable column, an exception is thrown. Only ROScalarColumn objects can be constructed for non-writable columns. The same is true for ArrayColumn and TableColumn.
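
The relationship between the readonly and the read/write column classes can be sketched with a stand-alone model. This is not AIPS++ code: TableModel, ROScalarModel and ScalarModel are hypothetical stand-ins, and the exception type is chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical model, not the AIPS++ API: a table with one integer column.
struct TableModel {
    bool writable;
    std::vector<int> values;
};

// Readonly access: can be constructed for any table.
class ROScalarModel {
public:
    explicit ROScalarModel(TableModel& table) : table_(table) {}
    int get(std::size_t row) const { return table_.values.at(row); }

protected:
    TableModel& table_;
};

// Read/write access: inherits get, adds put, and refuses to be constructed
// for a non-writable table.
class ScalarModel : public ROScalarModel {
public:
    explicit ScalarModel(TableModel& table) : ROScalarModel(table)
    {
        if (!table.writable) {
            throw std::runtime_error("column is not writable");
        }
    }
    void put(std::size_t row, int value) { table_.values.at(row) = value; }
};
```

Because the check sits in the constructor, a program finds out that a column cannot be written when it creates the column object, not at the first put.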

Table Selecting and Sorting

The result of a select or a sort of a table is another table, which references the original table. This means that an update of a sorted or selected table results in an update of the original table.
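
One way to picture a table that references another is as a list of row numbers into the original. The stand-alone sketch below follows that idea; it is not AIPS++ code, BaseTableModel, RefTableModel and selectGreater are hypothetical names, and the row-number representation is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model, not the AIPS++ API: a table with one double column.
struct BaseTableModel {
    std::vector<double> ra;
};

// Result of a selection or sort: row numbers into the original table.
class RefTableModel {
public:
    RefTableModel(BaseTableModel& base, std::vector<std::size_t> rows)
        : base_(base), rows_(std::move(rows)) {}

    std::size_t nrow() const { return rows_.size(); }
    double get(std::size_t row) const { return base_.ra.at(rows_.at(row)); }

    // Writing through the reference updates the original table.
    void put(std::size_t row, double value) { base_.ra.at(rows_.at(row)) = value; }

private:
    BaseTableModel& base_;
    std::vector<std::size_t> rows_;
};

// Select all rows whose ra value is greater than limit.
RefTableModel selectGreater(BaseTableModel& base, double limit)
{
    std::vector<std::size_t> rows;
    for (std::size_t i = 0; i < base.ra.size(); ++i) {
        if (base.ra[i] > limit) {
            rows.push_back(i);
        }
    }
    return RefTableModel(base, rows);
}
```

In this model no data is copied by the selection, so a put on the selected table is immediately visible in the original.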

It is possible to select rows or columns from a table. Columns can be selected by the Table::project function. Rows can be selected by the various Table function operators. The main way to do a row select is by giving a select expression using TableExprNode objects. These objects represent the various nodes in an expression (e.g. a constant, column, subexpression). The Table function col creates a TableExprNode object for a column. The function key does the same for a keyword by reading the keyword value and storing it as a constant in an expression node. All column nodes in an expression must belong to the same table, otherwise an exception is thrown. The following example selects all rows with RA>10:

    #include <aips/Tables/ExprNode.h>
        ...
    Table result = table.select (table.col("RA") > 10);

while the following example selects the rows between the given RA and DEC values:

    Table result = table.select (table.col("RA") > 10
                              && table.col("RA") < 14
                              && table.col("DEC") >= -10
                              && table.col("DEC") <= 10);
The following operators can be used to form an arbitrarily complex expression:

In the near future, functions like sin will also be supported.
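
How overloaded operators can build such an expression from column nodes is sketched below with standard C++ only. This is not the TableExprNode implementation: RowModel, ValueNode, BoolNode, compare and selectRows are hypothetical, the free function col only imitates the Table function of the same name, and only the operators that appear in the examples above are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical model, not the AIPS++ API: a row maps column names to values.
using RowModel = std::map<std::string, double>;

// Expression nodes: each one knows how to evaluate itself for a given row.
struct ValueNode { std::function<double(const RowModel&)> eval; };
struct BoolNode  { std::function<bool(const RowModel&)> eval; };

// Node that reads the named column from the row being evaluated.
ValueNode col(const std::string& name)
{
    return ValueNode{[name](const RowModel& row) { return row.at(name); }};
}

// Build a comparison node from a value node, a constant and a comparison.
BoolNode compare(const ValueNode& lhs, double rhs,
                 std::function<bool(double, double)> op)
{
    return BoolNode{[lhs, rhs, op](const RowModel& row) {
        return op(lhs.eval(row), rhs);
    }};
}

BoolNode operator>(const ValueNode& l, double r)  { return compare(l, r, std::greater<double>()); }
BoolNode operator<(const ValueNode& l, double r)  { return compare(l, r, std::less<double>()); }
BoolNode operator>=(const ValueNode& l, double r) { return compare(l, r, std::greater_equal<double>()); }
BoolNode operator<=(const ValueNode& l, double r) { return compare(l, r, std::less_equal<double>()); }

// Combine two conditions. Building the node evaluates nothing; evaluation
// happens later, once per row.
BoolNode operator&&(const BoolNode& a, const BoolNode& b)
{
    return BoolNode{[a, b](const RowModel& row) { return a.eval(row) && b.eval(row); }};
}

// Return the numbers of the rows for which the expression is true.
std::vector<std::size_t> selectRows(const std::vector<RowModel>& rows,
                                    const BoolNode& expr)
{
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (expr.eval(rows[i])) {
            result.push_back(i);
        }
    }
    return result;
}
```

With these definitions, an expression such as col("RA") > 10.0 && col("RA") < 14.0 builds a small tree of nodes first, and selectRows then evaluates that tree for every row.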

Table Vectors

Table Iterators


Classes

ArrayColumn -- Read/write access to an array table column with arbitrary data type
ArrayColumnDesc -- Templated class for description of table array columns
ColumnDesc -- Envelope class for the description of a table column
ROArrayColumn -- Readonly access to an array table column with arbitrary data type
ROScalarColumn -- Readonly access to a scalar table column with arbitrary data type
ROTableColumn -- Readonly access to a table column
ScalarColumn -- Read/write access to a scalar table column with arbitrary data type
ScalarColumnDesc -- Templated class to define columns of scalars in tables
StManAipsIO -- AipsIO table storage manager class
StManColumnAipsIO -- AipsIO table column storage manager class
StManColumnKarma -- Karma table column storage manager class
StManKarma -- Karma table storage manager class
SubTableDesc -- Description of columns containing tables
Table -- Main interface class to a read/write table
TableColumn -- Non-const access to a table column
TableDesc -- Specify the structure of an AIPS++ table
TableKeywords -- Keyword values representing tables

Copyright © 1995 Associated Universities Inc., Washington, D.C.