Intercalate Tutorial
Intercalate is a programmer's tool which helps unify, manage, and maintain the wide variety of representations which domain information may assume over the lifecycle of a development process. Intercalate promotes structured analysis of an information domain, and enhances reuse of models and views that have been developed to express the information. This tutorial provides an introduction to the Intercalate approach for data modeling.
[Figure: Domain Objects + Views = Representations. Domain objects: Gene, Target, Sequence, Molecule, Order, Scientist, etc. Views: SQL, O/R Mapping, Java, HTML.]
The goal of object-oriented analysis is to identify the actors in a domain and describe their behaviors and interactions. The result of this analysis is a model, which captures and describes in an abstract form the essential elements of the domain.
An object model is useful by itself to understand a domain. More often it is used to represent and simulate the domain as a computer program. The information in an object is not generally confined to just one representation, but takes on many forms according to how it is being used. An application might, for example, use SQL to define database tables in which to store the object's static, persistent data, use Java or another programming language to code the dynamic, computational aspects, use a proprietary object/relational mapping data structure to translate between static and dynamic representations, and use HTML to display the information in a web browser.
These many uses of many objects lead to a combinatorial explosion of representations. It is a challenge to manage and synchronize change in such a system. Intercalate addresses that challenge by providing a format to represent the model objects and views, and tools to construct and use them to automatically generate the ultimate representations.
[Figure: the two uses of Intercalate. Modeling: Prototype to Property List. Production: Property List to Result.]
Intercalate tools can be used in two ways. Intercalate can build a model of an information domain by analyzing an example or prototype of some desired output text, separating the text into a property list and a template. The other, more frequent use of Intercalate is to generate one or more output files by filtering a model, represented by the property list, through a view, represented by a template.
Property lists are collections of key-value pairs, where the values may be simple strings or nested collections. There are two types of collection, a dictionary and an array. A dictionary represents the specialization of a property, while an array collects together instances of domain objects with comparable properties. The structure of a property list will become clearer as you proceed through this tutorial.
A template is composed of text interleaved with keywords. Text is copied to the output verbatim, while keywords direct Intercalate to look up values in the property list or perform other processing actions.
Intercalate provides many benefits for the developer and system maintainer. It provides a structured representation of domain objects and powerful expressions for constructing templates or views of those objects. It ensures that changes to objects or views take effect in all their representations, and reduces the effort to maintain and reuse objects.
Template
    $key$
Property List
    key = "value";
Result
    value
Each key in the template is looked up in the property list. The value found there replaces the template key in the generated output. Other text in the template is written to the output verbatim.
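The lookup-and-replace step can be imitated in a few lines of Python. This is only a sketch of the behavior described above, not Intercalate's implementation; the function name fill and the sample template text are made up for illustration, and a missing key is assumed to produce an empty string.

```python
import re

def fill(template, properties):
    """Replace each $key$ with its value; copy all other text verbatim."""
    def lookup(match):
        # Assumption: a key that is not found yields an empty string.
        return str(properties.get(match.group(1), ""))
    return re.sub(r"\$(\w+)\$", lookup, template)

result = fill("Hello, $key$!", {"key": "value"})
print(result)
```

Text outside the dollar signs passes through unchanged, which is what lets a finished output file serve as the starting point for a template.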
For       Iterate over a list
If        Test condition
Unless    Test negated condition
Write     Redirect output
Include   Redirect input
A small but general set of keywords gives us a vast amount of control over the form and function of the output.
Intercalate maintains synchronization, so that a change to a domain object immediately takes effect across all its representations. Similarly, changes to templates are automatically applied to representations for all domain objects. By not having to make changes in the final representations, we can ensure that the changes are applied uniformly, eliminating inadvertent editing errors and inconsistencies between the functionality for different objects.
Intercalate enhances the maintainability of a system. Changes to domain objects or templates are automatically propagated to all the relevant representations.
Intercalate enhances object reuse. If a new domain object is added, it immediately becomes functional in all the contexts where the objects are used. Similarly, by simply writing a template for a new context, all objects in the domain can immediately be used in that context.
A command line version of Intercalate is available for download. It may be used freely without restrictions. A full-featured version of Intercalate, including a GUI Intercalate Generator, Template Editor, and Property List Editor with XML support, is available for purchase. To install, set JAVA_HOME to your JDK or make sure the java command is in your path, and add the bin directory to your path. To run Intercalate, issue the following command.
Prompt> intercalate <template file> <property list file>
Next we describe the process of building an Intercalate property list and template from a prototype of the desired output. It starts by making the output text the initial template. Then we identify pieces that are properly part of the object rather than the view and move those to the property list, leaving behind keys as placeholders in the template. We add control keywords to repeat portions of the text, and to produce output only if a condition is true. Finally we look at directives to control the input and output of the generated text.
Prototype
    create table Person (
        Name varchar(100),
        ID number
    );
    alter table Person add primary key (ID);
The Intercalate analysis process starts with a prototype of the text we want the template and property list to represent and to generate. For this tutorial we will use as an example a snippet of SQL, which creates a table called Person having two columns, Name and ID. The Name column can hold up to 100 alphanumeric characters. The ID column is strictly numeric and serves as a primary key.
Much of the text in this prototype would also be present in the SQL used to create any other table. To take advantage of this, we use the prototype as the first step in the evolution of a general template. The Intercalate modeling process incrementally identifies words specific to this table and moves them to the property list, leaving a key behind in the template to show where they should go. The table name, Person, is the first and most obvious example of a feature that would be specific to the Person table.
Property List
    table = "Person";
A property list is a table of keys and values. The keys are given names, which identify the role the values take on in the model. The table is organized this way to illustrate and clarify the relationships between values. Here, we add a key called table to the property list because it represents the most fundamental attribute of the database table, its name. The value of the key is the actual name of the table, Person.
Template
    create table $table$ (
        Name varchar(100),
        ID number
    );
    alter table $table$ add primary key (ID);
Back in the SQL prototype, we replace the value we just identified, Person, with its key in the property list. The presence of a key turns the prototype into a simple template. If this template were used with Intercalate to filter the property list we just created, the result would be a file identical to the original prototype.
I would like to digress for a moment and draw your attention to the notation used here to represent a key. An Intercalate template is an ASCII file, which contains no formatting or font information, so a key cannot be set apart typographically. To reduce confusion, the keys in a template are instead enclosed in dollar signs, a character that seldom appears in ordinary text. Where a dollar sign must appear in the output text, it is escaped by doubling it in the template as $$.
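As a rough illustration of the escaping rule, the Python sketch below treats a doubled dollar sign as a literal one and anything else between dollar signs as a key. The function name expand and the sample SQL line are invented for this example, and the real Intercalate tokenizer may well handle edge cases differently.

```python
import re

# Try the escape first, then a key enclosed in dollar signs.
TOKEN = re.compile(r"\$\$|\$(\w+)\$")

def expand(template, properties):
    def replace(match):
        if match.group(0) == "$$":
            return "$"  # escaped dollar sign
        # Assumption: a key that is not found yields an empty string.
        return str(properties.get(match.group(1), ""))
    return TOKEN.sub(replace, template)

line = expand("update $table$ set fee = '$$5';", {"table": "Person"})
print(line)
```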
We proceed to identify the next piece of information specific to this table that we wouldn't expect to occur universally in other table creation scripts. The fact that the Person table has a column called Name is clearly not a general characteristic of all tables. The Name column similarly has characteristics, specifically the data type and size information, that are specific to that column. We extract all these pieces of information and give them each individual keys in the property list.
Property List
    table = "Person";
    column = "Name";
    type = "varchar";
    size = "100";
We add three new keys and values to the property list. The column key represents the column name, Name, the type key represents the data type, varchar, and the size key represents the storage capacity, 100.
Template
    create table $table$ (
        $column$ $type$($size$),
        ID number
    );
    alter table $table$ add primary key (ID);
Back in the template, we replace the values we just put in the property list with the corresponding keys.
Now we turn to the second column of the table. This column has a name, ID, and a data type, number, but no explicit capacity is specified.
Property List
    table = "Person";
    columns = (
        { column = "Name"; type = "varchar"; size = "100"; },
        { column = "ID"; type = "number"; }
    );
Because there is more than one column in the table, we denote this by collecting the columns into a list, which is delimited by parentheses. This entire list is given the name columns. The entries in the list are the column dictionaries, or collections of attributes specific to each column.
A distinction is made between these two types of collection: lists and dictionaries. A list is an ordered array of elements, while a dictionary is an unordered collection of unique keys with values. A value may be a simple string, or another dictionary or array. Most of the values in this example are strings. The value of the columns key is an array. Later we will see uses for values that are dictionaries.
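For readers who think in terms of a general-purpose language, the same structure maps naturally onto Python's built-in types: a dictionary becomes a dict, a list becomes a list, and everything else is a string. The helper kind below is a made-up name used only to label the three cases.

```python
# The Person property list written with Python literals.
person = {
    "table": "Person",
    "columns": [
        {"column": "Name", "type": "varchar", "size": "100"},
        {"column": "ID", "type": "number"},
    ],
}

def kind(value):
    """Name the property list type that a Python value stands for."""
    if isinstance(value, dict):
        return "dictionary"
    if isinstance(value, list):
        return "array"
    return "string"

kinds = {key: kind(value) for key, value in person.items()}
print(kinds)
```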
We add the dictionary of keys and values representing the second column as the second element of the columns list.
Template
    create table $table$ (
        $for columns$
        $column$ $type$($size$),
        $end$
    );
    alter table $table$ add primary key (ID);
To work with lists, we introduce a new kind of directive into the template. The for directive repeatedly generates a section of text using keys from each element of a list in succession. In this case, the for directive generates for each element of the columns list a line of text containing column name, type, and size. The end directive identifies the end of this repeating section of text.
Results
    create table Person (
        Name varchar(100),
        ID number(),
    );
    alter table Person add primary key (ID);
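Let's look at the result the template and property list would generate at this stage of development if run through the Intercalate engine. There are two things wrong with it: the ID column, which has no size, is followed by empty parentheses, and a comma follows the last column.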
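To see where the stray characters come from, here is a rough Python imitation of the repeating section as it stands. It applies the same text to every element of the columns list and, as an assumption consistent with the result shown, turns a missing key into an empty string. The function name naive_rows is hypothetical.

```python
columns = [
    {"column": "Name", "type": "varchar", "size": "100"},
    {"column": "ID", "type": "number"},
]

def naive_rows(columns):
    """Emit 'column type(size),' for every element, with no conditions."""
    rows = []
    for col in columns:
        # A key the element lacks (size, for ID) is replaced by "".
        rows.append(f"{col.get('column', '')} {col.get('type', '')}({col.get('size', '')}),")
    return rows

rows = naive_rows(columns)
print("\n".join(rows))
```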
Template
    create table $table$ (
        $for columns$
        $column$ $type$ $if size$($size$)$end$ $unless _last$,$end$
        $end$
    );
    alter table $table$ add primary key (ID);
We solve these problems using conditional directives. There are two kinds of conditional directive. The if directive only generates text in its block if the specified condition is true, and the unless directive only generates text if the condition is false.
In this example, $if size$ tests whether a size key exists in the property list of the current element. If so, the size, enclosed in parentheses, is inserted in the output. If not, that section of text is omitted.
Furthermore, $unless _last$ tests whether the special _last key exists, and if not generates a comma. The _last key is automatically and transparently associated with the last element of any list. There is also a _first key associated with the first element, and an _index key associated with every element. The value of _index is the element's position in the list, starting with number 1. The _last and _first keys have no associated values.
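The automatic keys and the two conditionals can be sketched in Python as follows. This is an illustration under stated assumptions rather than Intercalate's own logic: the helper names each and column_rows are invented, the valueless _first and _last keys are represented by empty strings, and both conditions test only whether a key is present.

```python
def each(items):
    """Yield a copy of every element with _index, _first and _last added."""
    total = len(items)
    for position, item in enumerate(items, start=1):
        element = dict(item)
        element["_index"] = str(position)  # positions start at 1
        if position == 1:
            element["_first"] = ""
        if position == total:
            element["_last"] = ""
        yield element

def column_rows(columns):
    rows = []
    for col in each(columns):
        row = f"{col['column']} {col['type']}"
        if "size" in col:        # plays the role of $if size$
            row += f"({col['size']})"
        if "_last" not in col:   # plays the role of $unless _last$
            row += ","
        rows.append(row)
    return rows

columns = [
    {"column": "Name", "type": "varchar", "size": "100"},
    {"column": "ID", "type": "number"},
]
rows = column_rows(columns)
print("\n".join(rows))
```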
Review what we've learned so far.
What do we do about the column used as the primary key? Again, simply replacing that value with the column keyword would be ambiguous.
Property List
    table = "Person";
    columns = (
        { column = "Name"; type = "varchar"; size = "100"; },
        { column = "ID"; type = "number"; primary_key = ""; }
    );
The solution is to add a key with a distinctive and descriptive name, primary_key, to the property list entry for the ID column. The value of that key is not important, so we leave it empty.
Template
    create table $table$ ( );
    alter table $table$ add primary key ($for columns(select primary_key)$ $column$ $unless _last$, $end$ $end$);
Back in the template, we again use a for directive to select each column in the columns list in turn, but we add a qualifying select phrase to the directive which states that we are only interested in the elements which contain a primary_key key. The name of each column so identified is generated, followed by a comma unless it is the last element.
Note that qualifying the elements with a select phrase in the expression $for columns (select primary_key)$ $end$ is not quite the same as saying $for columns$$if primary_key$ $end$$end$. In the latter case the last primary key element may not be the last column in the list, and so will not have the _last key set.
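The difference shows up as soon as a primary key column is followed by an ordinary one. The Python sketch below contrasts the two orders of operation on a hypothetical three-column table (the Region column and the function names are invented for this illustration): filtering before iterating marks the last selected element, while testing inside the loop leaves _last on the final column of the whole list.

```python
def each(items):
    """Yield copies of the elements, marking the final one with _last."""
    total = len(items)
    for position, item in enumerate(items, start=1):
        element = dict(item)
        if position == total:
            element["_last"] = ""
        yield element

def key_list(elements):
    return "".join(e["column"] + ("" if "_last" in e else ", ") for e in elements)

columns = [
    {"column": "ID", "primary_key": ""},
    {"column": "Region", "primary_key": ""},
    {"column": "Name"},
]

# Like the select phrase: filter first, then iterate over the selection.
with_select = key_list(each([c for c in columns if "primary_key" in c]))

# Like a nested if: iterate over all columns, then test each one.
with_nested_if = key_list(e for e in each(columns) if "primary_key" in e)

print(repr(with_select))
print(repr(with_nested_if))
```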
Template
    create table $table$ ( );
    $if columns.primary_key$
    alter table $table$ add primary key ($for columns(select primary_key)$ $column$ $unless _last$, $end$ $end$);
    $end$
One final adjustment is needed. If no column is identified as being the primary key, we don't want to generate any part of the alter table command. To accomplish this we enclose the entire block of text in an if condition which tests if any column has a primary_key attribute.
Note the new syntax, which concatenates columns and primary_key with a dot. This says to look in the elements of the columns list for an item containing a primary_key key. The value the dotted key represents is the value of the first such key it encounters. The condition fails if there is none.
In the same manner, references may be made to elements inside nested dictionaries. Additional keys may be appended to reach any depth of nesting desired.
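One way to picture dotted keys is as a path that is followed one component at a time, descending into dictionaries by key and searching lists element by element until the first match. The Python sketch below follows that reading. The function names are hypothetical, and None is used here to signal a failed lookup so that an empty value such as primary_key = "" still counts as found.

```python
def dotted(properties, key):
    """Resolve a dotted key such as 'columns.primary_key'; None means failure."""
    return _resolve(properties, key.split("."))

def _resolve(value, parts):
    if not parts:
        return value
    if isinstance(value, list):
        # Search the elements in order and keep the first success.
        for element in value:
            found = _resolve(element, parts)
            if found is not None:
                return found
        return None
    if isinstance(value, dict) and parts[0] in value:
        return _resolve(value[parts[0]], parts[1:])
    return None

person = {
    "table": "Person",
    "columns": [
        {"column": "Name", "type": "varchar", "size": "100"},
        {"column": "ID", "type": "number", "primary_key": ""},
    ],
}
has_key = dotted(person, "columns.primary_key") is not None
first_size = dotted(person, "columns.size")
print(has_key, first_size)
```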
Review what we've learned (part 2).
Prototype
    public class People {
        private String Name;
        private Integer ID;
        public String getName() { return Name; }
        public Integer getID() { return ID; }
    }
Now let us turn from SQL and look at a Java class that might represent the dynamic aspect of the persistent information stored in the database table. Note that the Java class has many of the same features as the SQL example. The table name appears as the class name, and the column names appear in instance variables and accessor method signatures. The Java data types String and Integer aren't the same as their SQL counterparts, so we will give them a new key, javatype.
Template
    public class $table$ {
        $for columns$
        private $javatype$ $column$;
        $end$
        $for columns$
        public $javatype$ get$column$() { return $column$; }
        $end$
    }
Template construction for Java follows much the same pattern as with the SQL example. The class name is replaced by the table key. The instance variable names are replaced with column keys, and text is repeated for each entry in the columns list.
We've introduced a new key, javatype, for the Java data type. We could add this key with the corresponding value to each entry in the columns list, but since the Java type and the SQL type are systematically related, there is a better way.
Property List
    javatype = "$sql2java.type$";
    sql2java = {
        varchar = "String";
        number = "Integer";
    };
    table = "People";
    columns = ( );
Instead of inserting a javatype key and value in the property list for each column, at the top level of the property list we create a lookup table called sql2java, which contains one key for each SQL data type with its corresponding Java data type as the value. Here, an SQL varchar will become a Java String, and an SQL number will become a Java Integer.
Then we add a javatype key with a value that is itself a key. This is a new concept. In previous examples, key values were text to be replaced in the template, or lists of objects. When the value of a key is itself another key, that key is in turn evaluated to determine what value to generate.
Here the key is a compound one which says to look for the type key in the sql2java dictionary. Our first observation is that there is in fact no type key in the sql2java dictionary. This illustrates another feature of how Intercalate looks up keys in the dictionary. If a particular key is not found, the last component of the key is looked up independently, and the value so discovered used in its place. So, for example, in the Name column, the value of type is varchar. Varchar is then used in place of type, and lookup for the key sql2java.varchar succeeds, returning the value String.
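The two-step lookup can be traced in Python. The sketch below hard-wires the one situation described here, a two-component key whose last component is missing from the dictionary, and is not meant as a general account of how Intercalate resolves keys. The function name java_type is made up.

```python
properties = {
    "javatype": "$sql2java.type$",
    "sql2java": {"varchar": "String", "number": "Integer"},
}

def java_type(column, properties):
    """Resolve $sql2java.type$ for one column dictionary."""
    # Split the compound key into the dictionary name and its last component.
    dictionary_name, last = properties["javatype"].strip("$").split(".")
    table = properties[dictionary_name]
    if last in table:
        return table[last]
    # No literal 'type' key in sql2java: look up type on its own in the
    # column, then use that value (for example 'varchar') in its place.
    substitute = column.get(last, "")
    return table.get(substitute, "")

name_type = java_type({"column": "Name", "type": "varchar"}, properties)
id_type = java_type({"column": "ID", "type": "number"}, properties)
print(name_type, id_type)
```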
Property List
    javatype = "$sql2java.type$";
    table = "People";
    columns = (
        ...
        { column = "ID"; type = "number"; primary_key = ""; javatype = "BigInteger"; }
    );
Similarly, for the ID column, one would look up the type key, find its value to be number, look up the sql2java.number key and return the value Integer.
Suppose in general we want SQL numbers to be represented as Java Integer values, but that the numbers in the ID column are too large and must instead be represented as a Java BigInteger. We can insert a javatype key with a BigInteger value in the dictionary for the ID column. The value of this specific key is then used instead of looking up the value for the javatype key at the top level.
In fact, this specific-to-general search behavior takes place for every key, not just those used for lookup tables. Whenever there exists a series of nested structures, if a key is not found at the current level of nesting, the search is repeated at the level above, and so on until the top level is reached. Only if all such searches fail will the key lookup fail and return an empty string as the value.
This search mechanism is an excellent way to represent default values. If all columns in a table are numeric, for example, one might place a type key with a number value at the table level and omit the key entirely from the individual column entries.
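The search from specific to general amounts to walking a chain of scopes from the innermost outward. Here is a minimal Python sketch of that idea, using a hypothetical Ledger table whose columns default to number. The function name lookup and the sample data are assumptions made for illustration.

```python
def lookup(key, scopes):
    """Search the scopes from innermost to outermost; return "" on failure."""
    for scope in scopes:
        if key in scope:
            return scope[key]
    return ""

# Hypothetical table: the default type lives at the table level.
table = {"table": "Ledger", "type": "number"}
columns = [
    {"column": "ID"},                       # inherits type = number
    {"column": "Memo", "type": "varchar"},  # overrides the default
]

types = [lookup("type", [col, table]) for col in columns]
missing = lookup("size", [columns[0], table])
print(types, repr(missing))
```

A key set on an individual column wins because its dictionary is searched first, which is the same reason a javatype key placed in the ID column's dictionary takes precedence over the one at the top level.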
Review what we've learned (part 3).
Property List
    tables = (
        { table = "Person"; },
        {
            table = "Car";
            columns = (
                { column = "VIN"; },
                { column = "make"; },
                { column = "owner_id"; }
            );
        }
    );
Now let's extend the property list to add another table. This one holds information about cars. We create a list of tables with the Person property list as its first element, and add a second property list with the table name and columns needed to store information about a car.
Template
    $for tables$
    create table $table$ ( );
    $if columns.primary_key$
    alter table $table$ add primary key ( );
    $end$
    $end$
The template to generate SQL is easily modified to iterate over the list of tables. For each entry in the list, this template generates the SQL to create the table and identify its primary key column.
Template
    $for tables$
    public class $table$ { }
    $end$
The template for generating Java classes can be similarly modified to iterate over the tables in the list.
Template
    $for tables$
    $write table.java$
    public class $table$ { }
    $end$
    $write$
When generating SQL, it is often desirable to have the script to create an entire schema in a single file, both to maintain consistency and to make the handling and execution simpler. When generating Java, on the other hand, it is generally the case that you would like each class to be placed in an individual file, named the same as the class it represents. The write keyword allows you to do just that.
By default, Intercalate sends the text generated by merging a template and property list to the standard output stream. This conveniently permits it to work with the output redirection available in Unix and Windows. Whenever Intercalate encounters a write keyword in a template, it constructs a file name from the following key and thereafter sends output to that file. If the write keyword appears without a following key, subsequent text once more is sent to the standard output stream.
The key used for generating file names is handled slightly differently from other keys. In our example, since the value of the table key is a simple string and not a structure, ordinarily the compound key table.java would fail lookup and return an empty string. When used with the write keyword, however, the file name generated is the value of the last successful lookup concatenated with the rest of the key. In our example, the file names will be Person.java and Car.java.
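The file name rule can be approximated in Python as follows, using the table names Person and Car from the tables property list shown earlier. This is a guess at the behavior from the description above, with an invented function name; in particular, what Intercalate does when even the first component fails is not stated, so the sketch simply returns the key unchanged in that case.

```python
def output_name(key, scope):
    """Build a file name from a dotted key such as 'table.java'."""
    parts = key.split(".")
    value = scope
    for position, part in enumerate(parts):
        if isinstance(value, dict) and part in value:
            value = value[part]  # lookup succeeded, keep descending
        else:
            # Lookup failed: join the last successful value to the rest.
            rest = ".".join(parts[position:])
            return f"{value}.{rest}" if isinstance(value, str) else rest
    return value if isinstance(value, str) else ""

tables = [{"table": "Person"}, {"table": "Car"}]
names = [output_name("table.java", entry) for entry in tables]
print(names)
```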
Template
    $for tables$
    $write table.java$
    public class $table$ {
        $include table.extras$
    }
    $end$
    $write$
The include directive can be thought of as the converse of the write directive. Rather than redirecting the output, it tells Intercalate to pause in the processing of the current template to work with the template named by the specified key. A file name is constructed from the key just as it is with the write directive. When the included template has been completely processed, Intercalate resumes processing of the current template.
The include directive is a very powerful mechanism for customizing output. In this example, although the basic template produces instance variable declarations and accessor methods in a uniform manner, any complex functionality, which distinguishes one Java class from another, can be provided in its own template.
If the included file doesn't exist or for some other reason cannot be opened, Intercalate ignores the directive and continues processing the current template without interruption.
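The forgiving treatment of missing files is easy to mirror in Python. In the sketch below, include is a hypothetical helper that reads a template file and hands its text to a processing function, and it contributes nothing to the output when the file cannot be read. The file name used is deliberately one that should not exist.

```python
from pathlib import Path

def include(path, process):
    """Return the processed text of an included template, or "" if unreadable."""
    try:
        text = Path(path).read_text()
    except OSError:
        # Missing or unreadable file: skip the directive and carry on.
        return ""
    return process(text)

body = "public class Person {" + include("no-such-dir/Person.extras", str.strip) + "}"
print(body)
```

Because a failed include is silent, a per-class extras file is optional: classes that need custom code supply one, and the rest generate normally.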
I will end by reiterating the benefits that Intercalate provides the developer and system maintainer.
Intercalate provides a structured representation of domain objects and powerful expressions for constructing templates or views of those objects. It ensures that changes to objects or views take effect in all their representations, and reduces the effort to maintain and reuse objects.