Universal object compiler using BNF rules II
BNFUP is a class library that implements an object compiler from the definition of a language using BNF rules. It also provides rule editing services. In this article I continue by showing how to use the editor to compile and test your own objects using the language you have defined for them. I will also show you three implementation examples.
In this link you can read the first article of the series about the syntax definition and rule editor.
In this other link you can download the source code of the BNFUP project, which contains the class library that implements the compiler, the BNF rule editor and three example class libraries that implement objects that can be compiled from three different languages. It is written in C# using Visual Studio 2015.
Compiling and testing objects with BNFUPEditor
To show how to use the built-in compiler in the BNF rule editor, let's use the first example. This is a reduced version of the HTML table definition, which can be found in the htmltable.bnf file in the Samples subdirectory of the BNFUP solution.
The first button in the toolbar is used to select the class library with which we are going to generate the objects. In this case it is CompilableDataTable.dll, which is located in the CompilableDataTable\bin\Debug subdirectory of the BNFUP solution.
Once the class library is selected, the second button on the toolbar is enabled, and with it you can access the object compiler. The last button on the toolbar lets you configure whether the compiler is case-sensitive.
You can write the source code or load it from a text file using the first button in the toolbar. In the Samples subdirectory you can find the file tables.txt with an example. With the second button you can save the source code to a file.
Once you have written the code you want to compile, you will use the third button in the toolbar to start the compilation. If there are no syntax errors, the two remaining buttons on the toolbar will be activated. The first one is to test the generated object, while the second one, which will also be activated in case of failure, will display the messages generated during the compilation.
The result in this case is the compiled table shown in a DataGridView control:
Compilation messages are generated at three points. The first is when an object is created. The second is when it is completed, because the parsing of the element that generated it has finished, and it is added to the previous object in the hierarchy. The third is when an object is discarded, which happens when an optional element that generates objects is processed and the analysis fails.
The data displayed in the messages about the generated objects is obtained from the ToString method of each object. Here's what the message list looks like:
As for the source code of this class library, the ICompilableObjectFactory object is implemented in the DataTableFactory class. This is the CreateObject method with which objects are created:
public ICompilableObject CreateObject(string ctype, IRuleItem item)
{
    switch (ctype)
    {
        case "identifier":
            return new Identifier();
        case "emptyid":
            return new EmptyIdentifier();
        case "idlist":
            return new StringDataList();
        case "rowlist":
            return new BodyRowCollection();
        case "table":
            _data = new CompilableDataTable();
            return _data;
        default:
            return null;
    }
}
With the table identifier, the root object, CompilableDataTable, is created. The creation of this object is assigned to the rule <<table>>, which is the main rule of the language, so this is the first object to be created. The rest are added to it as they are created and completed, using the AddItem method:
public bool AddItem(ICompilableObject item)
{
    // The first list of strings received becomes the table header
    StringDataList sl = item as StringDataList;
    if ((sl != null) && (_table.Columns.Count == 0))
    {
        foreach (string s in sl)
        {
            _table.Columns.Add(new DataColumn(s));
        }
        return true;
    }
    // A row collection supplies the data rows
    BodyRowCollection bc = item as BodyRowCollection;
    if (bc != null)
    {
        foreach (StringDataList l in bc)
        {
            DataRow row = _table.NewRow();
            // Values beyond the number of columns are ignored
            for (int ix = 0; ix < Math.Min(_table.Columns.Count, l.Count); ix++)
            {
                row[ix] = l[ix];
            }
            _table.Rows.Add(row);
        }
        return true;
    }
    // Any other object is rejected
    return false;
}
This method must return true if the added object is accepted or false otherwise, which will result in a compilation error.
The table is composed of a header and a series of data rows. The header, with the names of the columns, is obtained from a StringDataList object, which is a list of text strings. The data rows are obtained from a BodyRowCollection object, which is a collection of StringDataList objects. With these objects, a DataTable object is constructed, which is displayed in a DataGridView control when the Test method is called.
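As a rough illustration of that construction, the following self-contained sketch builds a DataTable from a header and a list of rows using only System.Data. TableSketch and Build are hypothetical names chosen for this example and are not part of BNFUP; plain string lists stand in for the StringDataList and BodyRowCollection objects.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, for illustration only (not part of BNFUP).
public static class TableSketch
{
    public static DataTable Build(IList<string> header,
                                  IEnumerable<IList<string>> rows)
    {
        var table = new DataTable();

        // One column per header name
        foreach (string name in header)
        {
            table.Columns.Add(new DataColumn(name));
        }

        foreach (IList<string> cells in rows)
        {
            DataRow row = table.NewRow();
            // Copy only as many cells as there are columns;
            // cells missing from a short row stay as DBNull
            int n = Math.Min(table.Columns.Count, cells.Count);
            for (int ix = 0; ix < n; ix++)
            {
                row[ix] = cells[ix];
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

A row with more values than columns is truncated and a shorter one is padded with empty cells, which mirrors the Math.Min guard in the AddItem method shown earlier.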
The StringDataList object is created using the idlist identifier. In the definition of the language I have marked the <header>, <headerlist>, <datalist> and <row> rules with this identifier to generate this type of objects. The AddItem method with which these lists are constructed is the following:
public bool AddItem(ICompilableObject item)
{
    Identifier id = item as Identifier;
    if (id != null)
    {
        Add(id.Text);
        return true;
    }
    StringDataList dl = item as StringDataList;
    if (dl != null)
    {
        AddRange(dl);
        return true;
    }
    return false;
}
As you can see, this object can be composed by adding Identifier objects or another StringDataList object, from which all its content is taken.
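To see the effect of this merging in isolation, here is a minimal sketch. The ICompilableObject interface below is a simplified stand-in that models only AddItem, and deriving StringDataList from List<string> is an assumption suggested by the Add and AddRange calls above. ListDemo is a hypothetical helper.

```csharp
using System.Collections.Generic;

// Simplified stand-in for the library interface (assumption: only AddItem is modeled).
public interface ICompilableObject
{
    bool AddItem(ICompilableObject item);
}

public class Identifier : ICompilableObject
{
    public string Text { get; set; }

    // A leaf object accepts no children
    public bool AddItem(ICompilableObject item) { return false; }
}

// Assumption: the list is a List<string> that is also a compilable object.
public class StringDataList : List<string>, ICompilableObject
{
    public bool AddItem(ICompilableObject item)
    {
        Identifier id = item as Identifier;
        if (id != null)
        {
            Add(id.Text);
            return true;
        }
        StringDataList dl = item as StringDataList;
        if (dl != null)
        {
            AddRange(dl);   // absorb the nested list instead of nesting it
            return true;
        }
        return false;
    }
}

// Hypothetical demo: imitates a recursive rule that produces
// a list containing "a" followed by a nested list with "b" and "c".
public static class ListDemo
{
    public static StringDataList Build()
    {
        var inner = new StringDataList();
        inner.AddItem(new Identifier { Text = "b" });
        inner.AddItem(new Identifier { Text = "c" });

        var outer = new StringDataList();
        outer.AddItem(new Identifier { Text = "a" });
        outer.AddItem(inner);
        return outer;
    }
}
```

The list returned by ListDemo.Build is flat and holds the three strings in source order, so the container that finally receives it never sees nested lists.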
The Identifier objects are created using the identifier identifier, which is assigned to the <identifier> and <data> rules. The <identifier> rule is used to generate column names, as they must begin with a letter. With the <data> rule you create the contents of the rows of data, which can start with any character.
Finally, the table row collection is implemented in the BodyRowCollection class, whose AddItem method is as follows:
public bool AddItem(ICompilableObject item)
{
    StringDataList sl = item as StringDataList;
    if (sl != null)
    {
        Add(sl);
        return true;
    }
    BodyRowCollection bc = item as BodyRowCollection;
    if (bc != null)
    {
        AddRange(bc);
        return true;
    }
    return false;
}
The object accepts other objects of type StringDataList or another BodyRowCollection, from which it takes its contents. The reason that these objects are constructed using objects of the same type is that the rules are defined recursively, and this way, at the end, there is only a single object of this type to add to its main container, which is the object CompilableDataTable.
To create BodyRowCollection objects the rowlist identifier is used, which is assigned to the <body> rule.
Arithmetic expressions generator
The second example is a language that generates arithmetic expressions. These expressions can contain numbers, constants and up to three variables, which are identified by the letters x, y or z. You can use the most common operators, addition (+), subtraction or minus (-), multiplication (*), division (/) and exponentiation (^). You can also use subexpressions enclosed in parentheses.
The language is defined in the expressions.bnf file, in the Samples subdirectory of the solution directory. The implementation can be found in the Expressions/bin/Debug subdirectory. The project with the corresponding source code is Expressions.
The ExpressionsFactory class implements the required ICompilableObjectFactory object for compilation:
public ICompilableObject CreateObject(string ctype, IRuleItem item)
{
    switch (ctype)
    {
        case "number":
            return new Number(_exproot);
        case "constant":
            return new Constant(_exproot);
        case "variable":
            return new Variable(_exproot);
        case "operator":
            return new Operator();
        case "expression":
            Expression ex = new Expression(_exproot);
            if (_exproot == null)
            {
                _exproot = ex;
            }
            return ex;
        case "p-expression":
            Expression pex = new Expression(_exproot);
            pex.Parenthesis = true;
            if (_exproot == null)
            {
                _exproot = pex;
            }
            return pex;
        default:
            return null;
    }
}
The object types that can be created are Number (number identifier) for numbers, Constant (constant identifier) for constants, Variable (variable identifier) for variables, Operator (operator identifier) for operators, and Expression (expression and p-expression identifiers) for expressions. The p-expression identifier marks an expression as enclosed in parentheses. All of these classes, except Operator, are derived from a common base class, ExpressionBase.
The main object is Expression, which is composed of one or two arguments, each of which can be an expression, a number, a constant or a variable, and an operator. Expression is the only class that accepts components through the AddItem method. In this class I have also implemented the Simplify method of the ICompilableObject interface. It reduces an expression that only contains a number, variable or constant to that single object, so that the resulting object contains as few subexpressions as possible.
public override ICompilableObject Simplify()
{
    _exp1 = _exp1.Simplify() as ExpressionBase;
    if ((_exp2 == null) && (_op == null))
    {
        // Only one argument and no operator
        Expression ex1 = _exp1 as Expression;
        if (ex1 != null)
        {
            // Take over the contents of the nested expression
            _exp2 = ex1._exp2;
            _op = ex1._op;
            _exp1 = ex1._exp1;
            Parenthesis = Parenthesis || ex1.Parenthesis;
        }
        else
        {
            // A single number, variable or constant replaces this expression
            return _exp1;
        }
    }
    else if (_exp2 != null)
    {
        _exp2 = _exp2.Simplify() as ExpressionBase;
    }
    GroupLeft();
    return this;
}
The variables _exp1 and _exp2 contain the arguments of the expression, while _op contains the operator. If no operator or second argument exists and the first argument is not an expression, this element is returned as a result. If the first argument is an expression, its arguments and its operator are taken so that the number of nested subexpressions is reduced.
When calling the GroupLeft method, the subexpressions are grouped so that they are evaluated from left to right, because, due to language recursion, they are constructed in a way that inverts the order of evaluation, which would produce errors in the result.
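The source of GroupLeft is not shown here, so the following is only a sketch of the general idea under my own assumptions, not the author's implementation. Node and RegroupLeft are hypothetical names. The sketch covers only +, -, * and /, treats operators of the same precedence level as one chain, and leaves parenthesized subexpressions untouched. A right-recursive rule parses 10 - 4 - 3 as 10 - (4 - 3), and the regrouping turns it into (10 - 4) - 3.

```csharp
using System;

// Hypothetical, simplified expression node (not the Expression class of the article).
public class Node
{
    public double Value;        // only meaningful for leaves
    public char Op;             // '\0' marks a leaf
    public Node Left, Right;
    public bool Parenthesis;    // true when the source had explicit parentheses

    public bool IsLeaf { get { return Op == '\0'; } }

    public static Node Leaf(double v) { return new Node { Value = v }; }

    public static Node Bin(Node left, char op, Node right)
    {
        return new Node { Left = left, Op = op, Right = right };
    }

    public double Eval()
    {
        if (IsLeaf) return Value;
        double a = Left.Eval(), b = Right.Eval();
        switch (Op)
        {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            case '/': return a / b;
            default: throw new InvalidOperationException("Unknown operator " + Op);
        }
    }

    // Precedence level: 0 for + and -, 1 for everything else (* and / here)
    private static int Level(char op)
    {
        return (op == '+' || op == '-') ? 0 : 1;
    }

    // Rewrites a op1 (b op2 c) as (a op1 b) op2 c while the right operand
    // is an unparenthesized operation of the same precedence level.
    public static Node RegroupLeft(Node n)
    {
        while (!n.IsLeaf && !n.Right.IsLeaf && !n.Right.Parenthesis
               && Level(n.Right.Op) == Level(n.Op))
        {
            Node r = n.Right;
            Node newLeft = Bin(n.Left, n.Op, r.Left);
            n = new Node
            {
                Left = newLeft,
                Op = r.Op,
                Right = r.Right,
                Parenthesis = n.Parenthesis
            };
        }
        return n;
    }
}
```

For example, Node.Bin(Node.Leaf(10), '-', Node.Bin(Node.Leaf(4), '-', Node.Leaf(3))) evaluates to 9 as built, and to 3 after passing it through Node.RegroupLeft, which is the left-to-right result a reader expects. Exponentiation is deliberately left out because it is usually grouped from right to left.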
If you compile and test any expression, you get a result like this:
In the toolbar you can give value to variables and constants, while in the hierarchical list on the right side you can select different subexpressions to see the total or partial results.
Regarding the language, all tokens representing operators generate objects of type Operator, the rule <number> generates Number objects, <var> generates Variable objects, <const> generates Constant objects and the rules <<expr>>, <expr2>, <expr1>, and <expr0> generate Expression objects. The <pexpr> rule also generates Expression objects, but marked to indicate that the expression is enclosed in parentheses.
Function drawing
The last example that I am going to show is built on the previous one. It is the same expression language as above, to which I have added some rules to define a range of values for a variable and the values of constants, so that the function given by an arithmetic expression can be represented graphically.
The file with the language definition is graphic.bnf, which is located in the Samples subdirectory of the solution, and the class library that implements the objects is ExpressionDrawer.dll, in the project with the same name. I have taken the language of arithmetic expressions as a basis and I have added the <<graphic>>, <valuedef>, <varrange> and <constvalue> rules to add variable value range definitions and constant values.
The expression must contain only a single variable, although it may contain any number of constants. The ICompilableObjectFactory object is implemented in the GraphicFactory class. This class in turn uses an ExpressionsFactory object with which the elements corresponding to the arithmetic expression are constructed. In the Init method, the instance of this class is created:
public void Init()
{
    _eFactory = new ExpressionsFactory();
    _graphic = null;
}
And the CreateObject method is as follows:
public ICompilableObject CreateObject(string ctype, IRuleItem item)
{
    switch (ctype)
    {
        case "constvalue":
            return new ConstValue();
        case "varrange":
            return new VariableRange();
        case "graphic":
            if (_graphic == null)
            {
                _graphic = new ExpressionDrawer();
            }
            return _graphic;
        default:
            return _eFactory.CreateObject(ctype, item);
    }
}
You can find an example of source code for this language in the graphsample.txt file, in the Samples subdirectory of the solution:
graphic {
x^2+a;
x from -10 to 10 by 0,1;
a = 2;
}
When compiled, this code shows the following result:
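To make the sample more concrete, the points that a drawer would have to compute for it can be sketched as follows. This is not the ExpressionDrawer code. RangeSampler and Sample are hypothetical names, the function is passed as a plain delegate instead of a compiled Expression object, and I assume that the step written as 0,1 in the sample means one tenth.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only (not part of ExpressionDrawer).
public static class RangeSampler
{
    // Returns the (x, f(x)) pairs for x = from, from + step, ... up to 'to'.
    public static List<KeyValuePair<double, double>> Sample(
        Func<double, double> f, double from, double to, double step)
    {
        if (step <= 0)
        {
            throw new ArgumentException("step must be positive", "step");
        }
        if (to < from)
        {
            throw new ArgumentException("to must not be less than from", "to");
        }

        // Count the points once, with a small tolerance for rounding,
        // instead of accumulating x += step, which drifts over many steps
        int count = (int)Math.Floor((to - from) / step + 1e-9) + 1;

        var points = new List<KeyValuePair<double, double>>(count);
        for (int i = 0; i < count; i++)
        {
            double x = from + i * step;
            points.Add(new KeyValuePair<double, double>(x, f(x)));
        }
        return points;
    }
}
```

With a = 2, the call RangeSampler.Sample(x => Math.Pow(x, 2) + a, -10, 10, 0.1) corresponds to the sample above and produces 201 points, starting at (-10, 102).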
You can read more about the source code on the CodeProject website.