Commit dd3b0c1c authored by Kai-Holger Brassel

Two of four parts nearly finished

parent f63ec247
<p xmlns:dct="" xmlns:cc="" class="license-text"><a rel="cc:attributionURL" property="dct:title" href="">How to Implement Parameter Catalogs with Eclipse</a> by <span property="cc:attributionName">Kai-Holger Brassel</span>, Hamburg, is licensed under <a rel="license" href="">CC BY-NC-ND 4.0<img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="" /><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="" /><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="" /><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="" /></a></p>
= Parameter Catalogs for Simulation
Kai-Holger Brassel, Hamburg, <>
:toclevels: 2
This work by Kai-Holger Brassel, Hamburg, is licensed under CC BY-NC-ND 4.0.
Version: October 19^th^, 2020.
== Introduction
This introduction discusses the work of the author and others without giving bibliographic references. For now, it is meant only as background for the technical documentation in the sections that follow.
It could perhaps be developed into a more formal paper later.
The overall motivation for the work on parameter catalogs for simulation is to make it easier to develop and perform computer simulations in complex and _data-rich_ domains like building physics, transportation, and all kinds of urban infrastructure.
=== The Bigger Picture
A good part of computer science was and is driven by the motivation to make it easier to develop computer programs of all sorts.
"Higher" programming languages were invented to make programs human-readable, and soon special constructs for _functional programming_ (computation without side effects) and _structured programming_ (computation without go-to statements) were introduced to help programmers write and understand ever-growing programs.
Then, between 1962 and 1967, the programming language Simula was developed specifically to deal with the challenges of simulating systems comprising many different types of objects.
This opened the door to more direct computer representations of real-world objects, their attributes, relationships and behavior, ultimately leading to _object-oriented_ software development that today is embodied in programming languages like Java, C++, Python, and graphical notations like the Unified Modeling Language (UML).
While these achievements boosted the productivity of software developers, the creation of correct, efficient and maintainable programs -- including simulations -- still required a great deal of expert knowledge and experience.
To overcome this bottleneck, so-called 4th generation languages entered the stage, starting in the 1970s.
These languages were tailored to specific tasks like statistics ("S" 1976, "R" being its successor), database programming (SQL 1979), or simulation (MATLAB around 1979, Mathematica 1988, Modelica 1999), to name a few.
By sacrificing generality, these special languages became more accessible to domain experts, not just trained software developers.
To flatten the learning curve even more, formal _graphical_ languages for special purposes were invented, e.g. Simulink for block diagram simulation models in 1984, entity-relationship diagrams for data modeling in 1976, UML for object-oriented systems design in the 1990s, or graphical languages to specify business and also scientific workflows around 2000.
This very short history of technologies for the development of software in general, and simulations in particular, is meant to illuminate the tools at our disposal:
* general purpose programming languages that combine structured, functional and object-oriented approaches to enable the creation of big, modular software systems, often called "programming in the large"
* formal textual domain specific languages (DSLs) dedicated to solve specific tasks with ease
* formal graphical DSLs.
Note that DSLs tend to describe _what_ a computation shall achieve rather than _how_ to achieve it in detail.
Therefore, DSLs usually look more like a model than like an algorithm.
Now back to the task at hand.
Some domains deal with a few types of simple objects to be simulated.
Take the building blocks of an electric circuit as an example.
While the algorithms to simulate these correctly and efficiently may be quite complex, the model elements can usually be described by very few parameters like resistance or capacitance.
More complex domains like (regenerative) energy systems or building physics deal with more complex objects to be simulated, e.g. PV modules or layered walls of buildings, which often come in different types and configurations with dozens of possibly interdependent parameters.
=== Lessons Learned
First a note on terminology: Instead of _parameter catalogs_, in SimStadt we used the term _library_, as in _building physics library_. Obviously this was not a good choice, since _library_ is used a lot in IT and programming with all sorts of meanings. We then started to talk about _data catalogs_, but in data science this term has a specific meaning, namely catalogs of data and data sources.
Since our catalogs, first of all, shall grant structured access to parameters for simulated entities, _parameter catalog_ sounds more appropriate to me.
The problem of navigating huge parameter spaces and assembling complex simulation models popped up as the author worked on a diagram editor for *INSEL*, a simulation language and runtime environment developed for renewable energy systems simulation.
To make existing catalogs on weather data, solar panels and inverter modules accessible to the modeler, special dialogs were added to the INSEL user interface that allowed browsing through the catalogs.
Using these browsers, the modeler would choose a weather station, panel or inverter to parameterize a corresponding INSEL function block.
However, this approach had some severe disadvantages:
. Parameter catalogs were stored in a proprietary data format on disk within the INSEL application distribution, meaning they could not be used independently of INSEL by other interested parties (systems or users).
. The catalogs had to be maintained by editing text files manually.
. While INSEL modelers could browse the catalogs, searching and sorting were not supported.
. Developing Java Swing UIs for the different kinds of catalogs was time-consuming, as was their maintenance, e.g. whenever a catalog data format changed.
. Embedding UIs for handling large amounts of data in a diagram editor is not very user-friendly.
From 2013 to 2016, the simulation platform *SimStadt* was developed to make specific modeling and simulation workflows accessible to experts in urban planning and energy systems.
SimStadt uses INSEL and other simulators under the hood, and the use of 3D city data, provided as CityGML files, was a core requirement of this project.
To enable simulation of, say, the heating demand of a district, geometric building data had to be enriched with data on building physics and usage.
To do so, existing information about building physics and usage -- often only available as informal typologies or tables -- had to be provided to the SimStadt user on an abstract level, e.g. to choose between refurbishment scenarios.
At the same time, specific building configurations and parameter sets had to be injected into the simulation models to obtain the desired results.
Again, we implemented parameter catalogs to fulfill these requirements, but compared to the quite simple catalogs used in INSEL, the data for building materials, window, wall and roof types as well as the typologies of buildings, households, usage patterns, and so on were more intricate.
They had to be created iteratively in collaboration with domain experts.
In this situation, manually coding data formats and data access in a general-purpose programming language would have led to relatively long iteration cycles and high communication effort between programmer and domain expert.
Instead, we decided to use a DSL for data modeling and to use code generation wherever possible.
Since SimStadt was developed within the Java ecosystem, we followed this standard approach:footnote:[A similar approach is in use to standardize extensions to CityGML via so-called application domain extensions (ADEs) like the Energy ADE for exchanging energy-related data.]
. Developer and domain expert create a first version of the data model as an XML Schema Definition (XSD), which serves as our DSL.
. For plausibility checks, any standard XML editor can be used to create example data conforming to the XSD.
. With JAXB (Java Architecture for XML Binding), Java code is generated to read our XML catalogs into Java objects that, in turn, can be accessed by SimStadt workflows to generate and parameterize simulations.
. If required, developer and domain expert go back to step one to refine the data model and catalog data.
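
The read-in step of this workflow can be illustrated with a minimal sketch.
JAXB is an add-on library in current Java versions, so the sketch uses the DOM parser bundled with the JDK as a stand-in, and a hand-written class where JAXB would generate one.
The element name `material`, its attributes, and the class `CatalogReaderSketch` are assumptions made for illustration; they are not taken from the SimStadt schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CatalogReaderSketch {

    /** Hypothetical catalog entry; a binding tool would generate such a class from the schema. */
    public static final class Material {
        public final String name;
        public final double conductivity; // assumed unit: W/(m*K)

        public Material(String name, double conductivity) {
            this.name = name;
            this.conductivity = conductivity;
        }
    }

    /** Parses an XML catalog string and maps every material element to a Java object. */
    public static List<Material> readCatalog(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("material");
            List<Material> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                result.add(new Material(
                        e.getAttribute("name"),
                        Double.parseDouble(e.getAttribute("conductivity"))));
            }
            return result;
        } catch (Exception ex) {
            // Wrap parser and I/O errors so callers need no checked exceptions.
            throw new IllegalStateException("Could not read catalog", ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical example data, as one might create it in an XML editor.
        String xml = "<catalog>"
                + "<material name=\"Concrete\" conductivity=\"2.1\"/>"
                + "<material name=\"Mineral wool\" conductivity=\"0.04\"/>"
                + "</catalog>";
        for (Material m : readCatalog(xml)) {
            System.out.println(m.name + ": " + m.conductivity);
        }
    }
}
```

With a binding tool such as JAXB, the `Material` class and the mapping loop would not be written by hand but derived from the schema, which is what shortens the iteration cycle described above.
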
After the data model for building physics catalogs had matured, we developed a desktop application for convenient creation and maintenance of building physics data catalogs separate from SimStadt.
It was developed in Java with a user interface written in JavaFX and was well received by domain experts.
However, when a different catalog -- this time for building usages -- had to be created, it was quite difficult to reuse the XML schema and application code from the building physics catalog: The usage catalog data model was "pressed" into a form similar to the building physics catalog data model, and the UI code was "over-engineered" to accommodate both catalogs' requirements.
=== Low-Code-Development of Parameter Catalogs
From INSEL and SimStadt we learned that manual and automatic construction and parameterization of complex simulation models with many types of interrelated objects should be supported by means of domain-specific parameter catalogs.
Close collaboration with domain experts in designing and implementing these catalogs in short development cycles is desirable.
Parameter catalogs and the software for their creation, maintenance and deployment should be independent of any specific simulation software, (a) to be reusable and (b) not to overload simulation applications.
In SimStadt, catalog development was partly facilitated by a textual DSL for data modeling (XML schema language) and automatic generation of Java code from it.
On the other hand, user interfaces as well as the generation and parameterization of simulations from templates within SimStadt workflows still had to be coded manually, hindering the routine creation of new catalogs.
Now, in 2020, several developments in different projects provide an opportunity to re-think the topic of parameter catalogs for simulations, namely:
. Plans for a new Urban Simulation Platform at Concordia University, Montreal.
. New implementation of INSEL user interface based on the Eclipse application framework and Eclipse-Sirius diagram editors.
. Enhancement of existing building physics and usage catalogs from SimStadt and their adaptation to new regions.
. Development of a new comprehensive catalog of electric systems components to be used in SimStadt as well as in Concordia's Urban Simulation Platform.
In what follows, the new technology stack used to implement (4) is documented in detail.
It uses four technologies to replace manual coding by code generation from models:
* _Ecore_ to model the catalog's data and generate Java classes and persistence layer from it.
* _EMF Forms_ for modeling and generating tables, forms and buttons to **c**reate, **r**ead, **u**pdate, and **d**elete data (CRUD).
* _E4_, the Eclipse way of modeling the application user interface itself, e.g. the placement and behavior of views, editors, toolbars, menus, and more.
* A template engine called _Handlebars_ to generate fully parameterized simulation models from textual templates without programming.
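
The idea behind template-based generation can be sketched without the Handlebars library itself.
The following stand-in replaces placeholders written in double curly braces using a regular expression; it is not the Handlebars API and supports none of its helpers or block expressions.
The template text and the parameter names are hypothetical and do not represent INSEL syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateSketch {

    // Matches {{name}} with optional whitespace inside the braces.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\{\\{\\s*([A-Za-z_][A-Za-z0-9_]*)\\s*\\}\\}");

    /** Replaces every placeholder in the template with the matching value from the map. */
    public static String render(String template, Map<String, ?> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            if (!values.containsKey(key)) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(values.get(key))));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical parameters as they might come from a catalog entry.
        Map<String, Object> panel = new LinkedHashMap<>();
        panel.put("name", "ExamplePanel");
        panel.put("peakPower", 320);

        String template = "block pv name={{name}} peak_power={{ peakPower }}";
        System.out.println(render(template, panel));
    }
}
```

A catalog entry supplies the values, and the template supplies the structure of the simulation model, so adding a new component type requires a new template rather than new program code.
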
The new technology stack is rooted in the Eclipse application framework and ecosystem.footnote:[An alternative, completely different approach would be to combine several web applications and services via portal software in web browsers.]
Its main advantage is the possibility to implement CRUD applications like parameter catalogs and their underlying data models with no or very few lines of handwritten code (_low-code-development_).
The plan is to use the same approach for the implementation of (3) as well.
Since task (2) and maybe (1) will use Eclipse, too, close integration of parameter catalogs and simulation environments seems feasible.
For example, a user could drag an electric system component from a catalog onto an INSEL block for parameterization.
The Eclipse application framework offers:
* OSGi plug-in mechanism and UI framework for integrating applications and services
* _E4_ application model to declaratively describe the structure of the user interface
* General notion of _project_ with specific file types, help system, preferences etc.
* IDE support for important general purpose languages like Java, Python, Ruby, C, Fortran, C++
* Support for creating textual and graphical DSLs (XText, Sirius)
* Industry proven DSLs and code generators for data models and form based UIs via the _Eclipse Modeling Framework_ (EMF) providing:
** _Ecore_ for model-driven generation of Java classes and persistence layers for XML or databases
** _EMF Forms_ for describing and generating form based UIs
** Mechanisms to adapt or extend data models and forms to special needs (e.g., we added quantities -- that is numbers _with units_ -- to Ecore and EMF Forms, a feature very important for parameter catalogs)
* Rich open source eco-system with lots of plugins and projects important for an urban simulation platform:
** model server for distributed access and work on Ecore models, including model comparison and migration (CDO, EMFCompare)
** a Python implementation of Ecore
** GIS: storage, processing, and visualization of geographical data (list of projects under the umbrella LocationTech, e.g. user-friendly desktop internet GIS uDig)
** workbench for traffic simulation (SUMO)
** spatial multi-agent-simulation (GAMA-Platform)
** scientific workflows (Triquetrum)
** visualizations (Nebula)
** machine learning (deeplearning4j)
** 45+ projects in the area of IoT
** ...
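
A quantity in the sense used in the list above pairs a number with a unit.
The following sketch shows the idea in plain Java; the class `Quantity` and its methods are hypothetical and do not reflect the actual extension made to Ecore and EMF Forms.

```java
public class QuantitySketch {

    /** Hypothetical value type: a number that always carries its unit. */
    public static final class Quantity {
        public final double value;
        public final String unit;

        public Quantity(double value, String unit) {
            this.value = value;
            this.unit = unit;
        }

        /** Adds two quantities; mixing units is rejected instead of silently converted. */
        public Quantity plus(Quantity other) {
            if (!unit.equals(other.unit)) {
                throw new IllegalArgumentException(
                        "Unit mismatch: " + unit + " vs. " + other.unit);
            }
            return new Quantity(value + other.value, unit);
        }

        @Override
        public String toString() {
            return value + " " + unit;
        }
    }

    public static void main(String[] args) {
        Quantity a = new Quantity(2.0, "kW");
        Quantity b = new Quantity(0.5, "kW");
        System.out.println(a.plus(b));
    }
}
```

Keeping the unit next to the number lets a catalog reject inconsistent input early, which matters when parameters from different sources are combined.
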
As always, all that glitters is not gold. When we go through the details below, some bugs and inconsistencies, typical for open source projects of this age and size, have to be addressed.