Semantic data model

Overview of the semantic data model used to structure time series and meta data of plants, buildings, districts, or energy systems in order to apply generic algorithms.

Overview

The .io platform enables semantic and unified data modelling of time series and metadata by mapping these data to our own component data models. Our component data models are based on real components of buildings and energy systems, which allows the underlying data to be structured intuitively. The created components can be used by our .analytics and .controls algorithms. Our API provides access to the data stored as components, so the components can also be integrated into third-party algorithms.

In this article, we first provide information on semantic modelling of energy systems with our component data models, followed by an explanation of the mapping process. Subsequently, we will show how data points are modeled in accordance with the Project Haystack guidelines and vocabulary and how Haystack entities can be mapped to our component data model. Lastly, we will show how to apply the semantic data model in generic algorithms.

Semantic modelling

A granular approach to semantic modelling makes it easier to capture the individual characteristics of plants, buildings, districts, and energy systems. For example, a heating circuit can be composed of semantic component models of boilers, pumps, valves, pipes, and heat consumers. Because these semantic models describe components, we call the data structures that store the semantic data component data models. A component data model provides generic placeholders, each with a defined semantic meaning, to which data can be mapped. Pins are the placeholders for datapoints and their time series data. Attributes are the placeholders for metadata. To use a component data model in a specific project, a component is instantiated for that project. The instantiated component is then individualized by mapping the project's specific datapoints and metadata to its placeholders.
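The relationship between a component data model, its pins and attributes, and an instantiated component can be sketched as a small data structure. This is an illustrative sketch only, not the platform's implementation: the class names, the `AREA_M2` attribute, the project name, and the datapoint id are all hypothetical, while the pin names follow the room examples used later in this article.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ComponentDataModel:
    """Generic template: names the placeholders (pins, attributes) it offers."""
    name: str
    pins: frozenset[str]
    attributes: frozenset[str]


@dataclass
class Component:
    """A component instantiated from a model for one specific project."""
    model: ComponentDataModel
    project: str
    pin_mapping: dict[str, str] = field(default_factory=dict)  # pin -> datapoint id
    attribute_values: dict[str, object] = field(default_factory=dict)

    def map_pin(self, pin: str, datapoint_id: str) -> None:
        # Only placeholders defined by the model can receive a datapoint.
        if pin not in self.model.pins:
            raise KeyError(f"{self.model.name} has no pin {pin!r}")
        self.pin_mapping[pin] = datapoint_id

    def set_attribute(self, attribute: str, value: object) -> None:
        if attribute not in self.model.attributes:
            raise KeyError(f"{self.model.name} has no attribute {attribute!r}")
        self.attribute_values[attribute] = value


# Hypothetical room model with two pins and one assumed attribute.
ROOM = ComponentDataModel(
    name="ROOM",
    pins=frozenset({"ROOM+T_AIR_SP", "ROOM+CO2_SP"}),
    attributes=frozenset({"AREA_M2"}),
)

# Instantiate the model for a project and individualize it.
room = Component(model=ROOM, project="demo-project")
room.map_pin("ROOM+T_AIR_SP", "dp-0042")  # hypothetical datapoint id
room.set_attribute("AREA_M2", 15)
```

Keeping the model immutable and the mapping on the instance mirrors the split described above: the model defines the semantics once, and each project only supplies the mapping.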

For more information, view the list of available component data models.

Info

In a nutshell: Large-scale systems such as plants, buildings, districts, and energy systems are modeled using sets of individual component data models. Mapping datapoints and attributes to those components structures the time series and metadata of the modeled system.

Mapping process

Mapping is the process of connecting datapoints to the pins of a component and writing metadata to the attributes of a component within a specific project.

Mapping datapoints to the pins of instantiated components specifies the semantic meaning of the datapoints and the component they belong to. Information on this component affiliation can be taken from planning documentation, system overviews, and native metadata.

Setting attributes of components adds information about the dimensions and characteristics of the instantiated component. Attribute values can be taken from data sheets of technical equipment or from planning documentation of the system, or they can simply be estimated.

As soon as datapoints and/or attributes are mapped to an instantiated component, the generic algorithms of the platform can be applied to it.

Application in algorithms

The semantic structure of time series and metadata is the glue that connects the data of a specific project with algorithms. To make this statement more tangible: If a datapoint of a specific project is semantically described as the supply temperature sensor of a boiler, any algorithm that evaluates time series data from a boiler's supply temperature sensor can use this information to address that specific datapoint. In other words, algorithms use the pins and attributes of the component data model as input variables and the mapping information of instantiated components as the values of those input variables.
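The idea that pins act as input variables and the mapping supplies their values can be illustrated with a minimal sketch. Everything here is assumed for illustration: the pin name `T_SUPPLY`, the datapoint ids, the in-memory time series store, and the function itself are hypothetical and do not represent an actual platform algorithm.

```python
# Hypothetical time series store: datapoint id -> measured values in degrees Celsius.
TIMESERIES = {
    "plant_a/boiler1/supply_temp": [68.0, 71.5, 70.0],
    "site_b/HK_VL_T": [55.0, 58.0, 57.5],
}

# Mapping information of two instantiated boilers in two different projects.
# The datapoint names differ, but the pin name is the same.
boiler_project_a = {"T_SUPPLY": "plant_a/boiler1/supply_temp"}
boiler_project_b = {"T_SUPPLY": "site_b/HK_VL_T"}


def max_supply_temperature(pin_mapping, timeseries):
    """Generic algorithm: written against the pin name, never a datapoint id."""
    datapoint_id = pin_mapping["T_SUPPLY"]  # the mapping resolves pin -> datapoint
    return max(timeseries[datapoint_id])


max_a = max_supply_temperature(boiler_project_a, TIMESERIES)
max_b = max_supply_temperature(boiler_project_b, TIMESERIES)
```

The same function runs unchanged on both projects because it only depends on the semantic pin name; the project-specific naming is confined to the mapping.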

For more information on the availability of algorithms for a specific component data model, refer to analysis functions.

Project Haystack and component data models

Project Haystack is an initiative to standardize semantic data models and web services, making it easier to unlock value from the vast quantity of data generated by smart devices in homes, buildings, factories, and cities. Haystack defines a fixed set of general-purpose data types called kinds to facilitate the interoperable exchange of data. It can be thought of as an extended version of JSON that supports the JSON data types directly and adds further types. The primary collection type is a dict, which is a set of name and value pairs called tags. Haystack also includes a special data type called marker: a tag name without a value, where the presence of the marker itself is semantically meaningful. Markers are predefined in a set of vocabularies, while the value of a tag is generally not predefined and can be chosen individually for each project.

Semantic information results not only from the choice of tags and markers, but also from the context and the relationships between entities. In a Haystack model, the term entity describes a unique instance of an object or concept such as a room, sensor, pump, or system. An entity can have properties and relationships, which are described with tags and markers, and is always modeled as a dict (a collection of tags). The value of a relationship tag such as siteRef or equipRef is always the id of the connected entity (see Table 1).


id	@room_test
area	15
dis	Test_Room_01
doc	Example modeling of a Room Entity
siteRef	@building01
site	-
room	-

Table 1 Example of a model for a room entity in Haystack with five tags and two markers
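The room entity from Table 1 can be written down as a plain dictionary to make the distinction between tags with values, markers, and relationship tags concrete. This is a simplified sketch, not a Haystack library: the `MARKER` sentinel and the helper functions are hypothetical, and identifying relationship tags by the `Ref` suffix is an assumption based on the siteRef and equipRef examples above.

```python
# Hypothetical sentinel standing in for Haystack's marker kind (a tag without a value).
MARKER = object()

room = {
    "id": "@room_test",
    "area": 15,
    "dis": "Test_Room_01",
    "doc": "Example modeling of a Room Entity",
    "siteRef": "@building01",
    "site": MARKER,
    "room": MARKER,
}


def markers(entity):
    """Names of all tags that are present without a value."""
    return {name for name, value in entity.items() if value is MARKER}


def value_tags(entity):
    """All tags that carry a project-specific value."""
    return {name: value for name, value in entity.items() if value is not MARKER}


def relationships(entity):
    """Relationship tags, assumed here to be the value tags ending in 'Ref'."""
    return {name: value for name, value in value_tags(entity).items()
            if name.endswith("Ref")}


room_markers = markers(room)              # the two markers: site, room
room_value_tags = value_tags(room)        # the five tags with values
room_relationships = relationships(room)  # siteRef points to the building entity
```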

For further information on Project Haystack, refer to the Project Haystack documentation.

Challenges with the data model conversion

One of the main challenges is that the vocabulary of Project Haystack is standardized, but the way an entity is modeled, and which tags and markers are used for it, is not. To automatically convert Haystack entities to components, we need indicators that are predefined and common to all Haystack entities. Tags with values are not suitable as mapping indicators because they are too specific and vary from one entity to another. Tags such as dis or doc are useful for distinguishing between entities and contain valuable metadata about an entity and its semantics, but markers are more suitable because they are predefined and have a fixed meaning. Even so, the choice of markers is still made individually and may not be consistent across entities. Moreover, in Haystack, some information is derived from the relationships between entities. For example, an air temperature sensor may be modeled the same way everywhere, regardless of whether it is in a room or an air handling unit. The information about which component the air temperature sensor belongs to is then derived from the siteRef or equipRef tag.

Info

The main challenge is that even though vocabularies are predefined, the modeling approach of a Haystack entity can vary and, unlike the component data models, some information is derived from the relationships between entities and not solely based on the meaning of a pin in a component.

Conversion approach

To automatically convert entities from Haystack to the component data model, we can define a set of possible markers that can describe a certain pin in a component model. This is possible because all markers are predefined in Haystack and tied to a definition. Table 2 below shows an example for a selection of pins of the room component.

Pin of component model	Description	Possible Haystack marker
ROOM	ROOM	site, room, space
ROOM+CO2_SP	CO2 Setpoint	point, sensor, co2, sp
ROOM+T_AIR_ODA	Outside Air Temperature	point, sensor, temperature, air, outside
ROOM+T_AIR_SP	Room Temperature Setpoint	point, temperature, air, sp
ROOM+T_AIR_SP_ADJ	Room Temperature Setpoint Adjustment	point, temperature, air, sp

Table 2 Example of pins from the room component data model and corresponding Haystack markers

It is important to note that markers alone are not sufficient to uniquely describe all pins. An example can be seen in the last two rows of Table 2. The possible markers alone do not reveal that the last row is an adjustment of the room temperature setpoint and not the room temperature setpoint itself. There is no unique tag for the pin Room Temperature Setpoint Adjustment in the Haystack vocabulary. During a matching process, both pins would therefore be suggested, and further information from other tags may be needed to find the correct counterpart. To reduce the ambiguity of the existing Haystack definitions, we may need to provide additional definitions for some pins. This way, we can narrow down the possible matches and simplify the conversion process from Haystack to the component data model.
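The marker-based matching and its ambiguity can be demonstrated with a short sketch built on the marker sets from Table 2. The scoring rule is an assumption chosen for illustration (Jaccard overlap between the markers of an entity and the possible markers of a pin, returning all pins that share the best score); the article does not prescribe a specific matching rule, and `suggest_pins` is a hypothetical helper.

```python
# Possible Haystack markers per pin, taken from Table 2.
PIN_MARKERS = {
    "ROOM": {"site", "room", "space"},
    "ROOM+CO2_SP": {"point", "sensor", "co2", "sp"},
    "ROOM+T_AIR_ODA": {"point", "sensor", "temperature", "air", "outside"},
    "ROOM+T_AIR_SP": {"point", "temperature", "air", "sp"},
    "ROOM+T_AIR_SP_ADJ": {"point", "temperature", "air", "sp"},
}


def suggest_pins(entity_markers):
    """Return every pin that shares the best marker overlap with the entity.

    Assumed rule: Jaccard similarity, i.e. shared markers divided by all
    markers that appear on either side. Ties are all returned.
    """
    scores = {
        pin: len(entity_markers & possible) / len(entity_markers | possible)
        for pin, possible in PIN_MARKERS.items()
    }
    best = max(scores.values(), default=0.0)
    if best == 0.0:
        return []  # no marker in common with any pin
    return sorted(pin for pin, score in scores.items() if score == best)


# A setpoint entity matches two pins equally well, so both are suggested.
setpoint_candidates = suggest_pins({"point", "temperature", "air", "sp"})

# A room entity modeled as in Table 1 has a single best match.
room_candidates = suggest_pins({"site", "room"})
```

Because both setpoint pins carry identical marker sets, no marker-only rule can separate them; resolving the tie requires additional definitions or information from other tags, as described above.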

To overcome the issue that entities belonging to different components are modeled the same way, we cluster the Haystack entities by reading all relationship tags. By following these relationships, we can group together all entities that are connected and thus gather the semantic information contained in the relationships. For example, we can group together all entities that are connected to an air handling unit (AHU) and distinguish them from those that belong to a room or another component.
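A minimal sketch of this clustering step is shown below. It rests on several assumptions that are not part of the article: each entity is assigned to the target of its most specific relationship tag, the order of precedence is equipRef, then spaceRef, then siteRef, and the entity ids are made up for the example. `cluster_by_parent` is a hypothetical helper, not a platform function.

```python
from collections import defaultdict

# Assumed precedence: the most specific relationship decides the cluster.
REF_PRIORITY = ("equipRef", "spaceRef", "siteRef")


def cluster_by_parent(entities, ref_priority=REF_PRIORITY):
    """Group entity ids by the entity that their most specific reference points to."""
    clusters = defaultdict(list)
    for entity in entities:
        parent = next(
            (entity[tag] for tag in ref_priority if tag in entity), None
        )
        if parent is not None:  # entities without any reference stay unclustered
            clusters[parent].append(entity["id"])
    return dict(clusters)


# Two air temperature sensors that would carry identical markers; only their
# relationship tags reveal which component they belong to.
entities = [
    {"id": "@ahu01", "siteRef": "@building01"},
    {"id": "@room_test", "siteRef": "@building01"},
    {"id": "@ahu01_t_air", "equipRef": "@ahu01", "siteRef": "@building01"},
    {"id": "@room_t_air", "spaceRef": "@room_test", "siteRef": "@building01"},
]

clusters = cluster_by_parent(entities)
```

Grouping by the most specific reference keeps the AHU sensor and the room sensor in separate clusters even though both also reference the same site.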

Conclusion

Haystack and the component data models take fundamentally different approaches to representing semantic information, which makes it challenging to automatically convert Haystack entities to component data models. In Haystack, semantic meaning is attached to each marker and tag. The semantic meaning of an entity results from the combination of those tags and markers. Further semantic meaning can be derived from the relationships between entities.

In contrast, component data models use predefined components based on real-world devices and systems, where each component, pin, and attribute has a fixed semantic meaning. This means that conversion from Haystack to component models involves translating from a very flexible structure to a very rigid one.

In conclusion, to fully automatically convert a Haystack model to a component model or vice versa, we need further information about project-specific modeling guidelines in addition to the more generic approaches of clustering by relationships and the marker list. Examples are information on how the values of the dis tag are structured and what can be derived from them or, more generally, which project-specific measures were taken to uniquely identify entities within the data model and how the values of tags are structured.