Distributed measurement systems are used today in many research and infrastructure scenarios. They enable the continuous collection of environmental and state data over long periods of time and across different locations – often in inhabited or hard-to-access environments.
At the Leibniz Institute for Tropospheric Research (TROPOS), such measurement systems are operated within the Department of Atmospheric Microphysics (AMP). As part of the EU-funded EDIAQI project, custom-developed sensors continuously collect indoor air quality data in private households. For the reliable transmission of these measurement data, EMQX by EMQ is used as a central MQTT platform. The collected data provide a basis for scientific analysis as well as for improving the understanding of indoor air quality and ventilation behavior.
The challenge: Operating distributed sensor networks reliably and at scale
In distributed measurement scenarios such as EDIAQI and in research infrastructures like ACTRIS, several typical requirements coincide: many measurement devices operate in parallel, the sensor landscape is heterogeneous, locations are geographically distributed, and on-site access often requires considerable effort. At the same time, data streams must remain consistent over long periods of time so that datasets can later be compared and analyzed.
Scalability and reliability
Measurement campaigns and long-term observations require the parallel collection of synchronized data streams from many devices. If individual sensors fail or connections become unstable, data gaps occur that directly affect scientific time series.
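How such gaps can be found in a time series is easy to illustrate. The following is a minimal sketch, not the project's actual tooling: the function name `find_gaps`, the 10-second sampling interval, and the tolerance factor are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (start, end) pairs where two consecutive samples are further
    apart than tolerance * expected_interval."""
    limit = expected_interval * tolerance
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Hypothetical series: one sample every 10 s, with samples missing after 20 s.
t0 = datetime(2024, 1, 1, 12, 0, 0)
samples = [t0 + timedelta(seconds=s) for s in (0, 10, 20, 60, 70)]
gaps = find_gaps(samples, timedelta(seconds=10))

for start, end in gaps:
    print(f"gap from {start.isoformat()} to {end.isoformat()}")
```

The tolerance factor keeps small timing jitter from being reported as a gap; a suitable value depends on the sampling scheme of the devices.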
Operation without permanent on-site access
For measurement devices deployed in private households, data collection must run continuously without disturbing the daily routines of participants. The devices therefore have to be operated remotely: system states need to be visible before disruptions become apparent in the data.
Data quality under heterogeneous conditions
Undetected sensor faults or deviations lead to inconsistent datasets. Their effects often only become visible later during analysis, when missing or implausible values increase the effort required for data cleaning and interpretation.
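A simple range check at ingestion time is one way to catch such values early. The sketch below is illustrative only: the quantity names and limits in `PLAUSIBLE_RANGES` are hypothetical and are not taken from the AQBIE sensors.

```python
import math

# Hypothetical plausibility ranges; real limits depend on the sensor hardware.
PLAUSIBLE_RANGES = {
    "temperature_c": (-20.0, 60.0),
    "relative_humidity_pct": (0.0, 100.0),
    "co2_ppm": (300.0, 10000.0),
}


def flag_implausible(reading):
    """Return the sorted names of quantities that are missing, non-numeric,
    NaN, or outside their plausible range."""
    flagged = []
    for name, (low, high) in PLAUSIBLE_RANGES.items():
        value = reading.get(name)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            flagged.append(name)  # missing or not a number
        elif math.isnan(value) or not (low <= value <= high):
            flagged.append(name)  # NaN or out of range
    return sorted(flagged)


reading = {"temperature_c": 21.5, "relative_humidity_pct": 130.0}
print(flag_implausible(reading))
```

Flagging a value rather than discarding it keeps the raw data intact and leaves the decision to the later analysis.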
Long-term maintainability and extensibility
Research infrastructures are designed for operation over many years. The architecture therefore needs to be extensible, support redundancy concepts, and handle growing data volumes as well as increasing real-time requirements.
The central question was:
How can distributed sensors be connected in a way that allows data to be collected continuously and synchronously – with reasonable operational effort and with an architecture that remains maintainable over long periods of time?
The solution: An MQTT-based architecture for stable data flows
To operate distributed sensor networks reliably over long periods of time, the project team decided on an MQTT-based architecture designed for continuous data transmission, redundancy, and simple operation. The goal was not only the reliable transmission of individual measurement values, but a structure that ensures synchronization, transparency, and maintainability during ongoing operation.
EMQX was used to implement this architecture. The deciding factors were its clustering capabilities and the simple container-based deployment provided by EMQ, which allowed fast integration into the existing infrastructure.
The custom-developed AQBIE sensors (Air Quality Beacon & Immission Evaluator) communicate via the MQTT protocol with a central broker infrastructure. MQTT was deliberately chosen as a lightweight, event-based protocol to provide stable data flows even with a large number of distributed devices and without permanent on-site access.
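To make the idea of event-based MQTT messages concrete, the following sketch builds a topic and a JSON payload for a single reading. The topic layout `sensors/<site>/<device>/measurements`, the field names, and the identifiers are assumptions for illustration; the source does not describe the actual AQBIE message format.

```python
import json
from datetime import datetime, timezone


def build_message(site, device_id, values, timestamp=None):
    """Return (topic, payload) for one reading.

    Topic layout and payload fields are hypothetical.
    """
    ts = timestamp or datetime.now(timezone.utc)
    topic = f"sensors/{site}/{device_id}/measurements"
    payload = json.dumps(
        {"device": device_id, "time": ts.isoformat(), "values": values},
        sort_keys=True,
    )
    return topic, payload


topic, payload = build_message(
    "household-01",
    "aqbie-0001",
    {"co2_ppm": 650},
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(topic)
print(payload)
```

On a device, the resulting pair would be handed to an MQTT client library for publishing. Putting the site and device into the topic lets subscribers select individual devices or whole sites with wildcard subscriptions.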
The architecture follows a clear principle: central coordination combined with redundancy and spatial decoupling.
Central broker structure with redundancy
An EMQX cluster with two nodes behind a load balancer forms the backbone of data transmission. This structure supports continuous operation even in the event of individual component failures.
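For clients, a node failure behind the load balancer typically shows up as a dropped connection followed by a reconnect. How the AQBIE firmware handles this is not described here; the sketch below shows one common approach, exponential backoff with optional jitter, with all parameter values chosen as assumptions.

```python
import random


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6, jitter=0.0, rng=None):
    """Return the waiting times in seconds before each reconnect attempt.

    The delay grows exponentially up to `cap`; `jitter` adds up to that
    fraction of the delay at random so that many devices do not reconnect
    at the same moment.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * factor ** attempt)
        delay += rng.uniform(0, jitter * delay)
        delays.append(delay)
    return delays


print(backoff_delays())            # deterministic schedule without jitter
print(backoff_delays(jitter=0.5))  # randomized schedule
```

The jitter matters most when many devices lose their connection at once, for example when a broker node restarts.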
Distributed broker structures with bridging
In addition, two further EMQX deployments at separate physical locations are connected in bridge mode. This allows data synchronization and provides additional redundancy – an important aspect for long-term measurements and distributed research infrastructures.
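A bridge usually forwards only those messages whose topics match configured topic filters. The EMQX bridge configuration used in the project is not shown in this article; the sketch below only illustrates the standard MQTT wildcard rules (`+` for one level, `#` for all remaining levels) with hypothetical filters.

```python
def topic_matches(topic_filter, topic):
    """Return True if `topic` matches the MQTT-style `topic_filter`."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    # Wildcards at the first level do not match topics starting with '$'.
    if t_levels[0].startswith("$") and f_levels[0] in ("+", "#"):
        return False
    for i, level in enumerate(f_levels):
        if level == "#":
            # '#' is only valid as the last level and also matches the parent.
            return i == len(f_levels) - 1
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


# Hypothetical filters a bridge might forward to a remote site.
FORWARD_FILTERS = ["sensors/+/+/measurements", "status/#"]


def should_forward(topic):
    """Return True if any configured filter matches the topic."""
    return any(topic_matches(f, topic) for f in FORWARD_FILTERS)


print(should_forward("sensors/household-01/aqbie-0001/measurements"))
print(should_forward("sensors/household-01/aqbie-0001/debug"))
```

Restricting the forwarded topics keeps diagnostic or internal traffic local while measurement data is replicated across sites.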
Unobtrusive operation in private environments
The AQBIE sensors transmit their measurement data continuously and remotely. Physical access to the devices is not required during normal operation, while system states can be monitored centrally.
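Central monitoring of device state can be as simple as tracking when each device last reported. The following sketch assumes a dictionary of last-seen timestamps and a five-minute silence threshold; both are illustrative choices, not details from the project.

```python
from datetime import datetime, timedelta, timezone


def stale_devices(last_seen, now, max_silence):
    """Return the sorted ids of devices whose last message is older than
    `max_silence`."""
    return sorted(dev for dev, ts in last_seen.items() if now - ts > max_silence)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "aqbie-0001": now - timedelta(seconds=30),
    "aqbie-0002": now - timedelta(minutes=12),  # silent for too long
    "aqbie-0003": now - timedelta(minutes=4),
}
silent = stale_devices(last_seen, now, timedelta(minutes=5))
print(silent)
```

A check like this can run periodically and feed an alert, so that a silent device is noticed without anyone visiting the household.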
Data storage and monitoring as an integral part
The collected time series data are stored in TimescaleDB. Grafana dashboards and Prometheus support visualization, system monitoring, and early fault detection, so that anomalies can be identified before they affect later analysis.
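As an illustration of how MQTT messages might be mapped to a time series table, the sketch below flattens a JSON payload into rows for a narrow table layout. The table name `measurements`, its columns, and the payload fields are assumptions; the project's actual schema is not described here. The statements are only built as strings and would be executed with a PostgreSQL driver.

```python
import json
from datetime import datetime

# Hypothetical schema: one row per device, quantity, and point in time.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS measurements (
    time     TIMESTAMPTZ NOT NULL,
    device   TEXT NOT NULL,
    quantity TEXT NOT NULL,
    value    DOUBLE PRECISION
);
"""
# create_hypertable turns the table into a TimescaleDB hypertable
# partitioned by the time column.
CREATE_HYPERTABLE = "SELECT create_hypertable('measurements', 'time', if_not_exists => TRUE);"
INSERT_ROW = "INSERT INTO measurements (time, device, quantity, value) VALUES (%s, %s, %s, %s);"


def to_rows(payload):
    """Flatten one JSON message into (time, device, quantity, value) tuples."""
    msg = json.loads(payload)
    ts = datetime.fromisoformat(msg["time"])
    return [
        (ts, msg["device"], quantity, float(value))
        for quantity, value in sorted(msg["values"].items())
    ]


payload = json.dumps({
    "device": "aqbie-0001",
    "time": "2024-01-01T12:00:00+00:00",
    "values": {"co2_ppm": 650, "temperature_c": 21.5},
})
rows = to_rows(payload)
for row in rows:
    print(row)
```

A narrow layout like this accommodates heterogeneous sensors without schema changes, at the cost of more rows per message.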
A key advantage of the solution is its ease of operation. Thanks to the container-based deployment of EMQX, the architecture can be deployed and extended easily and remains manageable even for scientific teams without deep IT specialization.
The result: More stable operation, reduced manual effort, and new options
With the EMQX-based MQTT infrastructure in place, teams at the Leibniz Institute for Tropospheric Research, in particular within the Department of Atmospheric Microphysics (AMP), were able to reliably collect and monitor continuous real-time data from distributed sensors during ongoing operation. The state of the sensors and of data transmission can be traced transparently.
As a result, deviations become visible earlier and can be classified before they appear as data gaps or inconsistencies in later analysis. The effort required for maintenance, fault analysis, and subsequent data cleaning is reduced because system states and deviations are identifiable at an early stage.
Although the project is still in a pilot phase and no quantified metrics are yet available, a clear operational benefit is already apparent:
- more stable and consistent datasets over long periods of time
- earlier identification of sensor issues during ongoing data collection
- reduced operational effort
- an improved basis for timely analysis and well-founded interpretation of measurement data
In addition, the architecture opens up new options. Functions such as more fine-grained real-time monitoring or the integration of additional sensors and projects become practicable without disproportionately increasing operational complexity. The infrastructure thus evolves from a pure data collection solution into a sustainable operational platform for distributed measurement systems.
Perspective: Standardized data infrastructure for long-term operation
MQTT and the broker architecture in use are now an integral part of current and planned projects at TROPOS. In particular within the European research infrastructure ACTRIS, which is designed for continuous operation over more than ten years, a standardized and scalable data infrastructure plays a central role.
Experience from the EDIAQI project shows that MQTT-based architectures are suitable not only for industrial applications, but also for scientific and societally relevant measurement networks. They provide a robust basis for long-term data collection, facilitate exchange between institutions, and improve access to high-quality environmental and atmospheric data.