A question that has come up recently is what the difference is between a time-series database and a real-time database. This post walks through the main differences.
Traditional Industrial Real-time Database
Traditional industrial control places heavy demands on real-time data processing. This is especially true in the process industry, where each production stage is tightly controlled and the monitored data has to reflect the real-time state of the whole system. Processing real-time data is therefore central to these systems.
Difference between Traditional Industrial Real-time Database and Time-series Database
First, although both kinds of database emphasize high-speed writes, their capabilities differ. Traditional industrial real-time databases generally support more than 2 million data points on a single node, 5,000 concurrent users, and a write rate of more than one million records per second. For time-series databases, a figure of around 10 million is the current single-node performance ceiling. In optimizing for this workload, time-series databases (TSDBs) prioritize writes over reads, balance data compression against read and write amplification, and mostly use columnar storage, adopting newer techniques from the wider software industry.
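To make the columnar storage and compression point concrete, here is a minimal sketch of one common idea: store timestamps and values as separate columns, then delta-encode the timestamp column so that regularly sampled data becomes a run of small, repeated numbers that compress well. The function names and sample data are illustrative assumptions, not the format of any particular database.

```python
from typing import List, Tuple


def to_columns(rows: List[Tuple[int, float]]) -> Tuple[List[int], List[float]]:
    """Split row-oriented (timestamp, value) points into two columns."""
    timestamps = [t for t, _ in rows]
    values = [v for _, v in rows]
    return timestamps, values


def delta_encode(timestamps: List[int]) -> List[int]:
    """Keep the first timestamp, then store only the gap to the previous one."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        encoded.append(cur - prev)
    return encoded


def delta_decode(encoded: List[int]) -> List[int]:
    """Rebuild the original timestamps by accumulating the stored gaps."""
    decoded: List[int] = []
    total = 0
    for i, d in enumerate(encoded):
        total = d if i == 0 else total + d
        decoded.append(total)
    return decoded


# Hypothetical sensor readings sampled every 10 time units.
rows = [(1000, 20.5), (1010, 20.6), (1020, 20.6), (1030, 20.7)]
ts, vals = to_columns(rows)
print(delta_encode(ts))
```

Real engines typically layer further encodings and a general-purpose compressor on top of this, which is where the trade-off between compression ratio and read-write amplification comes from.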
Second, the two differ in target scenarios and ecosystem tooling. A traditional industrial real-time database is really a complete solution, from data acquisition through to visualization. Its toolkits for industrial scenarios are richer: support for hundreds of industrial protocols, data models for various industrial scenarios, and interfaces such as OPC (a set of standards for communication between control systems and data sources). Time-series databases, by contrast, are used not only for industrial monitoring but also in DevOps, IoT, finance, and other scenarios.
Third, industrial real-time databases run into scalability bottlenecks. Most traditional real-time databases use an active-standby deployment architecture, which usually calls for high-end hardware to squeeze the most performance out of a single machine. It also places heavy demands on software stability, so high-quality code is needed to keep the system running reliably. The distributed architecture of time-series databases, in contrast, lets the system scale out horizontally with little effort. The database no longer depends on expensive hardware and storage devices, and it can use the inherent advantages of a cluster to achieve high availability without a single-point bottleneck or failure. It can run on ordinary x86 servers or even virtual machines, which greatly reduces the cost of use.
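As a rough illustration of horizontal scaling, the sketch below assigns each series to one node of a cluster by hashing its key, so that writes spread across ordinary machines instead of landing on a single high-end server. The node names, series keys, and the `shard_for` helper are hypothetical; production systems generally use more elaborate placement schemes and add replication for high availability.

```python
import hashlib
from typing import List


def shard_for(series_key: str, nodes: List[str]) -> str:
    """Pick the node responsible for a series by hashing its key."""
    if not nodes:
        raise ValueError("at least one node is required")
    digest = hashlib.sha256(series_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then map it to a node.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical three-node cluster and series keys.
nodes = ["node-a", "node-b", "node-c"]
for key in ["plant1.pump7.pressure", "plant1.pump7.temperature", "plant2.valve3.flow"]:
    print(key, "->", shard_for(key, nodes))
```

Plain modulo hashing like this reassigns most keys whenever a node is added or removed, which is why schemes such as consistent hashing are often preferred when the cluster size changes.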
Fourth, the costs differ considerably. Traditional industrial real-time database solutions are generally so expensive that only large enterprises can afford them. For example, the PI (Plant Information) System from the US company OSIsoft costs $6,000 per interface, and the complete product costs millions of dollars. In contrast, many time-series databases are open source and free to use, which makes it much easier to get started.
Finally, time-series databases are better suited to the cloud. Traditional industrial real-time databases are typically deployed on-premises: the machines, software, and follow-up services are expensive, and professional technicians are needed to maintain the system. As networking and cloud computing technologies mature and their performance and security keep improving, cloud-enabled time-series databases are more in line with the general trend.