Storage capacities have increased substantially, but storage error rates have remained essentially unchanged, and they are still high.
Companies store their data across different environments, such as cloud storage providers or private, on-premises systems and devices, often unaware of the real danger of silent data corruption: “undetected data corruption, also known as silent data corruption, results in the most dangerous errors as there is no indication that the data is incorrect.”
In the past, with older storage technologies, data corruption was rarely encountered because the volume of data stored was small. Today, we store vastly more data, from audio and video to numerical data and documents, using private or public social networks, without any idea how safe that data is.
Errors can come from many sources beyond the disk storage itself: loose cables, unreliable power supplies, DMA parity errors, external vibrations and other soft errors. Silent data corruption is difficult to detect and report, and it usually results in “cascading failures, in which the system may run for a period of time with undetected initial error causing increasingly more problems until it is ultimately detected”.
Jeff Bonwick, the technical lead of ZFS, stated that the fast database at Greenplum, a database software company specializing in large-scale data warehousing and analytics, faces silent corruption every 15 minutes. As another example, a real-life study performed by NetApp on more than 1.5 million HDDs over 41 months found more than 400,000 silent data corruptions, of which more than 30,000 were not detected by the hardware RAID controller. Another study, performed by CERN over six months and involving about 97 petabytes of data, found that about 128 megabytes of data became permanently corrupted. (Wikipedia)
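To put the CERN figures in proportion, the short calculation below derives the fraction of data that ended up permanently corrupted. It is only a back-of-the-envelope sketch: it assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes), and the variable names are chosen for illustration.

```python
# Assumption: decimal (SI) units, since the study summary does not say otherwise.
PETABYTE = 10**15
MEGABYTE = 10**6
TERABYTE = 10**12

stored_bytes = 97 * PETABYTE      # about 97 PB examined over six months
corrupted_bytes = 128 * MEGABYTE  # about 128 MB permanently corrupted

# Fraction of all stored bytes that were permanently corrupted.
corruption_rate = corrupted_bytes / stored_bytes

# The same rate expressed as corrupted bytes per terabyte stored.
corrupted_per_tb = corruption_rate * TERABYTE

print(f"{corruption_rate:.2e}")   # 1.32e-09
print(round(corrupted_per_tb))    # 1320
```

The rate looks tiny, but at roughly 1.3 KB of permanently damaged data per terabyte, a large datastore can expect some corruption unless it is actively detected and repaired.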
Kronometrix Data Integrity
Developed with data integrity in mind, Kronometrix uses OpenZFS technology to detect, correct and prevent silent data corruption errors, supporting:
- long-term data storage: large datastore sizes with zero data loss
- data integrity: hierarchical checksums of all data and metadata
- native data compression: supports archiving years of raw data
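The idea behind hierarchical checksums can be sketched in a few lines of Python. This is a simplified illustration and not OpenZFS or Kronometrix code: it assumes SHA-256 as the checksum, a binary tree of checksums, and a hypothetical mirror copy to repair from. All function names are made up for the example.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Checksum of one block (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return checksum levels: leaf checksums first, single root last."""
    level = [checksum(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        # Each parent checksum covers the checksums of its (up to two) children.
        level = [checksum(b"".join(level[i:i + 2]))
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def find_corrupt_blocks(blocks, levels):
    """Indexes of blocks whose content no longer matches the stored checksum."""
    leaves = levels[0]
    return [i for i, b in enumerate(blocks) if checksum(b) != leaves[i]]


def heal(blocks, mirror, levels):
    """Replace corrupt blocks with mirror blocks that still verify."""
    leaves = levels[0]
    repaired = list(blocks)
    for i in find_corrupt_blocks(blocks, levels):
        if checksum(mirror[i]) == leaves[i]:
            repaired[i] = mirror[i]
    return repaired


blocks = [b"sensor-a", b"sensor-b", b"sensor-c", b"sensor-d"]
levels = build_tree(blocks)
root = levels[-1][0]

# Simulate silent corruption: one block changes with no error reported.
damaged = list(blocks)
damaged[2] = b"sensor-C"

print(build_tree(damaged)[-1][0] != root)     # True: the root no longer matches
print(find_corrupt_blocks(damaged, levels))   # [2]
print(heal(damaged, blocks, levels) == blocks)  # True: repaired from the mirror
```

Because every parent checksum covers its children, a single changed block alters the root checksum, so the damage cannot go unnoticed; with a redundant copy available, the bad block can then be replaced by one that still verifies.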
Kronometrix connects to a wide variety of data sources, from IoT devices and enterprise ICT systems to weather and environmental sensors. In addition to multifaceted data ingress, the distributed data fabric provides high-speed transport for real-time data consolidation, analysis and visualization.
Built as a ready-to-use solution, Kronometrix contains all the software components needed, from data collection to analysis and visualization, on top of core computing capabilities:
- multi-processor configurations
- in-memory processing, supporting RAM or flash memory
- integrated storage and network capabilities