Next storage engine sets new speed record, capturing 4.6 million emails, PDFs, and Word documents per hour.
When we in Software Engineering set out to design a new unified storage engine for Multi-Support Next, we did so with five primary concerns:
This post addresses the last three.
The original Next storage engine was based on traditional RDBMS (relational database management system) technology, and for that reason DB2, MySQL, Oracle, and SQL Server were all considered candidates for the new unified storage engine.
Today we are reaping immense benefits from our decision not to go with a general-purpose RDBMS and instead to base our storage engine on a specialized, document-centric design inspired by the Haystack technology that powers Facebook®’s impressive photo vaults.
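To give a feel for what "document centric" can mean, here is a minimal sketch of the general idea publicly described for Haystack: many small objects are appended to one large file, and an in-memory index maps each object to its offset so that a read needs only one seek. This is an illustration only, not the Next implementation; the class name `AppendOnlyStore`, its methods, and the file name `volume.dat` are hypothetical.

```python
import os
import tempfile


class AppendOnlyStore:
    """Hypothetical sketch: one append-only volume file plus an in-memory index."""

    def __init__(self, path):
        # "a+b" lets us append new documents and read existing ones back.
        self._file = open(path, "a+b")
        self._index = {}  # doc_id -> (offset, length)

    def put(self, doc_id, payload):
        """Append the document bytes and remember where they landed."""
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(payload)
        self._file.flush()
        self._index[doc_id] = (offset, len(payload))

    def get(self, doc_id):
        """Look up the offset in memory, then read the bytes with one seek."""
        offset, length = self._index[doc_id]
        self._file.seek(offset)
        return self._file.read(length)

    def close(self):
        self._file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        store = AppendOnlyStore(os.path.join(folder, "volume.dat"))
        store.put("mail-1", b"Subject: hello")
        store.put("pdf-1", b"%PDF-1.4 ...")
        print(store.get("mail-1"))
        store.close()
```

Compared with one row per document in a general-purpose database, a layout like this keeps writes sequential and per-document bookkeeping small. A real engine also has to handle deletes, crash recovery, and an index that survives restarts, none of which this sketch attempts.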
With the initial release of Next for Windows we were proud to provide documentation that Next captured 75,000 documents per hour using a single commodity Intel-based server running Windows Server 2012.
Rock-solid performance, matching most other solutions on the market.
Now, many, many hours of optimization later, we are releasing the performance benchmark for Next release 8 update level 22.
Benchmark from Next deployed on a single Windows Server 2008 R2 server
with 2x Intel® Xeon® Processor E5607 (2.26 GHz), 24 GB RAM (16 GB reserved for the Next process),
12x 1 TB SATA 7200 rpm data disks (arranged in 4 RAID 5 arrays) and 2x 400 GB SSD disks for the index.
The graph shows the result of a test run with 175,000,000 documents (Word, PDF, and email), all captured and stored in Next, each with a single logical archiving. It documents very consistent performance throughout the entire run, with an average of 4.6 million documents per hour.
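To put those numbers in perspective, the short calculation below derives a few figures that are implied by, but not stated in, the benchmark: the per-second rate, the approximate length of the run, and the ratio to the 75,000 documents per hour quoted for the initial release. The ratio is only indicative, since the two measurements were taken on different hardware and Windows versions.

```python
# Figures quoted in this post.
TOTAL_DOCUMENTS = 175_000_000
DOCS_PER_HOUR_NOW = 4_600_000
DOCS_PER_HOUR_INITIAL = 75_000


def docs_per_second(docs_per_hour):
    """Convert an hourly capture rate to a per-second rate."""
    return docs_per_hour / 3600


def run_hours(total_documents, docs_per_hour):
    """Approximate run length, assuming the average rate holds throughout."""
    return total_documents / docs_per_hour


def speedup(new_rate, old_rate):
    """Ratio between two capture rates."""
    return new_rate / old_rate


print(f"{docs_per_second(DOCS_PER_HOUR_NOW):.0f} documents per second")
print(f"{run_hours(TOTAL_DOCUMENTS, DOCS_PER_HOUR_NOW):.1f} hours for the full run")
print(f"{speedup(DOCS_PER_HOUR_NOW, DOCS_PER_HOUR_INITIAL):.1f}x the initial release figure")
```

In round numbers that is close to 1,300 documents every second, sustained for roughly a day and a half.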
Interstellar performance to outshine any other solution in the market.
We believe that by now we have come as far as we reasonably can with regard to optimizing the performance of the Next storage engine in a single-server setup. Clients in need of even greater capture capacity will need to deploy Next on faster hardware or, more likely, in a multi-server setup. In a future post we will elaborate on these capabilities.
The vast majority of our current and future clients have requirements far more modest than the capabilities documented here. They too will benefit, as the ability to run more moderate workloads on very small configurations is an equally important result of our design and optimization efforts.
In a future post I will address exactly how small a configuration you need to run Next. And trust me, it’s equally impressive.