Frequently Asked Questions
What does Algebraix Data do?
We build software that speeds up queries and reduces computing costs in big data environments. Our Algebraix Inside™ software lets companies scale query performance, and it can be embedded in a range of products, from BI tools to databases. The first version of Algebraix Inside™ works with Apache Spark, AWS, and Azure. We will add other versions in the future.
How does the query accelerator work?
When users submit queries, Algebraix Inside™ registers them and translates them into Data Algebra™ expressions. Over time, Algebraix Inside™ finds overlapping or similar pieces of queries that can be pre-computed and reused to improve the performance of other queries that have not yet been run on Spark. The software essentially becomes a “computational cache” that saves valuable CPU cycles by avoiding repeated computation.
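As a rough illustration only, the idea of a computational cache can be sketched in a few lines of Python. This is not the Algebraix Inside™ implementation and it does not use Data Algebra™. The tuple-based expression format, the ComputationalCache class, the min_uses threshold, and the toy TABLES and PREDICATES data are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical expression format (an assumption for this sketch):
#   ("scan", table_name) | ("filter", child, predicate_name) | ("count", child)

def subexpressions(expr):
    """Yield expr itself and every nested sub-expression."""
    yield expr
    for part in expr[1:]:
        if isinstance(part, tuple):
            yield from subexpressions(part)

class ComputationalCache:
    def __init__(self, min_uses=2):
        self.min_uses = min_uses   # cache pieces shared by at least this many queries
        self.seen = Counter()      # sub-expression -> number of registered queries using it
        self.results = {}          # sub-expression -> stored result
        self.evaluations = 0       # operators actually executed

    def register(self, expr):
        # Count each distinct sub-expression once per registered query.
        for sub in set(subexpressions(expr)):
            self.seen[sub] += 1

    def evaluate(self, expr, compute):
        if expr in self.results:   # reuse a stored result instead of recomputing
            return self.results[expr]
        args = [self.evaluate(p, compute) if isinstance(p, tuple) else p
                for p in expr[1:]]
        self.evaluations += 1
        value = compute(expr[0], args)
        if self.seen[expr] >= self.min_uses:
            self.results[expr] = value
        return value

# Toy data and operators, invented for the example.
TABLES = {"orders": [{"id": 1, "total": 50}, {"id": 2, "total": 500}]}
PREDICATES = {"big": lambda row: row["total"] > 100}

def compute(op, args):
    if op == "scan":
        return TABLES[args[0]]
    if op == "filter":
        rows, name = args
        return [r for r in rows if PREDICATES[name](r)]
    if op == "count":
        return len(args[0])
    raise ValueError(f"unknown operator: {op}")

big_orders = ("filter", ("scan", "orders"), "big")
q1 = ("count", big_orders)   # how many big orders?
q2 = big_orders              # list the big orders

cache = ComputationalCache()
cache.register(q1)
cache.register(q2)
count_result = cache.evaluate(q1, compute)   # runs scan, filter, count
rows_result = cache.evaluate(q2, compute)    # answered from the cache
```

In this sketch the second query triggers no new work, because the filter it needs was shared with the first query and its result was kept. A real system would also have to decide when stored results become stale and how much to store, which this example ignores.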
What is the problem with the current big data market?
Companies have rapidly adopted open source technologies like Hadoop and Spark to save money. But these open source platforms are immature and lack functionality found in mature relational database technologies from the likes of IBM, Oracle, Teradata, and Microsoft. They have immature optimizers, lack workload management and priority scheduling tools, and offer little if any advanced indexing capability. The result is poor, unpredictable SQL performance, especially on complex, iterative queries, and poor user concurrency. Companies are therefore forced to scale performance with expensive hardware, and to work around poor user concurrency they are obliged to run multiple clusters in the cloud to serve multiple user groups.
What are the alternatives?
Multiple companies are attempting to make Spark SQL queries run faster, but their approaches are more complicated than Algebraix Inside™. Some require you to implement adjacent data stores and redirect processing work away from Spark. Others require complex, manual tuning and indexing. Still others require you to purchase expensive add-on hardware and memory. All are more complex and costly to implement than Algebraix Inside™. In many cases, our product can complement these other approaches.
How is Algebraix Inside™ implemented?
Product and service vendors can adopt Algebraix Inside™ easily. A proof of concept can be performed with the assistance of one of our consultants. Normally, Algebraix Inside™ can be implemented in a way that isolates its activity from the product or environment it complements.
How does Algebraix Inside™ perform?
On a variety of analytics-style queries from the industry-standard TPC-H benchmark framework, Algebraix Inside™ automatically and transparently applies optimization techniques that speed up a wide variety of the queries. Typical speedups range from 3X to 10X on select queries, and some exceed 100X. Of course, actual performance will vary depending on the data demographics and query profiles.