


Spark SQL 50x Faster

The Algebraix Query Accelerator (AQA) is an optimization software package that runs alongside Apache Spark to improve SQL query performance. It is designed to work in conjunction with Amazon Web Services and Microsoft Azure.



How it Works

  1. The Expression Translator translates SQL requests into data algebra.
  2. The Algebraic Optimizer adaptively restructures the algebraic version of the SQL query. It consults the Expression Store to identify which parts of the query match stored results from prior queries and which parts of the new query data algebra can resolve.
  3. The Algebraic Optimizer stores these matched results in the Expression Store, effectively allowing queries to become look-ups. It also monitors the contents of the Local Data Store by performing statistical frequency calculations to determine which results are most frequently reused.
  4. The Expression Translator then passes the query, in SQL form, to Spark SQL and Catalyst.
  5. When Catalyst returns the result of the query, the Algebraic Optimizer retrieves its part of the query from the Expression Store, merges it with the data received from Catalyst, and passes the result to the application that made the SQL request.
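The five steps above can be illustrated with a toy Python sketch. It is not the AQA implementation: the class name ToyAccelerator, the demo_backend function, the ROWS data, and the idea of treating a query as a set of "column = value" filters are all assumptions made for illustration. The sketch only shows the general pattern of normalizing a request, reusing stored partial results, sending unmatched parts to a backend engine (standing in for Spark SQL and Catalyst), and merging the pieces.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

Predicate = str  # hypothetical normalized filter, e.g. "region = eu"


def translate(table: str, predicates: List[str]) -> Tuple[str, FrozenSet[Predicate]]:
    """Step 1 (toy): normalize a request so equivalent filters compare equal."""
    return table, frozenset(p.strip().lower() for p in predicates)


class ToyAccelerator:
    """Illustrative only: caches per-filter row-id sets in an expression store."""

    def __init__(self, backend: Callable[[str, Predicate], Set[int]]):
        self.backend = backend  # stand-in for Spark SQL / Catalyst
        self.expression_store: Dict[Tuple[str, Predicate], Set[int]] = {}
        self.backend_calls = 0

    def run(self, table: str, predicates: List[str]) -> Set[int]:
        if not predicates:
            raise ValueError("this toy requires at least one filter")
        tbl, preds = translate(table, predicates)
        parts = []
        for pred in sorted(preds):
            key = (tbl, pred)
            if key not in self.expression_store:
                # Steps 2 and 4: no stored match, so ask the backend engine.
                self.backend_calls += 1
                # Step 3: keep the result so later queries become look-ups.
                self.expression_store[key] = self.backend(tbl, pred)
            parts.append(self.expression_store[key])
        # Step 5: merge stored and freshly computed parts (AND of all filters).
        return set.intersection(*parts)


# Hypothetical in-memory table standing in for data held by the backend.
ROWS = {
    1: {"region": "eu", "year": 2015},
    2: {"region": "eu", "year": 2016},
    3: {"region": "us", "year": 2016},
}


def demo_backend(table: str, predicate: Predicate) -> Set[int]:
    """Evaluate one 'column = value' filter against ROWS."""
    column, value = [s.strip() for s in predicate.split("=")]
    return {rid for rid, row in ROWS.items() if str(row[column]) == value}


acc = ToyAccelerator(demo_backend)
first = acc.run("sales", ["region = eu", "year = 2016"])   # two backend calls
second = acc.run("sales", ["Region = EU", "year = 2015"])  # one backend call
```

In this sketch the second query reuses the stored "region = eu" result and only sends the new year filter to the backend, which is the look-up behavior the steps describe in miniature.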

Note: The Local Data Store holds its data in memory and on disk. Its precise contents vary with the mix of queries AQA handles. Data that turns out to be rarely used is gradually replaced by data that is reused more frequently. When new data is introduced, AQA takes note and, when appropriate, includes it in the result store. The process then repeats, so the SQL environment is continually tuned for performance.
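The replacement behavior described in the note resembles a least-frequently-used policy. The sketch below shows one minimal way such a policy could look in Python. The FrequencyStore class and its capacity parameter are hypothetical, and the page does not say which statistics AQA actually computes, so this is an assumption-laden illustration rather than a description of the product.

```python
from collections import Counter
from typing import Dict, Hashable, Optional


class FrequencyStore:
    """Toy bounded store that evicts the least frequently reused entry."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._data: Dict[Hashable, object] = {}
        self._hits: Counter = Counter()  # reuse count per stored key

    def get(self, key: Hashable) -> Optional[object]:
        """Return the stored value and record the reuse, or None on a miss."""
        if key in self._data:
            self._hits[key] += 1
            return self._data[key]
        return None

    def put(self, key: Hashable, value: object) -> None:
        """Store a value, evicting the coldest entry when the store is full."""
        if key not in self._data and len(self._data) >= self.capacity:
            # Ties go to the entry that was inserted earliest.
            coldest = min(self._data, key=lambda k: self._hits[k])
            del self._data[coldest]
            del self._hits[coldest]
        self._data[key] = value
        self._hits.setdefault(key, 0)


store = FrequencyStore(capacity=2)
store.put("q_a", "result A")
store.put("q_b", "result B")
store.get("q_a")
store.get("q_a")
store.get("q_b")
store.put("q_c", "result C")  # store is full, so the less reused "q_b" is evicted
```

A production store would also have to weigh result size, freshness of the underlying data, and the split between memory and disk, none of which this sketch attempts.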






Installation is Non-Invasive

A set of simple-to-install scripts for Amazon Elastic MapReduce (EMR) makes adding our software to your cluster a breeze.

[Figures: cluster before AQA; cluster after AQA]

Integration is Simple

Once the package is installed, only a single line of code changes in your application.

[Figures: application script before the change; application script after the change]


Our Technology Patents

The Algebraix Technology Platform is based on our fundamental innovation in the field of applied mathematics: the algebra of data. The company is building a portfolio of patents around its technology platform. We currently hold nine U.S. patents and expect to receive dozens more.

U.S. Patents Granted to Date
  • 7613734 Systems and Methods for Providing Data Sets using a Store of Algebraic Relations
  • 7720806 Systems and Methods for Data Manipulation using Multiple Storage Formats
  • 7769754 Systems and Methods for Data Storage and Retrieval using Algebraic Optimization
  • 7797319 Systems and Methods for Data Model Mapping
  • 7865503 Systems and Methods for Data Storage and Retrieval using Virtual Data Sets
  • 7877370 Systems and Methods for Data Storage and Retrieval using Algebraic Relations
  • 8032509 Systems and Methods for Data Storage and Retrieval using Algebraic Relations Composed from Query Language Statements
  • 8380695 Systems and Methods for Data Storage and Retrieval using Algebraic Relations to Optimize Calculations
  • 8583687 Systems and Methods for Indirect Algebraic Partitioning

Experience Data Algebra for Yourself

Introducing the Algebraix Open-Source Library

We now provide a way for users to test-drive data algebra: the Algebraix Library. It is particularly aimed at letting data scientists create applications that are standardized, provably correct, efficient, inclusive, and expressive. This is a free, educational, experiential asset, not a commercial product.

The Algebraix Library (built in Python) is designed to help people learn, use, and improve data algebra. All enhancement submissions will be considered for inclusion in the Algebraix Library. It is available on these publication sites:

  • GitHub, the largest repository hosting site in the world. Algebraix is using GitHub to host all material available to the public, including source code, documentation, and tutorials.
  • Read the Docs, a site that complements GitHub and similar sites. It provides rich documentation functionality, including full-text search and hosting for every version of the documentation.
  • PyPI (Python Package Index), a repository that catalogs open-source Python packages. The public Algebraix Library is available here through pip install or as a direct download.


Put our Query Accelerator to Work

Speak with our experts and see what our software can do for your big data queries.