Empirical is a language for analyzing tabular data.
Currently in private beta. To request a demo, email
Frequently Asked Questions
How does this compare to SQL?
Empirical is a programming language. It has variables, functions, types, nested scope, arbitrary expressions, etc. It has scalars, vectors, and soon dictionaries. The language-integrated queries are just one part of it.
Empirical is more like R, pandas, q/kdb+, and other column-oriented analytics APIs.
What is column-oriented analytics?
Traditional databases handle queries and aggregations by iterating over rows, which is slow because it makes poor use of the CPU’s cache: every row drags in fields the query does not need. In Empirical, each column is a vector; reading columns is often two orders of magnitude faster per core than the row-oriented approach. (It’s the performance equivalent of scanning through an array in C++ vs Python.)
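To make the layout difference concrete, here is a minimal sketch in Python (not Empirical code). The table contents and the names rows, columns, row_total, and col_total are made up for illustration; the point is only that a columnar aggregation reads one contiguous vector, while a row-wise aggregation visits every record.

```python
from array import array

# Row-oriented layout: one record per row. Summing a single field
# still visits every record, including the fields we do not need.
rows = [
    {"symbol": "AAPL", "price": 101.5, "size": 200},
    {"symbol": "MSFT", "price": 55.25, "size": 100},
    {"symbol": "AAPL", "price": 102.0, "size": 300},
]
row_total = sum(r["size"] for r in rows)

# Column-oriented layout: one typed, contiguous vector per column.
# Summing "size" reads only that vector.
columns = {
    "symbol": ["AAPL", "MSFT", "AAPL"],
    "price": array("d", [101.5, 55.25, 102.0]),
    "size": array("q", [200, 100, 300]),
}
col_total = sum(columns["size"])

print(row_total, col_total)
```

Both layouts give the same answer; the difference is how much memory has to be touched to get it. This sketch does not attempt to measure the speedup quoted above.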
Are there drawbacks to column-oriented?
Writing data is the Achilles heel of the column-oriented data layout. Write-heavy applications should use a traditional database. But once the application is dominated by reads, the columnar approach wins out dramatically. Also, time-series applications (which Empirical targets) are generally append-only; in practice time-series writes are almost never done in the middle of the data, so users don’t experience this problem anyway.
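The write cost can be sketched the same way, again in Python rather than Empirical. The helpers append_row and insert_row are hypothetical names for this illustration. Appending adds one value to the end of each column, while inserting in the middle forces every column to shift all of the values after the insertion point.

```python
from array import array

# A tiny columnar table: one typed vector per column.
table = {
    "time": array("q", [1, 2, 4]),
    "price": array("d", [10.0, 10.5, 11.0]),
}

def append_row(table, time, price):
    """Append-only write: add one value to the end of each column."""
    table["time"].append(time)
    table["price"].append(price)

def insert_row(table, index, time, price):
    """Mid-table write: each column shifts every value after `index`."""
    table["time"].insert(index, time)
    table["price"].insert(index, price)

append_row(table, 5, 11.25)     # the typical time-series write
insert_row(table, 2, 3, 10.75)  # the rare, expensive case

print(list(table["time"]))
print(list(table["price"]))
```

With many columns and many rows, the shifting in insert_row is repeated for every column, which is why append-only workloads suit this layout.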
What is unique about Empirical?
It is statically typed. When reading from a known input source (such as in the REPL), Empirical will infer the table’s type by sampling the input. It feels dynamic, but because the types are static, Empirical can prevent many common errors in user code.
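The idea of inferring a table's type from a sample can be sketched in Python. This is an illustration of the general technique under simple assumptions, not Empirical's actual algorithm: the functions infer_type and infer_schema, the three type names, and the sample_size parameter are all hypothetical.

```python
import csv
import io

def infer_type(values):
    """Pick the narrowest type that parses every sampled value."""
    for name, parse in (("int", int), ("float", float)):
        try:
            for v in values:
                parse(v)
            return name
        except ValueError:
            continue
    return "str"

def infer_schema(text, sample_size=100):
    """Infer a column-name -> type-name mapping from the first rows of CSV text."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # Read at most `sample_size` rows rather than the whole input.
    sample = [row for _, row in zip(range(sample_size), reader)]
    return {
        name: infer_type([row[i] for row in sample])
        for i, name in enumerate(header)
    }

data = "symbol,price,size\nAAPL,101.5,200\nMSFT,55.25,100\n"
schema = infer_schema(data)
print(schema)
```

Once a schema like this is fixed, a reference to a column that does not exist, or arithmetic on a string column, can be rejected before any query runs. That is the kind of error a static type makes catchable.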
What is not in scope?
Empirical will never have transactions; it’s not a storage engine. It just performs analytics.