Top Columnar Databases for Startups in 2024

Find and compare the best Columnar Databases for Startups in 2024

Use the comparison tool below to compare the top Columnar Databases for Startups on the market. You can filter results by user reviews, pricing, features, platform, region, support options, integrations, and more.

1

Google Cloud BigQuery

Google
$0.04 per slot hour

1,686 Ratings

Use ANSI SQL to analyze petabytes of data at high speed with no operational overhead. Google cites a 26%-34% lower three-year TCO than cloud-based data warehouse alternatives. BigQuery is a secure platform that scales with you, and its multi-cloud analytics let you gain insights from all types of data. You can query streaming data in real time to get current information on all your business processes. Built-in machine learning lets you predict business outcomes without moving data. With a few clicks, you can securely access and share analytical insights within your organization, and you can build dashboards and reports with popular business intelligence tools out of the box. BigQuery's security, governance, and reliability controls provide high availability and a 99.99% uptime SLA. Data is encrypted by default, and customer-managed encryption keys are supported.
2

Sadas Engine

Sadas

7 Ratings

Sadas Engine is a columnar database management system for cloud and on-premises deployments, which the vendor describes as the fastest available. It stores, manages, and analyzes data for BI, data warehouse (DWH), and data analytics workloads, turning data into information. The vendor claims it is 100 times faster than transactional DBMSs and can run searches over large volumes of data spanning more than 10 years.
3

Apache Cassandra

Apache Software Foundation

1 Rating

The Apache Cassandra database provides high availability and scalability without compromising performance. Linear scalability and proven fault tolerance on commodity hardware or cloud infrastructure make it well suited to mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, which gives your users lower latency and gives you the peace of mind that you can survive regional outages.
4

ClickHouse

ClickHouse

1 Rating

ClickHouse is a fast, easy-to-use, open-source OLAP database management system. It is column-oriented and generates real-time analytical reports from SQL queries. The vendor states that its performance exceeds that of comparable column-oriented database management systems on the market: it processes hundreds of millions to more than a billion rows, and tens of gigabytes of data, per server per second. ClickHouse uses all available hardware to process each query as quickly as possible. Peak processing speed for a single query is more than 2 terabytes per second (after decompression, counting only the columns used). To reduce latency, reads in distributed setups are automatically balanced among healthy replicas. ClickHouse supports multi-master asynchronous replication and can be deployed across multiple datacenters. All nodes are equal, which avoids single points of failure.
5

Snowflake

Snowflake
$40.00 per month

4 Ratings

Snowflake is a cloud data platform that gives you access to any data you need with near-unlimited scalability. All your data is available to you, with the performance and concurrency your organization requires. You can share and consume shared data across your organization to collaborate on your most difficult business problems. Data professionals can work together to deliver integrated data solutions quickly from anywhere in the organization, which raises productivity and shortens time to value. Technology partners and system integrators can help you deploy Snowflake, for example when you are moving data into it.
6

Rockset

Rockset
Free

Real-time analytics on raw data. Live ingest from S3, DynamoDB, and more. Raw data can be queried as SQL tables, so you can build data-driven apps and live dashboards in minutes. Rockset is a serverless search and analytics engine that powers real-time applications and live dashboards. You can work directly with raw data such as JSON, XML, and CSV. Rockset imports data from real-time streams, data lakes, data warehouses, and databases without requiring you to build pipelines. It syncs new data as it arrives in your data sources, with no need for a fixed schema. You can use familiar SQL, including filters, joins, and aggregations. Rockset automatically indexes every field in your data, which keeps queries fast enough to power your apps, microservices, and live dashboards. Scale without worrying about servers, shards, or pagers.
7

Amazon Redshift

Amazon
$0.25 per hour

According to Amazon, more customers choose Amazon Redshift than any other cloud data warehouse. Redshift powers analytic workloads for Fortune 500 companies, startups, and everything in between. Companies such as Lyft have grown with Redshift from startups into multi-billion-dollar enterprises. Amazon also says that no other data warehouse makes it as easy to gain new insights from all of your data. Redshift lets you query petabytes of structured and semi-structured data across your operational database, data warehouse, and data lake using standard SQL. It also lets you save query results back to your S3 data lake in open formats such as Apache Parquet, so you can analyze them further with other analytics services like Amazon EMR and Amazon Athena. Amazon describes Redshift as the fastest cloud data warehouse and says it gets faster each year. For performance-intensive workloads, the RA3 instances are claimed to deliver up to 3x the performance of any other cloud data warehouse.
8

Querona

YouNeedIT

We make BI and Big Data analytics easier and more efficient. Our goal is to empower business users and to make always-busy business and BI specialists more independent when solving data-driven business problems. Querona is for anyone who has been frustrated by a lack of data, slow or tedious report generation, or a long queue for their BI specialist. It has a built-in Big Data engine that handles growing data volumes. Repeatable queries can be stored and calculated in advance, and Querona automatically suggests query improvements, which makes optimization easier. Querona gives data scientists and business analysts self-service: they can quickly create and prototype data models, add data sources, optimize queries, and dig into raw data, with less reliance on IT. Users can access live data regardless of where it is stored, and Querona can cache data when databases are too busy to query live.
9

CrateDB

CrateDB

The enterprise database for time series, documents, and vectors. Store any type of data and combine the simplicity and scalability of NoSQL with SQL. CrateDB is a distributed database that runs queries in milliseconds regardless of the complexity, volume, and velocity of the data.
10

Vertica

OpenText

The Unified Analytics Warehouse. Vertica positions itself as the place for high-performing analytics and machine learning at large scale. Tech research analysts see new leaders emerging in the effort to deliver game-changing big data analytics, and Vertica empowers data-driven companies to make the most of their analytics initiatives. It offers advanced time-series, geospatial, and machine learning capabilities, as well as data lake integration, user-defined extensions, a cloud-optimized architecture, and more. Vertica's Under the Hood webcast series, delivered by Vertica engineers and technical experts, lets you dive into the features that make it what the vendor calls the most scalable advanced analytical database on the market. Vertica supports data-driven disruptors around the globe in their pursuit of industry and business transformation.
11

Google Cloud Bigtable

Google

Google Cloud Bigtable is a fully managed, scalable NoSQL database service for large operational and analytical workloads. Its storage engine grows with your data, from your first gigabyte to petabyte scale, for low-latency applications and high-throughput data analysis. Seamless scaling and replication: you can start with a single cluster node and scale to hundreds of nodes to support peak demand. Replication adds high availability and workload isolation for live-serving apps. Simple and integrated: Bigtable is a fully managed service that integrates easily with big data tools such as Dataflow, Hadoop, and Dataproc. Support for the open-source HBase API standard makes it easy for development teams to get started.
12

Greenplum

Greenplum Database

Greenplum Database® is an advanced, fully featured, open-source data warehouse. It provides powerful and fast analytics on petabyte-scale data volumes. Designed specifically for big data analytics, it is powered by what the project describes as the world's most advanced cost-based query optimizer, which delivers high analytical query performance on large data volumes. Greenplum Database® is released under the Apache 2 license. The project thanks all of its community contributors and welcomes new contributions, no matter how small. Greenplum is an open-source, massively parallel data platform for analytics, machine learning, and AI. It lets you rapidly create and deploy models for complex applications in cybersecurity, predictive maintenance, risk management, fraud detection, and other areas. It is available as a fully integrated, open-source analytics platform.
13

Apache Druid

Druid

Apache Druid is an open-source distributed data store. Its core design combines ideas from data warehouses and time-series databases to create a high-performance real-time analytics database for a broad range of use cases. Druid merges key characteristics of each of these systems into its ingestion layer, storage format, querying layer, and core architecture. Druid stores and compresses each column individually, so it only needs to read the columns required by a particular query, which supports fast scans, rankings, and groupBys. It builds inverted indexes for string values to enable fast search and filtering. Out-of-the-box connectors are available for Apache Kafka, HDFS, AWS S3, stream processors, and more. Druid partitions data by time, so time-based queries are significantly faster than in traditional databases. It automatically rebalances as you add or remove servers, and its fault-tolerant architecture routes around server failures.
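
To illustrate the column-oriented storage and inverted indexes described for Druid, here is a minimal Python sketch. It is not Druid's storage format or API; the sample rows, the field names (`ts`, `country`, `clicks`), and the helper functions are all hypothetical and chosen only to show why a query touches just the columns it needs.

```python
from collections import defaultdict

# Hypothetical rows; the field names are made up for this illustration.
rows = [
    {"ts": 1, "country": "US", "clicks": 3},
    {"ts": 2, "country": "DE", "clicks": 5},
    {"ts": 3, "country": "US", "clicks": 7},
]

# Column-oriented layout: one list per column instead of one record per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def build_inverted_index(values):
    """Map each distinct value to the list of row positions where it occurs."""
    index = defaultdict(list)
    for pos, value in enumerate(values):
        index[value].append(pos)
    return dict(index)


country_index = build_inverted_index(columns["country"])


def sum_clicks_for(country):
    """Sum clicks for one country using only the index and the clicks column."""
    positions = country_index.get(country, [])
    return sum(columns["clicks"][pos] for pos in positions)


print(sum_clicks_for("US"))
```

The `ts` column is never read by `sum_clicks_for`, which is the point of storing columns separately.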
14

DataStax

DataStax

The open, multi-cloud stack for modern data apps, built on open-source Apache Cassandra™. Global scale and 100% uptime without vendor lock-in. You can deploy on multi-cloud, on-prem, open-source, and Kubernetes. Elastic, pay-as-you-go pricing lowers TCO. Stargate APIs for NoSQL, reactive, JSON, and REST let you build faster. Avoid the complexity of multiple OSS projects and APIs that don't scale. It is suited to commerce, mobile, and AI/ML. Start building modern data applications with Astra, a database-as-a-service powered by Apache Cassandra™. Build richly interactive, elastic, viral-ready apps using REST, GraphQL, and JSON. Astra is a pay-as-you-go Apache Cassandra DBaaS that scales easily and affordably.
15

MariaDB

MariaDB

MariaDB Platform is an enterprise-level open-source database solution. It supports transactional, analytical, and hybrid workloads, as well as relational and JSON data models. It scales from standalone databases and data warehouses to fully distributed SQL that can execute millions of transactions per second and run interactive, ad hoc analytics on billions of rows. MariaDB can be deployed on-prem on commodity hardware. It is also available on all major public cloud providers and through MariaDB SkySQL, a fully managed cloud database. More information is available at MariaDB.com.
16

kdb+

KX Systems

A cross-platform, high-performance historical time-series database featuring an in-memory compute engine, a real-time stream processor, and an expressive query and programming language called q. kdb+ is the engine behind the kdb Insights portfolio and KDB.AI. Together, they deliver time-oriented data insights and generative AI capabilities to the world's largest enterprise organizations. According to independent benchmarking cited by the vendor, kdb+ is the fastest in-memory columnar analytics database. It is aimed at businesses that operate in the most demanding data environments, and it helps them navigate rapidly changing data by improving their decision-making processes.
17

MonetDB

MonetDB

Choose from a wide range of SQL features to build applications ranging from pure analytics to hybrid transactional/analytical processing. MonetDB returns query results in seconds or less, which helps when you are exploring your data and need to work efficiently. When you need specialised functions, you can (re)use your own code: hooks let you add user-defined functions written in SQL, Python, R, or C/C++. The MonetDB community spans 130+ countries and includes students, teachers, researchers, and small businesses, and new members are welcome. MonetDB's simple setup gets your DBMS up and running quickly.
18

Apache HBase

The Apache Software Foundation

Use Apache HBase™ when you need random, real-time read/write access to your Big Data. The project's goal is to host very large tables, billions of rows by millions of columns, on top of clusters of commodity hardware.
19

Azure Table Storage

Microsoft

Azure Table storage stores petabytes of semi-structured data while keeping costs down. Unlike many cloud-based or on-premises data stores, Table storage is built to scale up, and availability is not a concern: with geo-redundant storage, data is replicated three times within one region and three more times in another region hundreds of miles away. Table storage suits flexible datasets such as web app user data, address books, device information, and other metadata. You can build cloud applications without locking the data model to a specific schema. Because different rows in the same table can have different structures, you can evolve your application and table schema without taking anything offline. Table storage uses a strong consistency model.
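
The idea that rows in the same table can have different structures can be sketched with a toy in-memory table in Python. This does not use the Azure SDK and is not how the service is implemented; the `upsert` and `property_names` helpers, the two-part key, and the sample entities are assumptions made only for illustration.

```python
# Toy in-memory "table": each entity is a dict of properties, addressed by a
# two-part key. The key layout and helper names are illustrative assumptions.
table = {}


def upsert(partition, row, **properties):
    """Insert or replace an entity; no schema is enforced on its properties."""
    table[(partition, row)] = dict(properties)


def property_names(partition, row):
    """Return the sorted property names stored for one entity."""
    return sorted(table[(partition, row)])


# Two entities in the same table with completely different shapes.
upsert("users", "u1", name="Ada", email="ada@example.com")
upsert("devices", "d7", model="sensor-x", firmware="1.2", battery=87)

# Evolving the "schema" is just writing an entity with an extra property.
upsert("users", "u2", name="Lin", email="lin@example.com", locale="en-GB")

print(property_names("users", "u1"))
print(property_names("users", "u2"))
```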
20

Apache Kudu

The Apache Software Foundation

A Kudu cluster stores tables that look just like the tables in relational (SQL) databases. A table can be as simple as a single binary key and value, or it can have a multitude of strongly typed attributes. As in SQL, every table has a primary key made up of one or more columns. This could be a single column, such as a unique user ID, or a compound key, such as a (host, metric, timestamp) tuple for a machine time-series database. Rows can be efficiently read, updated, or deleted by their primary key. Kudu's simple data model makes it easy to port legacy applications and build new ones. Tables are self-describing, so you can analyze them with standard tools such as Spark or SQL engines. Kudu's APIs were designed to be simple to use.
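
The compound primary key described for Kudu can be sketched in a few lines of Python. This is a toy in-memory model, not Kudu's client API; the `insert`, `read`, `update`, and `delete` helpers and the sample host and metric names are hypothetical.

```python
# Toy in-memory table keyed by a compound primary key, here the
# (host, metric, timestamp) tuple from the text. Not Kudu's client API.
table = {}


def insert(host, metric, timestamp, value):
    key = (host, metric, timestamp)
    if key in table:
        raise KeyError(f"duplicate primary key: {key}")
    table[key] = value


def read(host, metric, timestamp):
    """Return the value for a primary key, or None if no such row exists."""
    return table.get((host, metric, timestamp))


def update(host, metric, timestamp, value):
    key = (host, metric, timestamp)
    if key not in table:
        raise KeyError(f"no row with primary key: {key}")
    table[key] = value


def delete(host, metric, timestamp):
    """Remove a row by primary key; deleting a missing row is a no-op."""
    table.pop((host, metric, timestamp), None)


insert("web-1", "cpu", 1000, 0.42)
insert("web-1", "cpu", 1060, 0.55)
update("web-1", "cpu", 1060, 0.58)
delete("web-1", "cpu", 1000)
print(read("web-1", "cpu", 1060))
```

Every operation addresses exactly one row through the full key tuple, which is what makes reads, updates, and deletes by primary key cheap.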
21

Apache Parquet

The Apache Software Foundation

Parquet was created to make the benefits of compressed, columnar data representation available to any project in the Hadoop ecosystem. It was built from the ground up with complex nested data structures in mind and uses the record shredding and assembly algorithm described in the Dremel paper, an approach its authors consider superior to simply flattening nested namespaces. Parquet is designed to support efficient compression and encoding schemes. Multiple projects have demonstrated the performance impact of applying the right compression and encoding scheme to the data. Parquet allows compression schemes to be specified per column, and it is designed so that more encodings can be added as they are invented and implemented. Parquet is meant to be usable by everyone, without playing favorites in the Hadoop ecosystem.
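
As a rough illustration of per-column compression, the Python sketch below stores each column with its own codec. It is not the Parquet file format and does not use any Parquet library; the `CODECS` table, the `write_columns` and `read_column` helpers, and the choice of run-length encoding and zlib are assumptions made for the example.

```python
import json
import zlib


def rle_encode(values):
    """Run-length encode a list into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original list."""
    return [v for v, n in runs for _ in range(n)]


# Each codec is an (encode, decode) pair mapping a column to bytes and back.
CODECS = {
    "rle": (lambda col: json.dumps(rle_encode(col)).encode(),
            lambda raw: rle_decode(json.loads(raw))),
    "zlib": (lambda col: zlib.compress(json.dumps(col).encode()),
             lambda raw: json.loads(zlib.decompress(raw))),
}


def write_columns(columns, codec_per_column):
    """Encode every column with the codec chosen for it."""
    stored = {}
    for name, values in columns.items():
        codec = codec_per_column[name]
        stored[name] = (codec, CODECS[codec][0](values))
    return stored


def read_column(stored, name):
    """Decode a single column without touching the others."""
    codec, raw = stored[name]
    return CODECS[codec][1](raw)


columns = {
    "status": ["ok"] * 6 + ["error"] * 2,          # long runs suit RLE
    "latency_ms": [12, 48, 7, 33, 91, 5, 64, 20],  # no runs, use zlib
}
stored = write_columns(columns, {"status": "rle", "latency_ms": "zlib"})
print(read_column(stored, "status") == columns["status"])
```

New codecs can be added to `CODECS` without changing how existing columns are read, which mirrors the extensibility goal the text describes.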
22

Hypertable

Hypertable

Hypertable delivers scalable database capacity at maximum performance to speed up big data applications and reduce your hardware footprint. The vendor claims greater efficiency and performance than competing products, which can translate into significant cost savings. It is based on a proven, scalable design that powers hundreds of Google services. It is open source with a vibrant community, and it is implemented in C++ for optimal performance. Support for business-critical big data applications is available 24/7/365 from the employer of all core Hypertable developers. Hypertable was created to solve the scalability problem, which traditional RDBMSs do not handle well. It is based on a design that Google developed to meet its own scalability requirements, and the vendor says it solves the scale problem better than other NoSQL solutions.
23

InfiniDB

Database of Databases

InfiniDB is a column-store DBMS optimized for OLAP workloads. Its distributed architecture supports Massively Parallel Processing (MPP). It uses MySQL as its front end and supports all MySQL syntax, including foreign keys, so users familiar with MySQL can migrate quickly and can connect with any MySQL connector. For concurrency control, InfiniDB applies MVCC and uses the term System Change Number (SCN) to identify a particular version of the system. Its Block Resolution Manager (BRM) uses three structures to manage multiple versions: the version buffer, the version substitution structure, and the version buffer block manager. InfiniDB applies deadlock detection to resolve conflicts. For storage, it applies range partitioning to each column and stores the minimum and maximum values of each partition in a small structure called an extent map.
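
The extent map idea, keeping the minimum and maximum of each partition so that whole partitions can be skipped, can be sketched in Python as follows. This is not InfiniDB code; the fixed extent size, the dictionary layout, and the `build_extent_map` and `range_scan` helpers are assumptions made for the illustration.

```python
def build_extent_map(values, extent_size):
    """Split a column into fixed-size extents and record each one's min/max."""
    extent_map = []
    for start in range(0, len(values), extent_size):
        chunk = values[start:start + extent_size]
        extent_map.append({
            "start": start,
            "end": start + len(chunk),
            "min": min(chunk),
            "max": max(chunk),
        })
    return extent_map


def range_scan(values, extent_map, low, high):
    """Return (matches, extents_read) for low <= value <= high."""
    matches, extents_read = [], 0
    for ext in extent_map:
        if ext["max"] < low or ext["min"] > high:
            continue  # the whole extent lies outside the range, so skip it
        extents_read += 1
        chunk = values[ext["start"]:ext["end"]]
        matches.extend(v for v in chunk if low <= v <= high)
    return matches, extents_read


# Hypothetical date column stored as integers (YYYYMMDD).
order_dates = [20240101, 20240103, 20240105,
               20240210, 20240212, 20240215,
               20240301, 20240305, 20240309]
extent_map = build_extent_map(order_dates, extent_size=3)
matches, extents_read = range_scan(order_dates, extent_map, 20240201, 20240229)
print(matches, extents_read)
```

In this example only one of the three extents has to be read, because the other two are ruled out by their recorded minimum and maximum.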
24

qikkDB

qikkDB

qikkDB is a GPU-accelerated columnar database that delivers strong performance for complex polygon operations and big data analytics. It is a good fit if you count your data in billions of rows and want to see results in real time. It runs on both Windows and Linux. Google Test is the testing framework, and the project contains hundreds of unit tests and tens of integration tests. Microsoft Visual Studio 2019 is recommended for Windows development; the dependencies are CUDA 10.2 or newer, CMake 3.15 or newer, vcpkg, and boost. Linux development has the same dependencies: CUDA 10.2 or newer, CMake 3.15 or newer, vcpkg, and boost. The project is licensed under the Apache License, Version 2.0. To install qikkDB, you can use an installation script or a Dockerfile.
25

Apache Pinot

The Apache Software Foundation

Pinot is designed to answer OLAP queries with low latency on immutable data. It offers pluggable indexing technologies: sorted index, bitmap index, and inverted index. Joins are not currently supported, but Trino or PrestoDB can be used for querying. Pinot provides an SQL-like language that supports selection, aggregation, filtering, group by, order by, and distinct queries. A table can be offline, real-time, or a combination of both; use the real-time table only to cover segments for which offline data is not yet available. You can also customize the anomaly detection and notification flows to detect the right anomalies.
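
The guidance to use the real-time table only where offline data is not yet available can be sketched as a simple time-boundary merge in Python. This is not Pinot's query path or API; the `hybrid_query` helper, the `ts` and `source` fields, and the rule of taking the latest offline timestamp as the boundary are assumptions made for the illustration.

```python
def hybrid_query(offline_rows, realtime_rows):
    """Merge two tables, preferring offline data wherever it exists.

    Rows are dicts with a "ts" field. Real-time rows are used only for
    timestamps later than the newest offline row.
    """
    if not offline_rows:
        return sorted(realtime_rows, key=lambda r: r["ts"])
    boundary = max(r["ts"] for r in offline_rows)
    fresh = [r for r in realtime_rows if r["ts"] > boundary]
    return sorted(offline_rows + fresh, key=lambda r: r["ts"])


# Hypothetical data: the two tables overlap at ts=3.
offline = [{"ts": t, "source": "offline"} for t in (1, 2, 3)]
realtime = [{"ts": t, "source": "realtime"} for t in (3, 4, 5)]

result = hybrid_query(offline, realtime)
print([(r["ts"], r["source"]) for r in result])
```

The overlapping timestamp is served from the offline table, and the real-time table only fills in the newer rows.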