Best Presto Alternatives in 2024
Find the top alternatives to Presto currently available. Compare ratings, reviews, pricing, and features of Presto alternatives in 2024. Slashdot lists the best Presto alternatives on the market that offer competing products similar to Presto. Sort through the Presto alternatives below to make the best choice for your needs.
-
1
Google BigQuery
Google
ANSI SQL lets you analyze petabytes of data at lightning-fast speeds with no operational overhead. Analytics at scale with a 26%-34% lower three-year TCO than cloud data warehouse alternatives. Unleash your insights with a trusted platform that is more secure and scales with you. Multi-cloud analytics solutions let you gain insights from all types of data. You can query streaming data in real time and get up-to-date information about all your business processes. Built-in machine learning lets you predict business outcomes quickly without having to move data. With just a few clicks, you can securely access and share analytical insights within your organization. Create stunning dashboards and reports using popular business intelligence tools right out of the box. BigQuery's strong security, governance, and reliability controls ensure high availability and a 99.9% uptime SLA. Data is encrypted by default, with support for customer-managed encryption keys.
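The standard-SQL workflow described above can be exercised from Python with the google-cloud-bigquery client. A minimal sketch follows; it assumes application-default credentials are configured, and the query itself (against a public sample dataset) is only an illustration, not part of the listing.

```python
# Minimal sketch: running a standard-SQL query against BigQuery from Python.
# Assumes `pip install google-cloud-bigquery` and application-default credentials.
from google.cloud import bigquery

client = bigquery.Client()  # picks up the project from the environment

query = """
    SELECT word, SUM(word_count) AS total
    FROM `bigquery-public-data.samples.shakespeare`
    GROUP BY word
    ORDER BY total DESC
    LIMIT 10
"""

# The query job runs server-side; we simply iterate over the result rows.
for row in client.query(query).result():
    print(row.word, row.total)
```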
-
2
StarTree
StarTree
25 Ratings
StarTree Cloud is a fully managed real-time analytics platform designed for OLAP at massive speed and scale for user-facing applications. Powered by Apache Pinot, StarTree Cloud provides enterprise-grade reliability and advanced capabilities such as tiered storage, scalable upserts, plus additional indexes and connectors. It integrates seamlessly with transactional databases and event streaming platforms, ingesting data at millions of events per second and indexing it for lightning-fast query responses. StarTree Cloud is available on your favorite public cloud or for private SaaS deployment. StarTree Cloud includes StarTree Data Manager, which allows you to ingest data from real-time sources such as Amazon Kinesis, Apache Kafka, Apache Pulsar, or Redpanda, as well as from batch sources such as Snowflake, Delta Lake, Google BigQuery, Amazon S3, Apache Flink, Apache Hadoop, or Apache Spark. StarTree ThirdEye is an add-on anomaly detection system running on top of StarTree Cloud that observes your business-critical metrics, alerting you and allowing you to perform root-cause analysis, all in real time. -
3
Apache Drill
The Apache Software Foundation
Schema-free SQL query engine for Hadoop, NoSQL, and Cloud Storage -
4
Amazon Redshift
Amazon
$0.25 per hour
Amazon Redshift is preferred by more customers than any other cloud data warehouse. Redshift powers analytic workloads for Fortune 500 companies, startups, and everything in between. Redshift has helped companies like Lyft grow from startups into multi-billion-dollar enterprises. It makes it easier than any other data warehouse to gain new insights from all of your data. Redshift lets you query petabytes (or more) of structured and semi-structured data across your operational database, data warehouse, and data lake using standard SQL. Redshift also lets you save query results to your S3 data lake using open formats such as Apache Parquet, for further analysis with other analytics services like Amazon EMR and Amazon Athena. Redshift is the fastest cloud data warehouse in the world, and it gets faster every year. For performance-intensive workloads, the new RA3 instances deliver up to 3x the performance of any other cloud data warehouse. -
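The Parquet-to-S3 workflow mentioned above uses Redshift's UNLOAD command. Below is a minimal, hedged sketch using the redshift_connector Python driver; the cluster endpoint, credentials, table, bucket, and IAM role are hypothetical placeholders.

```python
# Minimal sketch: unloading Redshift query results to S3 as Parquet so Athena
# or EMR can analyze them. All connection details and names are placeholders.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="example-password",
)

unload_sql = """
UNLOAD ('SELECT order_id, order_date, total FROM sales')
TO 's3://my-analytics-bucket/sales/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftUnloadRole'
FORMAT AS PARQUET;
"""

cur = conn.cursor()
cur.execute(unload_sql)  # writes Parquet files under the S3 prefix
conn.commit()
cur.close()
conn.close()
```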
5
Apache Iceberg
Apache Software Foundation
Free
Iceberg is a high-performance format for large analytical tables. Iceberg brings the simplicity and reliability of SQL tables to the world of big data, while allowing engines like Spark, Trino, Flink, Presto, Hive, and Impala to work safely with the same tables at the same time. Iceberg supports flexible SQL commands to merge new data, update rows, and perform targeted deletions. Iceberg can eagerly rewrite data files to improve read performance, or it can use delete deltas for faster updates. Iceberg automates the tedious, error-prone task of producing partition values for each row in a table and skips unnecessary files and partitions automatically. No extra filters are needed for fast queries, and the table layout can be updated as data or queries change. -
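The merge, update, and delete commands mentioned above can be issued through Spark SQL. A minimal sketch follows; it assumes a Spark session already configured with the Iceberg runtime and a catalog named "demo", and the table and column names are hypothetical.

```python
# Minimal sketch: row-level SQL operations on an Iceberg table via Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-example").getOrCreate()

# Merge incoming changes (from a view named "updates") into the target table.
spark.sql("""
    MERGE INTO demo.db.orders AS t
    USING updates AS s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET t.total = s.total
    WHEN NOT MATCHED THEN INSERT *
""")

# Targeted update and delete, both row-level operations on the same table.
spark.sql("UPDATE demo.db.orders SET status = 'shipped' WHERE order_id = 42")
spark.sql("DELETE FROM demo.db.orders WHERE status = 'cancelled'")
```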
6
Apache Druid
Druid
Apache Druid is an open-source distributed data store. Druid's core design blends ideas from data warehouses and time-series databases to create a high-performance real-time analytics database suited to a wide range of uses. Druid combines key characteristics from each of these systems in its ingestion layer, storage format, query layer, and core architecture. Druid compresses and stores each column separately, so it only needs to read the columns required for a particular query, enabling fast scans, rankings, and groupBys. Druid builds inverted indexes for string values to allow fast search and filter. Out-of-the-box connectors for Apache Kafka, HDFS, AWS S3, stream processors, and many more. Druid intelligently partitions data based on time, so time-based queries are significantly faster than in traditional databases. Druid automatically rebalances as you add or remove servers, and its fault-tolerant architecture routes around server failures. -
7
Apache Pinot
Apache Software Foundation
Pinot is designed to answer OLAP queries with low latency on immutable data. Pluggable indexing technologies: sorted index, bitmap index, inverted index. Joins are not currently supported, but both Trino and PrestoDB can be used for querying. SQL-like language supporting selection, aggregation, filtering, group by, order by, and distinct queries on data. Both offline and real-time tables are possible; the real-time table is used only to cover segments for which offline data is not yet available. Customize anomaly detection and notification flows to detect the right anomalies. -
8
Apache Kylin
Apache Software Foundation
Apache Kylin™ is an open-source, distributed analytical data warehouse for big data, created to provide OLAP (online analytical processing) in the big data era. By renovating multi-dimensional cube and precalculation technology on Hadoop and Spark, Kylin achieves near-constant query speed regardless of growing data volumes. Kylin reduces query latency from minutes to a fraction of a second, bringing online analytics back to big data. Kylin can analyze more than 10 billion rows in less than a second, so there is no more waiting for reports to make critical decisions. Kylin connects data on Hadoop to BI tools such as Tableau, PowerBI/Excel, and MSTR, making BI on Hadoop faster than ever. As an analytical data warehouse, Kylin offers ANSI SQL on Hadoop/Spark and supports most ANSI SQL query functions. Because each query consumes few resources, Kylin can support thousands upon thousands of interactive queries simultaneously. -
9
Amazon Athena
Amazon
2 Ratings
Amazon Athena lets you easily analyze data in Amazon S3 with standard SQL. Athena is serverless, so there is no infrastructure to maintain and you pay only for the queries you run. Athena is simple to use: point to your data in Amazon S3, define the schema, and start querying with standard SQL. Most results are delivered in a matter of seconds. Athena makes it easy to analyze data without complicated ETL jobs to prepare it, so anyone with SQL skills can quickly analyze large-scale datasets. Athena integrates with the AWS Glue Data Catalog out of the box, letting you create a unified metadata repository across services, crawl data sources to discover schemas, populate your catalog with new and modified table and partition definitions, and maintain schema versioning. -
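The point-and-query workflow above can also be driven programmatically. A minimal sketch with boto3 follows; the database, table, region, and results bucket are hypothetical placeholders.

```python
# Minimal sketch: querying data in S3 with Athena via boto3.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

run = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) AS n FROM web_logs GROUP BY status",
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
query_id = run["QueryExecutionId"]

# Poll until the query finishes, then print the result rows.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])
```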
10
AtScale
AtScale
AtScale accelerates and simplifies business intelligence, resulting in better business decisions and faster time to insight. Reduce repetitive data engineering tasks such as maintaining, curating, and delivering data for analysis. Define business definitions in one place to ensure consistent KPI reporting across BI tools. Shorten the time it takes to gain insight from data while managing cloud compute costs efficiently. No matter where your data resides, you can leverage existing data security policies for data analytics. AtScale's Insights models and workbooks let you perform cloud OLAP multidimensional analysis on datasets from multiple providers, without any data prep or engineering. To help you quickly gain insights you can use for business decisions, we provide easy-to-use dimensions and measures. -
11
Trino
Trino
Free
Trino is an engine that runs at incredible speed: a fast, distributed SQL engine for big data analytics that helps you explore your data universe. Trino is a highly parallel, distributed query engine built from the ground up for efficient, low-latency analytics. The largest organizations use Trino to query exabyte-scale data lakes and massive data warehouses. It supports a wide range of use cases, from interactive ad-hoc analysis to large batch queries that take hours to complete to high-volume applications that execute sub-second queries. Trino is an ANSI SQL query engine that works with BI tools such as R, Tableau, Power BI, Superset, and many others. You can natively query data in Hadoop, S3, Cassandra, MySQL, and many other systems without complex, slow, and error-prone copy processes, and access data from multiple systems within a single query. -
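The single-query federation described above can be tried from Python with the trino client library. A minimal sketch, assuming a reachable coordinator and hypothetical hive and mysql catalogs, tables, and columns:

```python
# Minimal sketch: a federated query with the Trino Python client
# (pip install trino). Host, catalogs, and table names are placeholders.
import trino

conn = trino.dbapi.connect(
    host="trino.example.com",
    port=8080,
    user="analyst",
    catalog="hive",
    schema="default",
)
cur = conn.cursor()

# Join a table in the Hive/S3 data lake with a table in MySQL in one query.
cur.execute("""
    SELECT o.customer_id, SUM(o.total) AS lifetime_value
    FROM hive.sales.orders AS o
    JOIN mysql.crm.customers AS c ON o.customer_id = c.id
    GROUP BY o.customer_id
""")
for row in cur.fetchall():
    print(row)
```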
12
Denodo
Denodo Technologies
The core technology enabling modern data integration and data management. Quickly connect disparate structured and unstructured data sources. Catalog your entire data ecosystem. Data stays in its sources and is accessed whenever needed. Adapt data models to consumers' needs, even when they span multiple sources. Hide your back-end technologies from end users. The virtual model can be secured and consumed using standard SQL and other formats such as REST, SOAP, and OData. Easy access to all types of data. Data integration and data modeling capabilities. Active Data Catalog and self-service capabilities for data and metadata discovery and preparation. Full data security and governance capabilities. Fast, intelligent execution of data queries. Real-time data delivery in any format. Ability to create data marketplaces. Decoupling business applications from data systems to facilitate data-driven strategies. -
13
Databricks Data Intelligence Platform
Databricks
The Databricks Data Intelligence Platform enables your entire organization to use data and AI. It is built on a lakehouse that provides an open, unified foundation for all data and governance, and it is powered by a Data Intelligence Engine that understands the uniqueness of your data. Data and AI companies will win in every industry, and Databricks can help you achieve your data and AI goals faster and more easily. Databricks combines the benefits of a lakehouse with generative AI to power a Data Intelligence Engine that understands the unique semantics of your data, so the platform can optimize performance and manage infrastructure according to the unique needs of your business. The Data Intelligence Engine speaks your organization's language, making it easy to search for and discover new data; it is just like asking a colleague a question. -
14
VMware Tanzu Greenplum
Broadcom
Free your apps. Reduce complexity in your operations. Software proficiency is essential to winning in today's business world. How can you increase the feature velocity of the workloads that power your company, and run and manage modernized workloads on any cloud? VMware Tanzu, used with VMware Pivotal Labs, enables you to transform both your team and your applications while simplifying operations across multi-cloud infrastructure, on-premises and in the public cloud. -
15
SingleStore
SingleStore
$0.69 per hour
1 Rating
SingleStore (formerly MemSQL) is a distributed, highly scalable SQL database that can run anywhere. With a familiar relational model, it delivers top performance for both transactional and analytical workloads. SingleStore is a scalable SQL database that continuously ingests data to perform operational analytics for the front lines of your business. ACID transactions let you simultaneously process millions of events per second and analyze billions of rows using relational SQL, JSON, geospatial, full-text search, and other formats. SingleStore delivers leading data-ingestion performance and supports both batch loading and real-time data pipelines. SingleStore lets you query live and historical data with ANSI SQL at lightning speed, so you can perform ad-hoc analysis with business intelligence tools, run machine learning algorithms for real-time scoring, and execute geoanalytic queries in real time. -
16
ClickHouse
ClickHouse
1 Rating
ClickHouse is a fast, easy-to-use, open-source OLAP database management system. It is column-oriented and can generate real-time analytical reports using SQL queries. ClickHouse's performance exceeds that of comparable column-oriented database management systems on the market, processing hundreds of millions to more than a billion rows, and tens of gigabytes of data, per server per second. ClickHouse uses all available hardware to process each query as quickly as possible; peak processing speed for a single query is more than 2 terabytes per second (after decompression, counting only the used columns). To reduce latency, reads in distributed setups are automatically balanced among healthy replicas. ClickHouse supports multi-master asynchronous replication and can be deployed across multiple data centers. All nodes are equal, which avoids single points of failure. -
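The SQL-driven reporting mentioned above can be exercised from Python with the clickhouse-connect driver. A minimal sketch, assuming a reachable server and a hypothetical hits table:

```python
# Minimal sketch: querying ClickHouse from Python
# (pip install clickhouse-connect). Host and table names are placeholders.
import clickhouse_connect

client = clickhouse_connect.get_client(host="clickhouse.example.com", username="default")

result = client.query("""
    SELECT toStartOfHour(event_time) AS hour, count() AS events
    FROM hits
    GROUP BY hour
    ORDER BY hour
""")
# result_rows holds the already-materialized rows of the analytical report.
for hour, events in result.result_rows:
    print(hour, events)
```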
17
StarRocks
StarRocks
Free
StarRocks delivers at least 300% better performance than other popular solutions, whether you're working with a single table or many. With a rich set of connectors, you can ingest real-time data into StarRocks for the latest insights. A query engine that adapts to your use cases: StarRocks lets you scale your analytics without moving data or rewriting SQL, enabling a rapid journey from data to insight. StarRocks is unmatched in performance and offers a unified OLAP system that covers the most common data analytics scenarios. Its built-in memory-and-disk-based caching framework is specifically designed to minimize the I/O overhead of fetching data from external storage and accelerate query performance. -
18
Infobright DB
IgniteTech
InfobrightDB is a high-performance enterprise database that uses a columnar storage engine to let business analysts quickly and efficiently dissect data and produce reports. InfobrightDB can be deployed on-premises or in the cloud. -
19
SAP HANA
SAP
SAP HANA is a high-performance in-memory database that accelerates data-driven decisions and actions. It supports all workloads and provides the most advanced analytics on multi-model data, on-premises and in the cloud. -
20
SSuite MonoBase Database
SSuite Office Software
Free
You can create flat or relational databases with unlimited fields, tables, and rows. A custom report builder is included: create custom reports by connecting to compatible ODBC databases, or create your own databases. Here are some highlights:
- Filter tables instantly
- Ultra-simple graphical user interface
- One-click table and data form creation
- Open up to 5 databases simultaneously
- Export your data to comma-separated files
- Create custom reports for all your databases
- A complete help file for creating database reports
- Print tables and queries directly from your data grid
- Supports any SQL standard your ODBC-compatible database requires
For the best performance and user experience, please install and run this database app with full administrator rights. Requirements: 1024x768 display size; Windows 98 / XP / Windows 8 / Windows 10, 32-bit or 64-bit. No Java or DotNet required. Green Energy Software. Saving the planet one step at a time. -
21
VeloDB
VeloDB
VeloDB, powered by Apache Doris, is a modern database for real-time analytics at scale. Micro-batch data can be ingested in seconds via a push-based system. Storage engine with real-time upserts, appends, and pre-aggregations. Unmatched performance in real-time data serving and interactive ad-hoc queries. Handles not only structured but also semi-structured data; not only real-time analytics but also batch processing. Runs queries not only against internal data but also works as a federated query engine to access external databases and data lakes. Distributed design to support linear scalability. Resource usage can be adjusted flexibly to meet workload requirements, whether deployed on-premises or in the cloud, separated or integrated. Fully compatible with open-source Apache Doris, on which it is built, it supports the MySQL protocol, MySQL functions, and SQL for easy integration with other tools. -
22
Qubole
Qubole
Qubole is an open, simple, and secure data lake platform for machine learning, streaming, and ad-hoc analytics. Our platform offers end-to-end services that reduce the time and effort needed to run data pipelines and streaming analytics workloads on any cloud. Qubole is the only platform that offers more openness and flexibility for data workloads while lowering cloud data lake costs by up to 50%. Qubole provides faster access to secure, trusted, and reliable datasets of structured and unstructured data for analytics and machine learning. Users can efficiently perform ETL, analytics, and AI/ML workloads end-to-end using best-of-breed engines, multiple formats, libraries, and languages adapted to data volume and variety, SLAs, and organizational policies. -
23
MonetDB
MonetDB
Choose from a wide range of SQL features to realise your applications, from pure analytics to hybrid transactional/analytical processing. When you are curious about your data and need to work efficiently, MonetDB returns query results in seconds, if not faster. When you need specialised functions, you can (re)use your own code: use the hooks to add user-defined functions in SQL, Python, R, or C/C++. Join us and expand the MonetDB community, which spans 130+ countries and includes students, teachers, researchers, and small businesses. Join one of the most important databases for analytical jobs and surf the innovation! MonetDB's simple setup will quickly get your DBMS up and running. -
24
IBM Db2
IBM
IBM Db2® is a family of hybrid data management products offering a complete suite of AI-empowered capabilities to help you manage structured and unstructured data on-premises and in private and public clouds. Db2 is built on an intelligent common SQL engine designed for scalability and flexibility. -
25
HEAVY.AI
HEAVY.AI
HEAVY.AI is a pioneer in accelerated analytics. The HEAVY.AI platform is used by government and business to uncover insights in data beyond the reach of traditional analytics tools. The platform harnesses the massive parallelism of modern CPU and GPU hardware and is available in the cloud and on-premises. HEAVY.AI grew out of research at Harvard and the MIT Computer Science and Artificial Intelligence Laboratory. Go beyond traditional BI and GIS and extract high-quality information from large datasets with no lag by leveraging modern GPU and CPU hardware. Unify and explore large geospatial and time-series datasets to get a complete picture of what, when, and where. By combining interactive visual analytics, hardware-accelerated SQL, and advanced analytics and data science frameworks, you can find the opportunities and risks in your enterprise when it matters most. -
26
Teradata Vantage
Teradata
Businesses struggle to find answers as data volumes grow faster than ever. Teradata Vantage™ solves this problem. Vantage uses 100 percent of available data to uncover real-time intelligence at scale; this is the new era of pervasive data intelligence. All data across the organization is available in one place, accessible whenever you need it using your preferred languages and tools. Start small and scale compute or storage where it has the most impact, thanks to the modern architecture. Vantage unifies analytics and data lakes in the cloud to enable business intelligence. Data keeps growing and business intelligence keeps becoming more important, yet existing data analysis platforms frustrate users in several key ways: they lack the right tools and supportive environment needed to achieve quality results, organizations don't allow or grant proper access to the tools users need, and preparing data is difficult. -
27
Archon Data Store
Platform 3 Solutions
Archon Data Store™ is an open-source archive lakehouse platform that lets you store, manage, and gain insight from large volumes of data. Its minimal footprint and compliance features enable large-scale processing and analysis of structured and unstructured data across your organization. Archon Data Store combines the capabilities of data warehouses and data lakes into a single platform, eliminating data silos and streamlining data engineering, analytics, and data science workflows. Archon Data Store ensures data integrity through metadata centralization, optimized storage, and distributed computing, and its common approach to managing, securing, and governing data helps you innovate faster and operate more efficiently. It archives and analyzes all of your organization's data on one platform while delivering operational efficiencies. -
28
Databend
Databend
Free
Databend is an agile, cloud-native, modern data warehouse that delivers high-performance analytics at low cost for large-scale data processing. It has an elastic architecture that scales dynamically to meet the needs of different workloads, ensuring efficient resource utilization and lower operating costs. Written in Rust, Databend offers exceptional performance thanks to features such as vectorized query execution and columnar storage, which optimize data retrieval and processing speed. Its cloud-first design allows seamless integration with cloud platforms and emphasizes reliability, data consistency, and fault tolerance. Databend is free and open source, making it an accessible and flexible choice for data teams handling big-data analytics in the cloud. -
29
Starburst Enterprise
Starburst Data
Starburst helps you make better decisions with fast access to all of your data. Your company has more data than ever, but your data teams are stuck waiting to analyze it. Starburst gives your data teams fast, accurate access to more data. Starburst Enterprise is a fully supported, production-tested, enterprise-grade distribution of open-source Trino (formerly PrestoSQL). It improves performance and security while making it easy to deploy, connect to, and manage your Trino environment. Starburst lets your team connect to any source of data, whether it's on-premises, in the cloud, or in a hybrid cloud environment, so they can use the analytics tools they already love with data that lives anywhere. -
30
IBM Cloud SQL Query
IBM
$5.00/Terabyte-Month
Serverless, interactive querying for analyzing data in IBM Cloud Object Storage. Query your data right where it is stored; there are no ETL processes, databases, or infrastructure to manage. -
31
Apache Impala
Apache
Free
Impala offers low latency, high concurrency, and a wide range of storage options, including Iceberg and open data formats, and it scales linearly, even in multitenant environments. Impala integrates with native Hadoop security and Kerberos authentication, and through the Ranger module you can ensure that the right users and applications have access to the right data. It uses the same file and data formats, metadata, security, and resource management frameworks as your Hadoop deployment, with no redundant infrastructure or data conversion and duplication. Impala uses the same metadata and ODBC driver as Apache Hive and, like Hive, supports SQL, so you don't need to reinvent the wheel. With Impala, more users can interact with more data through a single repository, whether they use SQL queries or BI applications, and metadata is preserved from the data's source through analysis. -
32
PySpark
PySpark
PySpark is the Python interface to Apache Spark. It lets you write Spark applications using Python APIs and provides the PySpark shell for interactively analyzing data in a distributed environment. PySpark supports most of Spark's features, including Spark SQL, DataFrames, and Streaming. Spark SQL is the Spark module for structured data processing; it provides both a distributed SQL query engine and a programming abstraction called a DataFrame. The streaming feature, which runs on top of Spark, enables powerful interactive and analytical applications across both streaming and historical data while inheriting Spark's ease of use and fault-tolerance characteristics. -
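The DataFrame and Spark SQL features described above look roughly like the following minimal sketch; the input path and column names are hypothetical placeholders.

```python
# Minimal sketch: a PySpark session using both the DataFrame API and Spark SQL.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-example").getOrCreate()

# Load structured data, then analyze it via the DataFrame API...
orders = spark.read.json("s3://my-bucket/orders/")
orders.groupBy("country").agg(F.sum("total").alias("revenue")).show()

# ...and via plain SQL over the same data, using a temporary view.
orders.createOrReplaceTempView("orders")
spark.sql("SELECT country, COUNT(*) AS n FROM orders GROUP BY country").show()
```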
33
LlamaIndex
LlamaIndex
LlamaIndex is a "data framework" designed to help you build LLM apps. Connect semi-structured data from APIs such as Slack or Salesforce. LlamaIndex provides a simple, flexible data framework for connecting custom data sources to large language models, making it a powerful tool for enhancing your LLM applications. Connect your existing data sources and formats (APIs, PDFs, documents, SQL, etc.) and use them with a large language model application. Store and index your data for different use cases, and integrate with downstream vector stores and database providers. LlamaIndex provides a query interface that accepts any input prompt over your data and returns a knowledge-augmented response. Connect unstructured data sources such as PDFs, raw text files, and images, and integrate structured data sources such as Excel and SQL. It provides ways to structure your data (indices, graphs) so that it can be used with LLMs. -
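The index-then-query workflow described above looks roughly like the sketch below. It assumes a recent llama-index package layout (llama_index.core) and an LLM API key in the environment; the data directory and question are placeholders.

```python
# Minimal sketch: index local documents and ask a question over them
# (pip install llama-index). Directory and prompt are hypothetical.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # PDFs, text files, etc.
index = VectorStoreIndex.from_documents(documents)      # build an in-memory index

query_engine = index.as_query_engine()
response = query_engine.query("What are the key findings in these documents?")
print(response)  # knowledge-augmented answer grounded in the indexed data
```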
34
Apache Spark
Apache Software Foundation
Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark delivers high performance for both streaming and batch data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine. Spark offers over 80 high-level operators that make it easy to build parallel apps, and you can use it interactively from the Scala, Python, R, and SQL shells. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming, and these libraries can be combined seamlessly in one application. Spark runs on Hadoop, Apache Mesos, and Kubernetes, standalone, or in the cloud, and can access a variety of data sources. Spark can run in standalone cluster mode, on EC2, on Hadoop YARN, or on Mesos, and can access data in HDFS and Alluxio. -
35
Dremio
Dremio
Dremio delivers lightning-fast queries and a self-service semantic layer directly on your data lake storage. No moving data to proprietary data warehouses, and no cubes, aggregation tables, or extracts. Data architects get flexibility and control, while data consumers get self-service. Apache Arrow and Dremio technologies such as Data Reflections, Columnar Cloud Cache (C3), and Predictive Pipelining work together to make queries against your data lake storage fast. An abstraction layer lets IT apply security and business meaning while analysts and data scientists explore the data and create new virtual datasets. Dremio's semantic layer is an integrated, searchable catalog that indexes all of your metadata so business users can make sense of the data. The semantic layer is made up of virtual datasets and spaces, all of which are indexed and searchable. -
36
Greenplum
Greenplum Database
Greenplum Database® is an advanced, fully featured, open-source data warehouse that delivers powerful and fast analytics on petabyte-scale data volumes. Uniquely designed for big data analytics, Greenplum Database is powered by the world's most advanced cost-based query optimizer, delivering high analytical query performance on large data volumes. Greenplum Database® is released under the Apache 2 license. We would like to thank all of our community contributors and welcome new ones; we encourage contributions to the Greenplum Database community, no matter how small. An open-source, massively parallel data platform for analytics, machine learning, and AI: rapidly create and deploy models for complex applications in cybersecurity, predictive maintenance, risk management, fraud detection, and many other areas. Experience the fully integrated, open-source analytics platform. -
37
Ascend
Ascend
$0.98 per DFC
Ascend provides data teams with a unified platform for ingesting and transforming data and for building and managing analytics engineering and data engineering workloads. Backed by DataAware intelligence, Ascend works in the background to ensure data integrity and optimize data workloads, reducing maintenance time by up to 90%. Ascend's multi-language flex-code interface lets you use SQL, Java, Scala, and Python interchangeably. Quickly view data lineage, data profiles, job logs, system health, and other important workload metrics at a glance. Ascend provides native connections to a growing number of data sources through its Flex-Code data connectors. -
38
Imply
Imply
Imply is a real-time analytics platform built on Apache Druid, designed for large-scale, high-performance OLAP (online analytical processing). It provides real-time data ingestion, fast query performance, and the ability to run complex analytical queries on massive datasets at low latency. Imply is designed for organizations that need interactive analytics, real-time dashboards, and data-driven decision-making. It offers a user-friendly data exploration interface as well as advanced features such as multi-tenancy and fine-grained access controls. Imply's distributed architecture and scalability make it ideal for use cases such as streaming data analytics, real-time monitoring, and business intelligence. -
39
Citus
Citus Data
$0.27 per hour
Citus combines the Postgres you know and love with the power of distributed tables. 100% open source, now with schema-based and row-based sharding and Postgres 16 support. Scale Postgres by distributing data and queries. You can start with a single Citus node, then add nodes and rebalance shards as you grow. Parallelism, keeping more data in memory, higher I/O bandwidth, and columnar compression can speed up queries by 20x to 300x. Citus is an extension (not a fork) that tracks the latest Postgres versions, so you can use your familiar SQL toolset and leverage your Postgres expertise, and manage both your analytical and transactional workloads in a single database. Citus Open Source is free to download and use; if you embrace open source and GitHub, you can manage Citus yourself. Prefer to focus on your app and forget about the database? Run it in the cloud with Azure Cosmos DB for PostgreSQL, powered by Citus. -
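Distributing a table across Citus nodes is done with a Postgres function call, as in the minimal sketch below; the connection string, table, and distribution column are hypothetical placeholders.

```python
# Minimal sketch: creating and querying a Citus distributed table from Python
# with psycopg2. All names and credentials are placeholders.
import psycopg2

conn = psycopg2.connect(
    "host=citus.example.com dbname=app user=postgres password=example-password"
)
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS events (
        tenant_id bigint NOT NULL,
        event_id bigint NOT NULL,
        payload jsonb,
        created_at timestamptz DEFAULT now()
    );
""")
# Shard the table across worker nodes by tenant_id using the Citus UDF.
cur.execute("SELECT create_distributed_table('events', 'tenant_id');")
conn.commit()

# Queries against the distributed table are parallelized across shards.
cur.execute("SELECT tenant_id, count(*) FROM events GROUP BY tenant_id;")
print(cur.fetchall())
```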
40
Hydra
Hydra
Hydra is an open-source, column-oriented Postgres. You can query billions of rows with no code changes. Hydra parallelizes and vectorizes aggregates (COUNTs, SUMs, AVGs) to deliver the speed you've always wanted from Postgres, boosting performance at every scale. Hydra can be installed in just 5 minutes, without requiring any changes to your tools, extensions, syntax, or data model. Hydra Cloud offers smooth sailing with fully managed operations. Different industries have different requirements, so take control of your analytics with powerful custom Postgres functions and extensions, built by you, for you. Hydra is the fastest Postgres on the market, boosting performance with columnar storage, query parallelization, and vectorization. -
41
-
42
IBM Db2 Big SQL
IBM
A hybrid SQL-on-Hadoop engine delivering advanced, security-rich data queries across enterprise big data sources, including Hadoop, object storage, and data warehouses. IBM Db2 Big SQL is an enterprise-grade, hybrid, ANSI-compliant SQL-on-Hadoop engine that delivers massively parallel processing and advanced data querying. Db2 Big SQL lets you connect to multiple sources, such as Hadoop HDFS and WebHDFS, RDBMSes, NoSQL databases, and object stores. You benefit from low latency, high performance, data security, SQL compatibility, and federation capabilities for complex and ad-hoc queries. Db2 Big SQL is available in two variants: integrated with Cloudera Data Platform, or as a cloud-native service on the IBM Cloud Pak® for Data platform. Access, analyze, and run queries on real-time and batch data from multiple sources, including Hadoop, object stores, and data warehouses. -
43
Amazon Aurora
Amazon
$0.02 per month
1 Rating
Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open-source databases. Amazon Aurora is up to five times faster than standard MySQL databases and three times faster than standard PostgreSQL databases. It offers the security, availability, and reliability of commercial databases at a fraction of the cost. Amazon Aurora is fully managed by Amazon Relational Database Service (RDS), which automates time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups. Amazon Aurora features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 64 TB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across three Availability Zones. -
44
CockroachDB
Cockroach Labs
1 Rating
CockroachDB: cloud-native, distributed SQL. Your cloud applications deserve a cloud-native database. Cloud-based apps and services need a database that scales across clouds, reduces operational complexity, and improves reliability. CockroachDB delivers resilient, distributed SQL with ACID transactions, and data can be partitioned by geography. Combining CockroachDB with orchestration tools such as Kubernetes and Mesosphere DC/OS to automate mission-critical applications can speed up operations. -
45
Apache Hive
Apache Software Foundation
1 Rating
Apache Hive™ is data warehouse software that facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command-line tool and a JDBC driver are provided for connecting users to Hive. Apache Hive is an open-source project of the Apache Software Foundation; previously a subproject of Apache® Hadoop®, it is now a top-level project in its own right. We encourage you to read about the project and share your knowledge. Executing traditional SQL queries over Hadoop would otherwise require the MapReduce Java API; Hive provides the SQL abstraction needed to integrate SQL-like queries (HiveQL) with the underlying Java, without the need to implement queries in the low-level Java API. -
46
Motif Analytics
Motif Analytics
Rich interactive visualizations for identifying patterns in user and company flows, with full visibility into the computation. A small set of sequence operators provides full expressivity and fine-grained control in fewer than 10 lines of code. A query engine that lets you trade off query speed, precision, and cost according to your needs. Motif currently uses a custom-built DSL called Sequence Operations Language, which we believe is more natural than SQL and more powerful than a drag-and-drop interface. We built a custom algorithm to optimize sequence queries, trading off precision that is not used in decision-making for query speed. -
47
Delta Lake
Delta Lake
Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark™ and other big data workloads. Data lakes often have multiple data pipelines reading and writing data concurrently, and in the absence of transactions, data engineers struggle to ensure data integrity. Delta Lake brings ACID transactions to your data lakes and provides serializability, the strongest level of isolation. Learn more at Diving into Delta Lake: Unpacking the Transaction Log. In big data, even the metadata can be "big data." Delta Lake treats metadata just like data, leveraging Spark's distributed processing power to handle all of its metadata, so it can handle petabyte-scale tables with billions of partitions and files. Delta Lake lets developers access snapshots of data, enabling rollbacks, audits, and reproducible experiments against earlier versions. -
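The transactional writes and snapshot access described above look roughly like the following sketch with PySpark and the delta-spark package; the path and sample data are hypothetical placeholders.

```python
# Minimal sketch: writing a Delta table and reading an earlier version
# (time travel). Assumes Spark is launched with the delta-spark package.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

path = "/tmp/delta/events"

# Each write is an ACID transaction recorded in the Delta transaction log.
spark.createDataFrame([(1, "click"), (2, "view")], ["id", "event"]) \
     .write.format("delta").mode("overwrite").save(path)
spark.createDataFrame([(3, "purchase")], ["id", "event"]) \
     .write.format("delta").mode("append").save(path)

# Time travel: read the table as of its first committed version.
v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)
v0.show()
```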
48
Apache Doris
The Apache Software Foundation
Free
Apache Doris is a modern data warehouse for real-time analytics, delivering lightning-fast analytics on large-scale real-time data. Ingestion of micro-batch data and streaming data within a second. Storage engine with real-time upserts, appends, and pre-aggregations. Optimized for high-concurrency, high-throughput queries with a columnar storage engine, a cost-based query optimizer, and a vectorized execution engine. Federated querying of data lakes such as Hive, Iceberg, and Hudi, and of databases such as MySQL and PostgreSQL. Compound data types such as Arrays, Maps, and JSON, plus a Variant data type supporting automatic type inference for JSON data. NGram bloom filter for text search. Distributed design for linear scaling. Workload isolation, tiered storage, and efficient resource management. Supports shared-nothing clusters as well as separation of storage and compute. -
49
Lumenore
Business intelligence with no-code analytics. Get actionable intelligence that's connected to your data, wherever it comes from. A next-generation business intelligence and analytics platform. We embrace change every day and strive to push the boundaries of technology and innovation to do more, do things differently, and, most importantly, give people and companies the right insight in the most efficient way. In just a few clicks, transform huge amounts of raw data into actionable information. Designed with the user in mind.
-
50
Molecula
Molecula
Molecula, an enterprise feature store, simplifies, accelerates, and adds control to big-data access to power machine-scale analytics and AI. Continuously extracting features and reducing the dimensionality of data at the source allows millisecond queries, computation, and feature reuse across formats without copying or moving any raw data. The Molecula feature store gives data engineers, data scientists, and application developers a single point of access to help them move from reporting and explaining with human-scale data to predicting and prescribing business outcomes. Enterprises spend a lot of time preparing, aggregating, and making multiple copies of their data before they can make decisions with it. Molecula offers a new paradigm for continuous, real-time data analysis that can be used for all mission-critical applications.