Best Columnar Databases for Hadoop

Find and compare the best Columnar Databases for Hadoop in 2024

  • 1
    StarTree Reviews
    StarTree Cloud is a fully managed real-time analytics platform designed for OLAP at massive speed and scale for user-facing applications. Powered by Apache Pinot, StarTree Cloud provides enterprise-grade reliability and advanced capabilities such as tiered storage, scalable upserts, and additional indexes and connectors. It integrates with transactional databases and event streaming platforms, ingesting data at millions of events per second and indexing it for fast query responses. StarTree Cloud is available on public clouds or as a private SaaS deployment. It includes StarTree Data Manager, which ingests data from real-time sources such as Amazon Kinesis, Apache Kafka, Apache Pulsar, or Redpanda, and from batch sources such as Snowflake, Delta Lake, or Google BigQuery, object stores such as Amazon S3, and data processing systems such as Apache Flink, Apache Hadoop, or Apache Spark. StarTree ThirdEye is an add-on anomaly detection system that runs on top of StarTree Cloud, monitors your business-critical metrics, alerts you, and supports root-cause analysis, all in real time.
  • 2
    Apache Cassandra Reviews

    Apache Cassandra

    Apache Software Foundation

    1 Rating
    The Apache Cassandra database provides high availability and scalability without compromising performance. Linear scalability and proven fault tolerance on commodity hardware or cloud infrastructure make it a strong platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, giving your users lower latency and giving you the peace of mind that you can withstand regional outages.
  • 3
    Vertica Reviews
    Vertica is the Unified Analytics Warehouse, built for high-performance analytics and machine learning at large scale. As technology research analysts identify new leaders in big data analytics, Vertica helps data-driven companies get the most from their analytics initiatives. It offers advanced time-series, geospatial, and machine learning capabilities, along with data lake integration, user-defined extensions, a cloud-optimized architecture, and more. Vertica's Under the Hood webcast series, delivered by Vertica engineers and other technical experts, explores the features that make it, in the vendor's words, the most scalable advanced analytical database on the market. Vertica supports data-driven disruptors around the globe in their pursuit of industry and business transformation.
  • 4
    Google Cloud Bigtable Reviews
    Google Cloud Bigtable is a fully managed, scalable NoSQL database service for large operational and analytical workloads. It is a storage engine that grows with your data, from your first gigabyte to petabyte scale, for low-latency applications and high-throughput data analysis. Seamless scaling and replication: you can start with a single cluster node and scale to hundreds of nodes to support peak demand. Replication adds high availability and workload isolation for live-serving apps. Simple and integrated: the fully managed service integrates with big data tools such as Dataflow, Hadoop, and Dataproc, and support for the open-source HBase API standard makes it easy for development teams to get started.
  • 5
    Greenplum Reviews

    Greenplum

    Greenplum Database

    Greenplum Database® is a fully featured, advanced open-source data warehouse. It provides powerful and fast analytics on petabyte-scale data volumes. Designed specifically for big data analytics, Greenplum Database is powered by what the project describes as the world's most advanced cost-based query optimizer, which delivers high analytical query performance on large data volumes. Greenplum Database® is released under the Apache 2 license. The project thanks all of its community contributors and welcomes new contributions, no matter how small. Greenplum is an open-source, massively parallel data platform for analytics, machine learning, and AI. It lets you rapidly create and deploy models for complex applications in cybersecurity, predictive maintenance, risk management, fraud detection, and other areas, and it is available as a fully integrated, open-source analytics platform.
  • 6
    Apache Kudu Reviews

    Apache Kudu

    The Apache Software Foundation

    Kudu clusters store tables that look just like the tables in relational (SQL) databases. A table can be as simple as a single binary key and value or as complex as many strongly typed attributes. As in SQL, every table has a primary key made up of one or more columns. This could be a single column, such as a unique user ID, or a compound key, such as a (host, metric, timestamp) tuple for a machine time-series database. Rows can be efficiently read, updated, and deleted by their primary key. Kudu's simple data model makes it easy to port legacy applications or build new ones. Tables are self-describing, so you can analyze them with standard tools such as SQL engines or Spark. Kudu's APIs are designed to be easy to use.
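The compound primary key described above can be illustrated with a minimal sketch. It uses a plain Python dictionary as a stand-in for a table and does not use any Kudu client API; the class name `MetricsTable` and its methods are hypothetical and exist only for illustration.

```python
from typing import Dict, Optional, Tuple

# Compound primary key: (host, metric, timestamp), as in the example above.
Key = Tuple[str, str, int]


class MetricsTable:
    """Hypothetical in-memory table addressed by a compound primary key."""

    def __init__(self) -> None:
        self._rows: Dict[Key, float] = {}

    def insert(self, host: str, metric: str, timestamp: int, value: float) -> None:
        key = (host, metric, timestamp)
        if key in self._rows:
            # A primary key identifies exactly one row.
            raise KeyError(f"duplicate primary key: {key}")
        self._rows[key] = value

    def read(self, host: str, metric: str, timestamp: int) -> Optional[float]:
        # Returns None when no row has this key.
        return self._rows.get((host, metric, timestamp))

    def update(self, host: str, metric: str, timestamp: int, value: float) -> None:
        key = (host, metric, timestamp)
        if key not in self._rows:
            raise KeyError(f"no row with primary key: {key}")
        self._rows[key] = value

    def delete(self, host: str, metric: str, timestamp: int) -> bool:
        # Returns True if a row was removed, False if the key was absent.
        key = (host, metric, timestamp)
        if key in self._rows:
            del self._rows[key]
            return True
        return False
```

Every read, update, and delete names the full key tuple, which is what makes single-row access by primary key cheap in this model.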
  • 7
    Apache Parquet Reviews

    Apache Parquet

    The Apache Software Foundation

    Parquet was created to bring the benefits of compressed, efficient columnar data representation to the Hadoop ecosystem. It was built with complex nested data structures in mind and uses the record shredding and assembly algorithm described in the Dremel paper, an approach its authors consider superior to simply flattening nested namespaces. Parquet is designed to support efficient compression and encoding schemes. Multiple projects have demonstrated the performance impact of applying the right compression and encoding scheme to the data. Parquet allows compression schemes to be specified per column and is designed so that more encodings can be added as they are invented and implemented. Parquet is meant to be usable by anyone; its authors do not want to play favorites in the Hadoop ecosystem.
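The idea of choosing a compression scheme per column can be sketched in a few lines of standard-library Python. This is not the Parquet file format and does not implement Dremel-style record shredding; the function names `write_columns` and `read_column`, the JSON serialization, and the use of zlib levels as the per-column "scheme" are all assumptions made for illustration.

```python
import json
import zlib
from typing import Any, Dict, List


def write_columns(rows: List[Dict[str, Any]],
                  compression: Dict[str, int]) -> Dict[str, bytes]:
    """Pivot rows into columns and compress each column with its own zlib level."""
    columns: Dict[str, bytes] = {}
    for name in rows[0]:
        # Store all values of one column contiguously.
        values = [row[name] for row in rows]
        raw = json.dumps(values).encode("utf-8")
        # Level 0 stores the bytes uncompressed; 9 is the strongest setting.
        level = compression.get(name, 0)
        columns[name] = zlib.compress(raw, level)
    return columns


def read_column(columns: Dict[str, bytes], name: str) -> List[Any]:
    """Decode a single column without touching the others."""
    return json.loads(zlib.decompress(columns[name]).decode("utf-8"))
```

For example, `write_columns(rows, {"host": 9})` would compress a repetitive `host` column aggressively while leaving the other columns stored as-is, and `read_column(columns, "host")` reads back only that column.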
  • 8
    Hypertable Reviews
    Hypertable delivers scalable database capacity at high speed, accelerating big data applications and reducing your hardware footprint. Its performance and efficiency advantage over competing systems can translate into significant cost savings. It is based on a proven, scalable design that powers hundreds of Google services. As an open-source project, it offers the benefits of open source and a vibrant community, and it is implemented in C++ for optimal performance. Support for business-critical big data applications is available 24/7/365, and because the support provider employs all core Hypertable developers, customers get direct access to Hypertable expertise. Hypertable was created to solve the scalability problem, which traditional RDBMSs handle poorly. It is based on a design developed by Google to meet its own scalability requirements, and it aims to solve the scale problem better than other NoSQL solutions.
  • 9
    Apache Pinot Reviews

    Apache Pinot

    The Apache Software Foundation

    Pinot is designed to answer OLAP queries with low latency on immutable data. Pluggable indexing technologies include Sorted Index, Bitmap Index, and Inverted Index. Joins are not currently supported, but Trino or PrestoDB can be used for querying to work around this. Its SQL-like language supports selection, aggregation, filtering, group by, order by, and distinct queries. A table can have both an offline and a real-time component; the real-time table is used only to cover segments for which offline data is not yet available. Anomaly detection and notification flows can be customized to detect the right anomalies.
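The rule that the real-time table only covers data not yet available offline can be sketched as follows. This is an illustration under stated assumptions, not Pinot's actual query routing: the function name `query_hybrid`, the `ts` field, and the `time_boundary` parameter are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, int]


def query_hybrid(offline: List[Row], realtime: List[Row],
                 time_boundary: int) -> List[Row]:
    """Combine both tables so that overlapping rows are not counted twice.

    Rows at or before the boundary come from the offline table; only rows
    strictly after the boundary are taken from the real-time table.
    """
    from_offline = [row for row in offline if row["ts"] <= time_boundary]
    from_realtime = [row for row in realtime if row["ts"] > time_boundary]
    return from_offline + from_realtime
```

If both tables hold the rows for a given timestamp, the offline copy wins, so an aggregate over the combined result sees each event once.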