Apache Hudi Description

Hudi is a rich platform for building streaming data lakes with incremental data pipelines on a self-managing database layer, while also being optimized for lake engines and regular batch processing. Hudi maintains a timeline of all actions performed on the table at different instants in time. This timeline provides instantaneous views of the table and supports efficient retrieval of data in the order it arrived. A Hudi instant consists of an action type, an instant time, and a state. Hudi provides efficient upserts by consistently mapping a given Hoodie key (record key plus partition path) to a file ID through an indexing mechanism. Once the first version of a record has been written to a file, the mapping between record key and file group/file ID never changes. The mapped file group therefore contains all versions of a group of records.
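The index-based upsert described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Hudi API: the class ToyTable, its fields, the keys_per_group parameter, and the "fg-N" file IDs are all hypothetical names invented for illustration. It shows a key being assigned to a file group on first write, later updates landing in that same file group, and each write being recorded on a timeline as an instant with a time, an action, and a state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy model of key-to-file-group indexing. Hypothetical; not the Hudi API."""
    keys_per_group: int = 2                          # assumed cap, for illustration only
    index: dict = field(default_factory=dict)        # (record_key, partition_path) -> file_id
    file_groups: dict = field(default_factory=dict)  # file_id -> [(instant_time, key, value)]
    timeline: list = field(default_factory=list)     # [(instant_time, action, state)]

    def upsert(self, instant_time, records):
        """Write records; updates go to the file group already mapped to their key."""
        for record_key, partition_path, value in records:
            key = (record_key, partition_path)
            if key not in self.index:
                # First write of this key: pick a file group once, never remap it.
                self.index[key] = f"fg-{len(self.index) // self.keys_per_group}"
            file_id = self.index[key]
            self.file_groups.setdefault(file_id, []).append((instant_time, key, value))
        # Record the completed write as one instant on the timeline.
        self.timeline.append((instant_time, "commit", "COMPLETED"))

    def versions(self, record_key, partition_path):
        """Return every stored version of one record, oldest first."""
        key = (record_key, partition_path)
        group = self.file_groups[self.index[key]]
        return [(t, v) for t, k, v in group if k == key]


table = ToyTable()
table.upsert("001", [("u1", "2024/01", "a"), ("u2", "2024/01", "b")])
table.upsert("002", [("u1", "2024/01", "a2"), ("u3", "2024/01", "c")])
print(table.index)
print(table.versions("u1", "2024/01"))
print(table.timeline)
```

Because the mapping is fixed at first write, an update only needs an index lookup to find the one file group that holds the record's earlier versions, instead of a scan of the whole table.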

Company Details

Company:
Apache Software Foundation
Year Founded:
1999
Headquarters:
United States
Website:
hudi.apache.org


Product Details

Platforms
SaaS
Type of Training
Documentation
Customer Support
Online

Apache Hudi Features and Options

Data Warehouse Software

Ad hoc Query
Analytics
Data Integration
Data Migration
Data Quality Control
ETL - Extract / Transform / Load
In-Memory Processing
Match & Merge