Business Software for Apache Airflow

Top Software that integrates with Apache Airflow

  • 1
    Stonebranch Reviews
    Stonebranch’s Universal Automation Center (UAC) is a Hybrid IT automation platform, offering real-time management of tasks and processes within hybrid IT settings, encompassing both on-premises and cloud environments. As a versatile software platform, UAC streamlines and coordinates your IT and business operations, while ensuring the secure administration of file transfers and centralizing IT job scheduling and automation solutions. Powered by event-driven automation technology, UAC empowers you to achieve instantaneous automation throughout your entire hybrid IT landscape. Enjoy real-time hybrid IT automation for diverse environments, including cloud, mainframe, distributed, and hybrid setups. Experience the convenience of Managed File Transfers (MFT) automation, effortlessly managing and orchestrating file transfers between mainframes and systems, seamlessly connecting with AWS or Azure cloud services.
  • 2
    DataBuck Reviews
    Big Data Quality must always be verified to ensure that data is safe, accurate, and complete. Data is moved through multiple IT platforms or stored in Data Lakes. The Big Data Challenge: Data often loses its trustworthiness because of (i) undiscovered errors in incoming data, (ii) multiple data sources that get out of sync over time, (iii) structural changes to data that downstream processes do not expect, and (iv) multiple IT platforms (Hadoop, DW, Cloud). Unexpected errors can occur when data moves between systems, such as from a Data Warehouse to a Hadoop environment, NoSQL database, or the Cloud. Data can change unexpectedly due to poor processes, ad-hoc data policies, poor data storage and control, and lack of control over certain data sources (e.g., external providers). DataBuck is an autonomous, self-learning, Big Data Quality validation tool and Data Matching tool.
  • 3
    Coursebox AI Reviews

    Coursebox AI

    Coursebox

    $99 per month
    60 Ratings
    Empower your content transformation with Coursebox, the leading AI-driven eLearning authoring tool. Our platform streamlines the course development process, enabling you to create a well-structured course in a matter of seconds. Once the foundation is set, you can easily refine the content and add any final touches before it's ready for deployment. Whether you're looking to distribute your course privately, sell it to a broader audience, or integrate it into your existing LMS, Coursebox makes it effortless. Designed with a mobile-first approach, Coursebox ensures that your learners stay engaged and motivated through rich, interactive content—complete with videos, quizzes, and other dynamic elements. Leverage our branded learning management system, featuring native mobile apps, to deliver a seamless learning experience. With options for custom hosting and domain personalization, Coursebox offers flexibility to meet your specific needs. Ideal for both organizations and individual educators, Coursebox simplifies the management and segmentation of learners, allowing you to craft personalized learning paths and scale your training programs quickly and efficiently.
  • 4
    Netdata Reviews
    Monitor your servers, containers, and applications, in high-resolution and in real-time. Netdata collects metrics per second and presents them in beautiful low-latency dashboards. It is designed to run on all of your physical and virtual servers, cloud deployments, Kubernetes clusters, and edge/IoT devices, to monitor your systems, containers, and applications. It scales nicely from just a single server to thousands of servers, even in complex multi/mixed/hybrid cloud environments, and given enough disk space it can keep your metrics for years. KEY FEATURES: Collects metrics from 800+ integrations; Real-Time, Low-Latency, High-Resolution; Unsupervised Anomaly Detection; Powerful Visualization; Out-of-the-Box Alerts; systemd Journal Logs Explorer; Low Maintenance; Open and Extensible. Troubleshoot slowdowns and anomalies in your infrastructure with thousands of per-second metrics, meaningful visualisations, and insightful health alarms with zero configuration. Netdata is different. Real-Time data collection and visualization. Infinite scalability baked into its design. Flexible and extremely modular. Immediately available for troubleshooting, requiring zero prior knowledge and preparation.
  • 5
    Sifflet Reviews
    Effortlessly monitor thousands of tables through machine learning-driven anomaly detection alongside a suite of over 50 tailored metrics. Ensure comprehensive oversight of both data and metadata while meticulously mapping all asset dependencies from ingestion to business intelligence. This solution enhances productivity and fosters collaboration between data engineers and consumers. Sifflet integrates smoothly with your existing data sources and tools, functioning on platforms like AWS, Google Cloud Platform, and Microsoft Azure. Maintain vigilance over your data's health and promptly notify your team when quality standards are not satisfied. With just a few clicks, you can establish essential coverage for all your tables. Additionally, you can customize the frequency of checks, their importance, and specific notifications simultaneously. Utilize machine learning-driven protocols to identify any data anomalies with no initial setup required. Every rule is supported by a unique model that adapts based on historical data and user input. You can also enhance automated processes by utilizing a library of over 50 templates applicable to any asset, thereby streamlining your monitoring efforts even further. This approach not only simplifies data management but also empowers teams to respond proactively to potential issues.
  • 6
    Microsoft Purview Reviews
    Microsoft Purview serves as a comprehensive data governance platform that facilitates the management and oversight of your data across on-premises, multicloud, and software-as-a-service (SaaS) environments. With its capabilities in automated data discovery, sensitive data classification, and complete data lineage tracking, you can effortlessly develop a thorough and current representation of your data ecosystem. This empowers data users to access reliable and valuable data easily. The service provides automated identification of data lineage and classification across various sources, ensuring a cohesive view of your data assets and their interconnections for enhanced governance. Through semantic search, users can discover data using both business and technical terminology, providing insights into the location and flow of sensitive information within a hybrid data environment. By leveraging the Purview Data Map, you can lay the groundwork for effective data utilization and governance, while also automating and managing metadata from diverse sources. Additionally, it supports the classification of data using both predefined and custom classifiers, along with Microsoft Information Protection sensitivity labels, ensuring that your data governance framework is robust and adaptable. This combination of features positions Microsoft Purview as an essential tool for organizations seeking to optimize their data management strategies.
  • 7
    Ray Reviews

    Ray

    Anyscale

    Free
    You can develop on your laptop, then scale the same Python code elastically across hundreds of GPUs on any cloud. Ray translates existing Python concepts to the distributed setting, so any serial application can be parallelized with few code changes. With a strong ecosystem of distributed libraries, you can scale compute-heavy machine learning workloads such as model serving, deep learning, and hyperparameter tuning. Existing workloads (e.g., PyTorch) are easy to scale on Ray by using its integrations. Native Ray libraries such as Ray Tune and Ray Serve make it easier to scale the most complex machine learning workloads, like hyperparameter tuning, deep learning model training, and reinforcement learning. In just 10 lines of code, you can get started with distributed hyperparameter tuning. Creating distributed apps is hard. Ray specializes in distributed execution.
  • 8
    Dagster Reviews

    Dagster

    Dagster Labs

    $0
    Dagster is the cloud-native open-source orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. It is the platform of choice for data teams responsible for the development, production, and observation of data assets. With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early.
  • 9
    Oxla Reviews

    Oxla

    Oxla

    $50 per CPU core / monthly
    Designed specifically for optimizing compute, memory, and storage, Oxla serves as a self-hosted data warehouse that excels in handling large-scale, low-latency analytics while providing strong support for time-series data. While cloud data warehouses may suit many, they are not universally applicable; as operations expand, the ongoing costs of cloud computing can surpass initial savings on infrastructure, particularly in regulated sectors that demand comprehensive data control beyond mere VPC and BYOC setups. Oxla surpasses both traditional and cloud-based warehouses by maximizing efficiency, allowing for the scalability of expanding datasets with predictable expenses, whether on-premises or in various cloud environments. Deployment, execution, and maintenance of Oxla can be easily managed using Docker and YAML, enabling a range of workloads to thrive within a singular, self-hosted data warehouse. In this way, Oxla provides a tailored solution for organizations seeking both efficiency and control in their data management strategies.
  • 10
    intermix.io Reviews

    intermix.io

    Intermix.io

    $295 per month
    Gather metadata from your data warehouse along with associated tools to monitor the workloads that are important to you, enabling a retrospective analysis of user interaction, expenses, and the efficiency of your data products. Achieve comprehensive insight into your data ecosystem, including who interacts with your data and the methods of usage. In our discussions, we highlight how various data teams successfully develop and implement data products within their organizations. We delve into technological frameworks, best practices, and valuable insights gained along the way. intermix.io offers a seamless solution for obtaining complete visibility through an intuitive SaaS dashboard. You can engage with your entire team, generate tailored reports, and access all necessary information to grasp the dynamics of your data platform, including your cloud data warehouse and its connected tools. intermix.io simplifies the process of collecting metadata from your data warehouse without requiring any coding skills. Importantly, we do not need to access any data stored within your data warehouse, ensuring your information remains secure while you focus on maximizing its potential. This approach not only enhances data governance but also empowers teams to make informed decisions based on accurate and timely data insights.
  • 11
    IRI FieldShield Reviews

    IRI FieldShield

    IRI, The CoSort Company

    IRI FieldShield® is a powerful and affordable data discovery and de-identification package for masking PII, PHI, PAN and other sensitive data in structured and semi-structured sources. Front-ended in a free Eclipse-based design environment, FieldShield jobs classify, profile, scan, and de-identify data at rest (static masking). Use the FieldShield SDK or proxy-based application to secure data in motion (dynamic data masking). The usual method for masking data in RDBs and flat files (CSV, Excel, LDIF, COBOL, etc.) is to classify it centrally, search for it globally, and automatically mask it in a consistent way using encryption, pseudonymization, redaction or other functions to preserve realism and referential integrity in production or test environments. Use FieldShield to make test data, nullify breaches, or comply with GDPR, HIPAA, PCI-DSS, PDPA, and other laws. Audit through machine- and human-readable search reports, job logs and re-ID risk scores. Optionally mask data when you map it; FieldShield functions can also run in IRI Voracity ETL and federation, migration, replication, subsetting, and analytic jobs. To mask DB clones, run FieldShield in Windocks, Actifio or Commvault. Call it from CI/CD pipelines and apps.
  • 12
    Prophecy Reviews

    Prophecy

    Prophecy

    $299 per month
    Prophecy expands accessibility for a wider range of users, including visual ETL developers and data analysts, by allowing them to easily create pipelines through a user-friendly point-and-click interface combined with a few SQL expressions. While utilizing the Low-Code designer to construct workflows, you simultaneously generate high-quality, easily readable code for Spark and Airflow, which is then seamlessly integrated into your Git repository. The platform comes equipped with a gem builder, enabling rapid development and deployment of custom frameworks, such as those for data quality, encryption, and additional sources and targets that enhance the existing capabilities. Furthermore, Prophecy ensures that best practices and essential infrastructure are offered as managed services, simplifying your daily operations and overall experience. With Prophecy, you can achieve high-performance workflows that leverage the cloud's scalability and performance capabilities, ensuring that your projects run efficiently and effectively. This powerful combination of features makes it an invaluable tool for modern data workflows.
  • 13
    BentoML Reviews
    Deploy your machine learning model in the cloud within minutes using a consolidated packaging format that supports both online and offline operations across various platforms. Experience a performance boost with throughput that is 100 times greater than traditional Flask-based model servers, achieved through our innovative micro-batching technique. Provide exceptional prediction services that align seamlessly with DevOps practices and integrate effortlessly with widely-used infrastructure tools. The unified deployment format ensures high-performance model serving while incorporating best practices for DevOps. An example service uses a BERT model trained with the TensorFlow framework to gauge the sentiment of movie reviews. Our BentoML workflow eliminates the need for DevOps expertise, automating everything from prediction service registration to deployment and endpoint monitoring, all set up effortlessly for your team. This creates a robust environment for managing substantial ML workloads in production. Ensure that all models, deployments, and updates are easily accessible and maintain control over access through SSO, RBAC, client authentication, and detailed auditing logs, thereby enhancing both security and transparency within your operations. With these features, your machine learning deployment process becomes more efficient and manageable than ever before.
  • 14
    Ascend Reviews

    Ascend

    Ascend

    $0.98 per DFC
    Ascend provides data teams with a streamlined and automated platform that allows them to ingest, transform, and orchestrate their entire data engineering and analytics workloads at an unprecedented speed, achieving results ten times faster than before. This tool empowers teams that are often hindered by bottlenecks to effectively build, manage, and enhance the ever-growing volume of data workloads they face. With the support of DataAware intelligence, Ascend operates continuously in the background to ensure data integrity and optimize data workloads, significantly cutting down maintenance time by as much as 90%. Users can effortlessly create, refine, and execute data transformations through Ascend’s versatile flex-code interface, which supports the use of multiple programming languages such as SQL, Python, Java, and Scala interchangeably. Additionally, users can quickly access critical metrics including data lineage, data profiles, job and user logs, and system health indicators all in one view. Ascend also offers native connections to a continually expanding array of common data sources through its Flex-Code data connectors, ensuring seamless integration. This comprehensive approach not only enhances efficiency but also fosters stronger collaboration among data teams.
  • 15
    DQOps Reviews

    DQOps

    DQOps

    $499 per month
    DQOps is a data quality monitoring platform for data teams that helps detect and address quality issues before they impact your business. Track data quality KPIs on data quality dashboards and reach a 100% data quality score. DQOps helps monitor data warehouses and data lakes on the most popular data platforms. DQOps offers a built-in list of predefined data quality checks verifying key data quality dimensions. The extensibility of the platform allows you to modify existing checks or add custom, business-specific checks as needed. The DQOps platform easily integrates with DevOps environments and allows data quality definitions to be stored in a source repository along with the data pipeline code.
  • 16
    Decube Reviews
    Decube is a comprehensive data management platform designed to help organizations manage their data observability, data catalog, and data governance needs. Our platform is designed to provide accurate, reliable, and timely data, enabling organizations to make better-informed decisions. Our data observability tools provide end-to-end visibility into data, making it easier for organizations to track data origin and flow across different systems and departments. With our real-time monitoring capabilities, organizations can detect data incidents quickly and reduce their impact on business operations. The data catalog component of our platform provides a centralized repository for all data assets, making it easier for organizations to manage and govern data usage and access. With our data classification tools, organizations can identify and manage sensitive data more effectively, ensuring compliance with data privacy regulations and policies. The data governance component of our platform provides robust access controls, enabling organizations to manage data access and usage effectively. Our tools also allow organizations to generate audit reports, track user activity, and demonstrate compliance with regulatory requirements.
  • 17
    ZenML Reviews
    Simplify your MLOps pipelines. ZenML allows you to manage, deploy, and scale on any infrastructure. ZenML is open-source and free. Two simple commands will show you the magic. ZenML can be set up in minutes and you can use all your existing tools. ZenML interfaces ensure your tools work seamlessly together. Scale up your MLOps stack gradually by changing components when your training or deployment needs change. Keep up to date with the latest developments in the MLOps industry and integrate them easily. Define simple, clear ML workflows and save time by avoiding boilerplate code or infrastructure tooling. Write portable ML code and switch from experiments to production in seconds. ZenML's plug-and-play integrations allow you to manage all your favorite MLOps software in one place. Prevent vendor lock-in by writing extensible, tooling-agnostic, and infrastructure-agnostic code.
  • 18
    Kedro Reviews
    Kedro serves as a robust framework for establishing clean data science practices. By integrating principles from software engineering, it enhances the efficiency of machine-learning initiatives. Within a Kedro project, you will find a structured approach to managing intricate data workflows and machine-learning pipelines. This allows you to minimize the time spent on cumbersome implementation tasks and concentrate on addressing innovative challenges. Kedro also standardizes the creation of data science code, fostering effective collaboration among team members in problem-solving endeavors. Transitioning smoothly from development to production becomes effortless with exploratory code that can evolve into reproducible, maintainable, and modular experiments. Additionally, Kedro features a set of lightweight data connectors designed to facilitate the saving and loading of data across various file formats and storage systems, making data management more versatile and user-friendly. Ultimately, this framework empowers data scientists to work more effectively and with greater confidence in their projects.
  • 19
    Secoda Reviews

    Secoda

    Secoda

    $50 per user per month
    With Secoda AI enhancing your metadata, you can effortlessly obtain contextual search results spanning your tables, columns, dashboards, metrics, and queries. This innovative tool also assists in generating documentation and queries from your metadata, which can save your team countless hours that would otherwise be spent on tedious tasks and repetitive data requests. You can easily conduct searches across all columns, tables, dashboards, events, and metrics with just a few clicks. The AI-driven search functionality allows you to pose any question regarding your data and receive quick, relevant answers. By integrating data discovery seamlessly into your workflow through our API, you can perform bulk updates, label PII data, manage technical debt, create custom integrations, pinpoint underutilized resources, and much more. By eliminating manual errors, you can establish complete confidence in your knowledge repository, ensuring that your team has the most accurate and reliable information at their fingertips. This transformative approach not only enhances productivity but also fosters a more informed decision-making process throughout your organization.
  • 20
    Yandex Data Proc Reviews

    Yandex Data Proc

    Yandex

    $0.19 per hour
    You determine the cluster size, node specifications, and a range of services, while Yandex Data Proc effortlessly sets up and configures Spark, Hadoop clusters, and additional components. Collaboration is enhanced through the use of Zeppelin notebooks and various web applications via a user interface proxy. You maintain complete control over your cluster with root access for every virtual machine. Moreover, you can install your own software and libraries on active clusters without needing to restart them. Yandex Data Proc employs instance groups to automatically adjust computing resources of compute subclusters in response to CPU usage metrics. Additionally, Data Proc facilitates the creation of managed Hive clusters, which helps minimize the risk of failures and data loss due to metadata issues. This service streamlines the process of constructing ETL pipelines and developing models, as well as managing other iterative operations. Furthermore, the Data Proc operator is natively integrated into Apache Airflow, allowing for seamless orchestration of data workflows. This means that users can leverage the full potential of their data processing capabilities with minimal overhead and maximum efficiency.
  • 21
    DoubleCloud Reviews

    DoubleCloud

    DoubleCloud

    $0.024 per 1 GB per month
    Optimize your time and reduce expenses by simplifying data pipelines using hassle-free open source solutions. Covering everything from data ingestion to visualization, all components are seamlessly integrated, fully managed, and exceptionally reliable, ensuring your engineering team enjoys working with data. You can opt for any of DoubleCloud’s managed open source services or take advantage of the entire platform's capabilities, which include data storage, orchestration, ELT, and instantaneous visualization. We offer premier open source services such as ClickHouse, Kafka, and Airflow, deployable on platforms like Amazon Web Services or Google Cloud. Our no-code ELT tool enables real-time data synchronization between various systems, providing a fast, serverless solution that integrates effortlessly with your existing setup. With our managed open-source data visualization tools, you can easily create real-time visual representations of your data through interactive charts and dashboards. Ultimately, our platform is crafted to enhance the daily operations of engineers, making their tasks more efficient and enjoyable. This focus on convenience is what sets us apart in the industry.
  • 22
    Tobiko Reviews
    Tobiko is an advanced data transformation platform designed to accelerate data delivery while enhancing efficiency and minimizing errors, all while maintaining compatibility with existing databases. It enables developers to create a development environment without the need to rebuild the entire Directed Acyclic Graph (DAG), as it smartly alters only the necessary components. When a new column is added, there's no requirement to reconstruct everything; the modifications you've made are already in place. Tobiko allows for instant promotion to production without requiring you to redo any of your previous work. It eliminates the hassle of debugging complex Jinja templates by allowing you to define your models directly in SQL. Whether at a startup or a large enterprise, Tobiko scales to meet the needs of any organization. It comprehends the SQL you create and enhances developer efficiency by identifying potential issues during the compilation process. Additionally, comprehensive audits and data comparisons offer validation, ensuring the reliability of the datasets produced. Each modification is carefully analyzed and categorized as either breaking or non-breaking, providing clarity on the impact of changes. In the event of errors, teams can conveniently roll back to previous versions, effectively minimizing production downtime and maintaining operational continuity. This seamless integration of features makes Tobiko not only a tool for data transformation but also a partner in fostering a more productive development environment.
  • 23
    Stackable Reviews
    The Stackable data platform was crafted with a focus on flexibility and openness. It offers a carefully selected range of top-notch open source data applications, including Apache Kafka, Apache Druid, Trino, and Apache Spark. Unlike many competitors that either promote their proprietary solutions or enhance vendor dependence, Stackable embraces a more innovative strategy. All data applications are designed to integrate effortlessly and can be added or removed with remarkable speed. Built on Kubernetes, it is capable of operating in any environment, whether on-premises or in the cloud. To initiate your first Stackable data platform, all you require is stackablectl along with a Kubernetes cluster. In just a few minutes, you will be poised to begin working with your data. Much like kubectl, stackablectl is tailored for seamless interaction with the Stackable Data Platform. Utilize this command line tool for deploying and managing Stackable data applications on Kubernetes. With stackablectl, you have the ability to create, delete, and update components efficiently, ensuring a smooth operational experience for your data management needs. The versatility and ease of use make it an excellent choice for developers and data engineers alike.
  • 24
    emma Reviews

    emma

    emma

    $99 per month
    Emma gives you the ability to select the most suitable cloud providers and environments, allowing for adaptation to evolving demands while maintaining simplicity and control. It streamlines cloud management by integrating services and automating essential tasks, thereby minimizing complexity. The platform also enhances cloud resource optimization automatically, guaranteeing full utilization and lowering overhead costs. By supporting open standards, it offers flexibility that liberates businesses from dependency on specific vendors. With real-time monitoring and optimization of data traffic, it effectively prevents unexpected cost spikes through efficient resource allocation. You can establish your cloud infrastructure across various providers and environments, whether on-premises, private, hybrid, or public. Management of your consolidated cloud environment is made easy through a single, user-friendly interface. Additionally, you can gain crucial visibility to enhance infrastructure performance and reduce expenditures. By reclaiming control over your entire cloud ecosystem, you can also ensure compliance with regulatory standards while fostering innovation and growth. This comprehensive approach empowers businesses to stay competitive in an ever-changing digital landscape.
  • 25
    DataHub Reviews
    DataHub is a versatile open-source metadata platform crafted to enhance data discovery, observability, and governance within various data environments. It empowers organizations to easily find reliable data, providing customized experiences for users while avoiding disruptions through precise lineage tracking at both the cross-platform and column levels. By offering a holistic view of business, operational, and technical contexts, DataHub instills trust in your data repository. The platform features automated data quality assessments along with AI-driven anomaly detection, alerting teams to emerging issues and consolidating incident management. With comprehensive lineage information, documentation, and ownership details, DataHub streamlines the resolution of problems. Furthermore, it automates governance processes by classifying evolving assets, significantly reducing manual effort with GenAI documentation, AI-based classification, and intelligent propagation mechanisms. Additionally, DataHub's flexible architecture accommodates more than 70 native integrations, making it a robust choice for organizations seeking to optimize their data ecosystems. This makes it an invaluable tool for any organization looking to enhance their data management capabilities.