Business Software for Hadoop

  • 1
    Apache Atlas Reviews

    Apache Atlas

    Apache Software Foundation

    Atlas serves as a versatile and scalable suite of essential governance services, empowering organizations to efficiently comply with regulations within the Hadoop ecosystem while facilitating integration across the enterprise's data landscape. Apache Atlas offers comprehensive metadata management and governance tools that assist businesses in creating a detailed catalog of their data assets, effectively classifying and managing these assets, and fostering collaboration among data scientists, analysts, and governance teams. It comes equipped with pre-defined types for a variety of both Hadoop and non-Hadoop metadata, alongside the capability to establish new metadata types tailored to specific needs. These types can incorporate primitive attributes, complex attributes, and object references, and they can also inherit characteristics from other types. Entities, which are instances of these types, encapsulate the specifics of metadata objects and their interconnections. Additionally, REST APIs enable seamless interaction with types and instances, promoting easier integration and enhancing overall functionality. This robust framework not only streamlines governance processes but also supports a culture of data-driven collaboration across the organization.
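    The relationship between types, inheritance, and entities described above can be sketched in a few lines of Python. This is only an illustrative model and not the Atlas API: the `TypeDef` and `Entity` classes, the attribute-kind strings, and the sample `Asset`, `Database`, and `Table` types are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TypeDef:
    """Hypothetical metadata type: named attributes plus optional supertypes."""
    name: str
    attributes: Dict[str, str]  # attribute name -> kind (primitive, complex, or reference)
    super_types: List["TypeDef"] = field(default_factory=list)

    def all_attributes(self) -> Dict[str, str]:
        # Inherited attributes are merged first so a subtype can override them.
        merged: Dict[str, str] = {}
        for parent in self.super_types:
            merged.update(parent.all_attributes())
        merged.update(self.attributes)
        return merged


@dataclass
class Entity:
    """Hypothetical instance of a TypeDef holding concrete attribute values."""
    type_def: TypeDef
    values: Dict[str, object]

    def __post_init__(self) -> None:
        # Reject values for attributes the type (or its supertypes) never declared.
        unknown = set(self.values) - set(self.type_def.all_attributes())
        if unknown:
            raise ValueError(f"unknown attributes: {sorted(unknown)}")


# Primitive attributes on a base type, inherited by two subtypes.
asset = TypeDef("Asset", {"name": "string", "owner": "string"})
database = TypeDef("Database", {"location": "string"}, [asset])
# "db" is an object reference, "columns" a complex (array) attribute.
table = TypeDef("Table", {"db": "ref:Database", "columns": "array<string>"}, [asset])

sales_db = Entity(database, {"name": "sales", "owner": "etl", "location": "/warehouse/sales"})
orders = Entity(table, {"name": "orders", "owner": "etl", "db": sales_db, "columns": ["id", "total"]})
```

    Here `orders` references `sales_db` directly, which is one simple way to represent the interconnections between metadata objects. A real catalog would more likely store references by identifier.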
  • 2
    Microsoft Power Query Reviews
    Power Query provides a user-friendly solution for connecting, extracting, transforming, and loading data from a variety of sources. Acting as a robust engine for data preparation and transformation, Power Query features a graphical interface that simplifies the data retrieval process and includes a Power Query Editor for implementing necessary changes. The versatility of the engine allows it to be integrated across numerous products and services, meaning the storage location of the data is determined by the specific application of Power Query. This tool enables users to efficiently carry out the extract, transform, and load (ETL) processes for their data needs. With Microsoft’s Data Connectivity and Data Preparation technology, users can easily access and manipulate data from hundreds of sources in a straightforward, no-code environment. Power Query is equipped with support for a multitude of data sources through built-in connectors, generic interfaces like REST APIs, ODBC, OLE DB, and OData, and even offers a Power Query SDK for creating custom connectors tailored to individual requirements. This flexibility makes Power Query an indispensable asset for data professionals seeking to streamline their workflows.
  • 3
    SAS Data Loader for Hadoop Reviews
    Effortlessly load your data into or extract it from Hadoop and data lakes, ensuring it is primed for generating reports, visualizations, or conducting advanced analytics—all within the data lakes environment. This streamlined approach allows you to manage, transform, and access data stored in Hadoop or data lakes through a user-friendly web interface, minimizing the need for extensive training. Designed specifically for big data management on Hadoop and data lakes, this solution is not simply a rehash of existing IT tools. It allows for the grouping of multiple directives to execute either concurrently or sequentially, enhancing workflow efficiency. Additionally, you can schedule and automate these directives via the public API provided. The platform also promotes collaboration and security by enabling the sharing of directives. Furthermore, these directives can be invoked from SAS Data Integration Studio, bridging the gap between technical and non-technical users. It comes equipped with built-in directives for various tasks, including casing, gender and pattern analysis, field extraction, match-merge, and cluster-survive operations. For improved performance, profiling processes are executed in parallel on the Hadoop cluster, allowing for the seamless handling of large datasets. This comprehensive solution transforms the way you interact with data, making it more accessible and manageable than ever.
  • 4
    SAS MDM Reviews
    Combine master data management solutions with those found in SAS 9.4, where SAS MDM operates as a web-based interface accessible via the SAS Data Management Console. This system delivers a cohesive and precise representation of organizational data by consolidating information from multiple sources into a singular master record. Additionally, SAS® Data Remediation and SAS® Task Manager synergistically enhance SAS MDM's capabilities, as well as those of other SAS products, including SAS® Data Management and SAS® Data Quality. Through SAS Data Remediation, users can address and rectify issues arising from business rules in both batch jobs and real-time processes within SAS MDM. Meanwhile, SAS Task Manager serves as a supportive tool that integrates seamlessly with SAS Workflow technologies, allowing users to manage workflows initiated by other SAS applications with ease. By enabling the initiation, cessation, and transition of workflows uploaded to the SAS Workflow server, this ecosystem empowers organizations to maintain efficient data management practices. Overall, the integration of these technologies creates a robust framework for handling master data effectively.
  • 5
    Apache Knox Reviews

    Apache Knox

    Apache Software Foundation

    The Knox API Gateway functions as a reverse proxy, prioritizing flexibility in policy enforcement and backend service management for the requests it handles. It encompasses various aspects of policy enforcement, including authentication, federation, authorization, auditing, dispatch, host mapping, and content rewriting rules. A chain of providers, specified in the topology deployment descriptor associated with each Apache Hadoop cluster secured by Knox, facilitates this policy enforcement. Additionally, the cluster definition within the descriptor helps the Knox Gateway understand the structure of the cluster, enabling effective routing and translation from user-facing URLs to the internal workings of the cluster. Each secured Apache Hadoop cluster is equipped with its own REST APIs, consolidated under a unique application context path. Consequently, the Knox Gateway can safeguard numerous clusters while offering REST API consumers a unified endpoint for seamless access. This design enhances both security and usability by simplifying interactions with multiple backend services.
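    The routing idea above, where one gateway endpoint fronts several clusters and each cluster's services sit under their own context path, can be sketched as a small lookup-and-rewrite function. This is a minimal sketch and not Knox code: the `TOPOLOGIES` table, the host names, the `rewrite` function, and the `/gateway/{cluster}/{service}/...` path layout are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical cluster definitions: cluster name -> service -> internal base URL.
TOPOLOGIES = {
    "sandbox": {"webhdfs": "http://namenode.internal:50070/webhdfs"},
    "prod": {"webhdfs": "http://nn1.prod.internal:50070/webhdfs"},
}


def rewrite(gateway_url: str, gateway_path: str = "gateway") -> str:
    """Translate a user-facing gateway URL into the internal service URL."""
    parts = urlsplit(gateway_url)
    segments = parts.path.strip("/").split("/")
    # Expect at least /<gateway_path>/<cluster>/<service>.
    if len(segments) < 3 or segments[0] != gateway_path:
        raise ValueError("not a gateway URL")
    cluster, service, rest = segments[1], segments[2], segments[3:]
    try:
        base = TOPOLOGIES[cluster][service]
    except KeyError:
        raise LookupError(f"no route for {cluster}/{service}") from None
    target = "/".join([base, *rest])
    # Carry the query string through unchanged.
    return f"{target}?{parts.query}" if parts.query else target


internal = rewrite("https://gw.example.com:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS")
```

    A real gateway would run authentication, authorization, and auditing providers before dispatching the request. The sketch covers only the URL translation step.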
  • 6
    The Respond Analyst Reviews
    Enhance investigative processes and boost analyst efficiency with an advanced XDR Cybersecurity Solution. The Respond Analyst™, powered by an XDR Engine, streamlines the identification of security threats by transforming resource-heavy monitoring and initial assessments into detailed and uniform investigations. In contrast to other XDR solutions, the Respond Analyst employs probabilistic mathematics and integrated reasoning to connect various pieces of evidence, effectively evaluating the likelihood of malicious and actionable events. By doing so, it significantly alleviates the workload on security operations teams, allowing them to spend more time on proactive threat hunting rather than chasing down false positives. Furthermore, the Respond Analyst enables users to select top-tier controls to enhance their sensor infrastructure. It also seamlessly integrates with leading security vendor solutions across key areas like EDR, IPS, web filtering, EPP, vulnerability scanning, authentication, and various other categories, ensuring a comprehensive defense strategy. With such capabilities, organizations can expect not only improved response times but also a more robust security posture overall.
  • 7
    Gurucul Reviews
    Our security controls, driven by data science, facilitate the automation of advanced threat detection, remediation, and response. Gurucul’s Unified Security and Risk Analytics platform addresses the crucial question: Is anomalous behavior truly a risk? This unique capability sets us apart in the industry. We prioritize your time by avoiding alerts related to non-risky anomalous activities. By leveraging context, we can accurately assess whether certain behaviors pose a risk, as understanding the context is essential. Merely reporting what is occurring lacks value; instead, we emphasize notifying you when a genuine threat arises, which exemplifies the Gurucul advantage. This actionable information empowers your decision-making. Our platform effectively harnesses your data, positioning us as the only security analytics provider capable of seamlessly integrating all your data from the outset. Our enterprise risk engine can absorb data from various sources, including SIEMs, CRMs, electronic medical records, identity and access management systems, and endpoints, ensuring comprehensive threat analysis. We’re committed to maximizing the potential of your data to enhance security.
  • 8
    OpenText Data Privacy & Protection Foundation Reviews
    OpenText Data Privacy & Protection Foundation (Voltage) enables organizations to secure sensitive information with a modern, quantum-resilient approach that supports both operational continuity and regulatory compliance. Instead of relying on traditional encryption that breaks workflows, it uses NIST-approved, format-preserving methods that preserve data usability while protecting high-value fields. The platform provides persistent protection, securing data no matter where it lives or how it moves—across cloud infrastructures, analytics pipelines, and distributed applications. With stateless key management, performance stays high even at massive volumes, making it ideal for enterprise-scale deployments. Global organizations trust OpenText because its technologies meet stringent certifications, including FIPS 140-2, Common Criteria, and NIST SP 800-38G. Deep integrations across AWS, Azure, Google Cloud, Snowflake, Hadoop, Databricks, and more ensure seamless adoption without architectural overhaul. This enables businesses to modernize, migrate, or analyze data safely without exposing sensitive information. Ultimately, the platform helps reduce compliance risk, streamline governance, and future-proof data protection strategies.
  • 9
    Mage Static Data Masking Reviews
    Mage™ offers comprehensive Static Data Masking (SDM) and Test Data Management (TDM) functionalities that are fully compatible with Imperva’s Data Security Fabric (DSF), ensuring robust safeguarding of sensitive or regulated information. This integration occurs smoothly within an organization’s current IT infrastructure and aligns with existing application development, testing, and data processes, all without necessitating any alterations to the existing architectural setup. As a result, organizations can enhance their data security while maintaining operational efficiency.
  • 10
    Mage Dynamic Data Masking Reviews
    The Mage™ Dynamic Data Masking module, part of the Mage data security platform, has been thoughtfully crafted with a focus on the needs of end customers. Developed in collaboration with clients, Mage™ Dynamic Data Masking effectively addresses their unique requirements and challenges. Consequently, this solution has advanced to accommodate virtually every potential use case that enterprises might encounter. Unlike many competing products that often stem from acquisitions or cater to niche scenarios, Mage™ Dynamic Data Masking is designed to provide comprehensive protection for sensitive data accessed by application and database users in production environments. Additionally, it integrates effortlessly into an organization’s existing IT infrastructure, eliminating the need for any substantial architectural modifications, thus ensuring a smoother transition for businesses implementing this solution. This strategic approach reflects a commitment to enhancing data security while prioritizing user experience and operational efficiency.
  • 11
    Acxiom Real Identity Reviews
    Real Identity™ provides rapid, sub-second decision-making capabilities that facilitate timely and relevant messaging. This innovative platform empowers leading global brands to accurately identify individuals and ethically engage with them at any location and time, thereby fostering significant experiences. With the ability to connect with audiences at scale and with precision, brands can enhance every customer interaction. Additionally, Real Identity allows companies to effectively manage their identity systems by utilizing five decades of expertise in data and identity, coupled with cutting-edge artificial intelligence and machine learning methodologies. In the fast-paced adtech sector, swift access to identity and data is essential for enabling personalization and informed decision-making. As the landscape evolves away from cookies, first-party data signals will become crucial for driving these initiatives, ensuring that communication remains vibrant between individuals, brands, and publishers. By crafting impactful experiences across all channels, businesses can not only impress current customers and prospects but also maintain compliance with regulations and outpace their competitors. Ultimately, Real Identity™ positions brands to thrive in a dynamic environment while enhancing their customer engagement strategies.
  • 12
    Okera Reviews
    Complexity is the enemy of security. Simplify and scale fine-grained data access control. Dynamically authorize and audit every query to comply with data security and privacy regulations. Okera integrates seamlessly into your infrastructure – in the cloud, on premise, and with cloud-native and legacy tools. With Okera, data users can use data responsibly, while protecting them from inappropriately accessing data that is confidential, personally identifiable, or regulated. Okera’s robust audit capabilities and data usage intelligence deliver the real-time and historical information that data security, compliance, and data delivery teams need to respond quickly to incidents, optimize processes, and analyze the performance of enterprise data initiatives.
  • 13
    Apache Sentry Reviews

    Apache Sentry

    Apache Software Foundation

    Apache Sentry™ serves as a robust system for implementing detailed role-based authorization for both data and metadata within a Hadoop cluster environment. Achieving Top-Level Apache project status after graduating from the Incubator in March 2016, Apache Sentry is recognized for its effectiveness in managing granular authorization. It empowers users and applications to have precise control over access privileges to data stored in Hadoop, ensuring that only authenticated entities can interact with sensitive information. Compatibility extends to a range of frameworks, including Apache Hive, Hive Metastore/HCatalog, Apache Solr, Impala, and HDFS, though its primary focus is on Hive table data. Designed as a flexible and pluggable authorization engine, Sentry allows for the creation of tailored authorization rules that assess and validate access requests for various Hadoop resources. Its modular architecture increases its adaptability, making it capable of supporting a diverse array of data models within the Hadoop ecosystem. This flexibility positions Sentry as a vital tool for organizations aiming to manage their data security effectively.
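    As a rough illustration of role-based authorization, the sketch below resolves a user to groups, groups to roles, and roles to privileges on a resource. It does not use Sentry's API or policy syntax. The mapping tables, the resource strings, and the `is_allowed` function are hypothetical.

```python
from typing import Dict, Set, Tuple

Privilege = Tuple[str, str]  # (resource, action)

# Hypothetical policy data: users -> groups -> roles -> privileges.
USER_GROUPS: Dict[str, Set[str]] = {"alice": {"analysts"}, "bob": {"etl"}}
GROUP_ROLES: Dict[str, Set[str]] = {
    "analysts": {"read_sales"},
    "etl": {"read_sales", "write_sales"},
}
ROLE_PRIVILEGES: Dict[str, Set[Privilege]] = {
    "read_sales": {("db=sales/table=orders", "select")},
    "write_sales": {("db=sales/table=orders", "insert")},
}


def is_allowed(user: str, resource: str, action: str) -> bool:
    """Return True if any role reachable from the user's groups grants the privilege."""
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            if (resource, action) in ROLE_PRIVILEGES.get(role, set()):
                return True
    # Default deny: unknown users, groups, or roles grant nothing.
    return False
```

    With this data, `is_allowed("alice", "db=sales/table=orders", "select")` is granted while the same call with `"insert"` is denied, because only the `etl` group carries the `write_sales` role.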
  • 14
    Apache Bigtop Reviews

    Apache Bigtop

    Apache Software Foundation

    Bigtop is a project under the Apache Software Foundation designed for Infrastructure Engineers and Data Scientists who need a thorough solution for packaging, testing, and configuring leading open source big data technologies. It encompasses a variety of components and projects, such as Hadoop, HBase, and Spark, among others. By packaging Hadoop RPMs and DEBs, Bigtop simplifies the management and maintenance of Hadoop clusters. Additionally, it offers an integrated smoke testing framework, complete with a collection of over 50 test files to ensure reliability. For those looking to deploy Hadoop from scratch, Bigtop provides Vagrant recipes, raw images, and in-progress Docker recipes. The framework is compatible with numerous operating systems, including Debian, Ubuntu, CentOS, Fedora, and openSUSE, among others. Moreover, Bigtop incorporates a comprehensive set of tools and a testing framework that evaluates various aspects, such as packaging, platform, and runtime, which are essential for both new deployments and upgrades of the entire data platform, rather than just isolated components. This makes Bigtop a vital resource for anyone aiming to streamline their big data infrastructure.
  • 15
    Secuvy AI Reviews
    Secuvy is a next-generation cloud platform that automates data security, privacy compliance, and governance through AI-driven workflows, and applies data intelligence to unstructured data. It offers automated data discovery, customizable subject access requests, user validations, and data maps and workflows to help comply with privacy regulations such as the CCPA or GDPR. Data intelligence is used to locate sensitive and private information in multiple data stores, both in motion and at rest. Our mission is to assist organizations in protecting their brand, automating processes, and improving customer trust in a world that is rapidly changing. We want to reduce human effort, costs, and errors in handling sensitive data.
  • 16
    iFinder Reviews

    iFinder

    IntraFind Software

    IntraFind's iFinder offers a comprehensive search solution that serves as a hub for all of your organization’s data. This platform seamlessly connects to various data sources within your enterprise. As your data repositories expand, iFinder prepares you for the future: leveraging Elasticsearch technology, it can effortlessly scale to accommodate any data volume. By utilizing artificial intelligence, it enhances search outcomes, providing intelligent enterprise search capabilities. Whether your essential documents and information reside on company drives, intranet pages, wikis, or email systems, iFinder streamlines the process of locating them. Embrace the next phase of your organization's digital evolution by centralizing access to all data through our innovative enterprise search solution. By implementing iFinder, you're not just enhancing search efficiency; you're also optimizing how your team interacts with information.
  • 17
    NVMesh Reviews
    Excelero offers a low-latency distributed block storage solution tailored for web-scale applications. With NVMesh, users can access shared NVMe technology over any network while maintaining compatibility with both local and distributed file systems. The platform includes a sophisticated management layer that abstracts the underlying hardware, supports CPU offload, and facilitates the creation of logical volumes with built-in redundancy, all while providing centralized management and monitoring capabilities. This allows applications to leverage the speed, throughput, and IOPS of local NVMe devices combined with the benefits of centralized storage, all without being tied to proprietary hardware, ultimately lowering the total cost of ownership for storage. Additionally, NVMesh's distributed block layer empowers unmodified applications to tap into pooled NVMe storage resources, achieving performance levels comparable to local access. Moreover, users can dynamically create arbitrary block volumes that can be accessed by any host equipped with the NVMesh block client, enhancing flexibility and scalability in storage deployments. This innovative approach not only optimizes resource utilization but also simplifies management across diverse infrastructures.
  • 18
    lakeFS Reviews
    lakeFS allows you to control your data lake similarly to how you manage your source code, facilitating parallel pipelines for experimentation as well as continuous integration and deployment for your data. This platform streamlines the workflows of engineers, data scientists, and analysts who are driving innovation through data. As an open-source solution, lakeFS enhances the resilience and manageability of object-storage-based data lakes. With lakeFS, you can execute reliable, atomic, and versioned operations on your data lake, encompassing everything from intricate ETL processes to advanced data science and analytics tasks. It is compatible with major cloud storage options, including AWS S3, Azure Blob Storage, and Google Cloud Storage (GCS). Furthermore, lakeFS seamlessly integrates with a variety of modern data frameworks such as Spark, Hive, AWS Athena, and Presto, thanks to its API compatibility with S3. The platform features a Git-like model for branching and committing that can efficiently scale to handle exabytes of data while leveraging the storage capabilities of S3, GCS, or Azure Blob. In addition, lakeFS empowers teams to collaborate more effectively by allowing multiple users to work on the same dataset without conflicts, making it an invaluable tool for data-driven organizations.
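    To make the Git-like branching and committing model more concrete, here is a small in-memory sketch in Python. It is not how lakeFS is implemented and does not use its API: the `Repo` and `Commit` classes, the method names, and the object addresses are hypothetical. The point it shows is that a branch can be a pointer to an immutable snapshot, so creating one copies no data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """Immutable snapshot mapping logical paths to object-store addresses."""
    snapshot: Dict[str, str]
    parent: Optional["Commit"]
    message: str


class Repo:
    def __init__(self) -> None:
        root = Commit({}, None, "init")
        self.branches: Dict[str, Commit] = {"main": root}
        self.staged: Dict[str, Dict[str, str]] = {"main": {}}

    def branch(self, name: str, source: str = "main") -> None:
        # A new branch only points at the source head; no objects are copied.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def put(self, branch: str, path: str, address: str) -> None:
        # Writes are staged and invisible to readers until committed.
        self.staged[branch][path] = address

    def commit(self, branch: str, message: str) -> Commit:
        head = self.branches[branch]
        # The new snapshot is built in full, then the branch pointer moves in one step.
        new = Commit({**head.snapshot, **self.staged[branch]}, head, message)
        self.branches[branch] = new
        self.staged[branch] = {}
        return new

    def read(self, branch: str, path: str) -> Optional[str]:
        return self.branches[branch].snapshot.get(path)


repo = Repo()
repo.put("main", "events/2024.parquet", "s3://bucket/obj-1")
repo.commit("main", "load events")
repo.branch("experiment")
repo.put("experiment", "events/2024.parquet", "s3://bucket/obj-2")
repo.commit("experiment", "reprocess events")
```

    After these calls, readers of `main` still see `obj-1` while `experiment` sees `obj-2`, which is the isolation that lets parallel pipelines run without interfering with each other.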
  • 19
    Prodea Reviews
    Prodea enables the rapid launch of secure, scalable, and globally compliant connected products and services within a six-month timeframe. As the sole provider of an IoT platform-as-a-service (PaaS) tailored for manufacturers of mass-market consumer home products, Prodea offers three core services: the IoT Service X-Change Platform, which allows for the swift introduction of connected products into diverse global markets with minimal development effort; Insight™ Data Services, which provides critical insights derived from user and product usage analytics; and the EcoAdaptor™ Service, designed to enhance the value of products through seamless cloud-to-cloud integration and interoperability with various other products and services. Prodea has successfully assisted its global brand partners in launching over 100 connected products, averaging less than six months for completion, across six continents. This achievement is largely attributed to the Prodea X5 Program, which integrates with the three primary cloud services to support brands in evolving their systems effectively and efficiently. Additionally, this comprehensive approach ensures that manufacturers can adapt to changing market demands while maximizing their connectivity capabilities.
  • 20
    GO+ Reviews
    GO+ provides development resources tailored for service providers, enabling them to create additional offerings for their business clientele. The platform is designed to handle a large volume of devices simultaneously through advanced algorithms. This allows service providers to focus less on the challenges of developing new services for their customers. At the heart of the platform lies an analytical decision-making engine that utilizes Granular Computing for intricate data processing and analysis with complex event handling. We leverage cloud technology that seamlessly integrates business logic from real devices directly to the cloud environment. This scalability ensures that we can offer cost-effective solutions. Additionally, the platform's scripting engine equips developers with a comprehensive suite of tools to craft highly customized IoT services applicable across various industries. GO+ is constructed on cutting-edge cloud computing technology, ensuring optimal performance and reliability. Ultimately, GO+ empowers service providers to innovate without the typical constraints associated with service development.
  • 21
    Foghub Reviews
    Foghub streamlines the integration of IT and OT, enhancing data engineering and real-time intelligence at the edge. Its user-friendly, cross-platform design employs an open architecture to efficiently manage industrial time-series data. By facilitating the critical link between operational components like sensors, devices, and systems, and business elements such as personnel, processes, and applications, Foghub enables seamless automated data collection and engineering processes, including transformations, advanced analytics, and machine learning. The platform adeptly manages a diverse range of industrial data types, accommodating significant variety, volume, and velocity, while supporting a wide array of industrial network protocols, OT systems, and databases. Users can effortlessly automate data gathering related to production runs, batches, parts, cycle times, process parameters, asset health, utilities, consumables, and operator performance. Built with scalability in mind, Foghub provides an extensive suite of features to efficiently process and analyze large amounts of data, ensuring that businesses can maintain optimal performance and decision-making capabilities. As industries evolve and data demands increase, Foghub remains a pivotal solution for achieving effective IT/OT convergence.
  • 22
    Brainwave GRC Reviews
    Brainwave is transforming how you evaluate user access! With an innovative user interface, enhanced predictive controls, and comprehensive risk-scoring features, you can now conduct in-depth access risk analyses. The Autonomous Identity solution allows your teams to operate more effectively with a user-friendly, industry-recognized tool that speeds up your identity governance and administration (IGA) initiatives. This empowers organizations to assess and make informed decisions regarding access to shared files and folders. You can inventory, categorize, review access, and ensure compliance irrespective of the environment, whether it be file servers, NAS, SharePoint, Office 365, or beyond. Our flagship offering, Brainwave Identity GRC, is packed with analytical tools that make the most of your access inventory. Enjoy constant visibility across all resources at any given moment. Furthermore, Brainwave’s extensive inventory serves as an entitlement catalog that spans infrastructure, business applications, and data access points, ensuring a comprehensive overview of user permissions. This holistic approach promotes better security and informed decision-making.
  • 23
    Apache Kylin Reviews

    Apache Kylin

    Apache Software Foundation

    Apache Kylin™ is a distributed, open-source Analytical Data Warehouse designed for Big Data, aimed at delivering OLAP (Online Analytical Processing) capabilities in the modern big data landscape. By enhancing multi-dimensional cube technology and precalculation methods on platforms like Hadoop and Spark, Kylin maintains a consistent query performance, even as data volumes continue to expand. This innovation reduces query response times from several minutes to just milliseconds, effectively reintroducing online analytics into the realm of big data. Capable of processing over 10 billion rows in under a second, Kylin eliminates the delays previously associated with report generation, facilitating timely decision-making. It seamlessly integrates data stored on Hadoop with popular BI tools such as Tableau, PowerBI/Excel, MSTR, QlikSense, Hue, and SuperSet, significantly accelerating business intelligence operations on Hadoop. As a robust Analytical Data Warehouse, Kylin supports ANSI SQL queries on Hadoop/Spark and encompasses a wide array of ANSI SQL functions. Moreover, Kylin’s architecture allows it to handle thousands of simultaneous interactive queries with minimal resource usage, ensuring efficient analytics even under heavy loads. This efficiency positions Kylin as an essential tool for organizations seeking to leverage their data for strategic insights.
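    The precalculation idea behind a multi-dimensional cube can be illustrated with a toy example: aggregate a measure once for every combination of dimensions, then answer queries with a dictionary lookup instead of a scan. This is a simplified sketch and not Kylin's storage or query engine. The sample rows, `build_cube`, and `query` are hypothetical names for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows with three dimensions and one measure.
ROWS = [
    {"region": "EU", "product": "A", "year": 2023, "sales": 10},
    {"region": "EU", "product": "B", "year": 2023, "sales": 5},
    {"region": "US", "product": "A", "year": 2024, "sales": 7},
    {"region": "US", "product": "A", "year": 2023, "sales": 3},
]
DIMENSIONS = ("region", "product", "year")


def build_cube(rows, dimensions, measure):
    """Precompute the summed measure for every subset of dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in dims)] += row[measure]
            cube[dims] = dict(totals)
    return cube


def query(cube, **filters):
    """Answer an equality-filtered SUM by lookup in the matching precomputed table."""
    dims = tuple(d for d in DIMENSIONS if d in filters)
    key = tuple(filters[d] for d in dims)
    return cube[dims].get(key, 0)


cube = build_cube(ROWS, DIMENSIONS, "sales")
eu_sales = query(cube, region="EU")
```

    The trade-off is visible even at this scale: three dimensions already produce eight precomputed tables, so build time and storage grow with the number of dimension combinations, while query time stays roughly constant as rows are added.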
  • 24
    Apache Zeppelin Reviews
    A web-based notebook facilitates interactive data analytics and collaborative documentation using SQL, Scala, and other languages. With an IPython interpreter, it delivers a user experience similar to that of Jupyter Notebook. The latest version introduces several enhancements, including a dynamic form at the note level, a note revision comparison tool, and the option to execute paragraphs sequentially rather than simultaneously, as was the case in earlier versions. Additionally, an interpreter lifecycle manager ensures that idle interpreter processes are automatically terminated, freeing up resources when they are not actively being utilized. This improvement not only optimizes performance but also enhances the overall user experience.
  • 25
    SOLIXCloud CDP Reviews
    SOLIXCloud CDP provides a cloud-based data management solution tailored for contemporary data-centric businesses. Utilizing open-source and cloud-native technologies, it enables organizations to effectively handle and analyze their structured, semi-structured, and unstructured data, facilitating advanced analytics, regulatory compliance, infrastructure efficiency, and robust data security. Key components of this platform include Solix Connect for efficient data ingestion, Solix Data Governance, Solix Metadata Management, and Solix Search, collectively forming a holistic framework for managing cloud data. This framework supports the development and operation of data-driven applications, including SQL data warehouses, machine learning models, and artificial intelligence systems, while addressing the increasing complexities associated with data management regulations, data retention policies, and consumer privacy concerns. In this way, SOLIXCloud CDP empowers companies to navigate the evolving landscape of data management with confidence.