Netstratum Data Analytics Platform (NDAP)

NDAP provides these essential capabilities:

Data Pipelines

NDAP provides a data ingestion service that simplifies and automates the difficult and time-consuming task of building, running, and managing data pipelines.

Data Preparation

NDAP provides an easy, interactive way to visualize, transform, and cleanse data. It helps derive new schemas and operationalize data preparation in a few clicks.

App Development

As an integrated application development framework, NDAP standardizes and deeply integrates diverse big data technologies, exposing easy-to-use APIs to build, deploy, and manage complex data analytics applications in the cloud or on premises.

Metadata & Lineage

NDAP automatically captures technical, business, and operational metadata, and tracks lineage as datasets change and data flows through the system. It provides an audit log for easy traceability, supporting data quality and compliance needs.
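To make the lineage idea concrete, here is a minimal Python sketch of how lineage capture and audit logging could be modeled. This is not NDAP's actual API; every name below is hypothetical, and a real system would persist this metadata rather than hold it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical, simplified lineage store: each transformation run records
# edges from its input datasets to its output dataset, plus an audit entry.
@dataclass
class LineageStore:
    edges: list = field(default_factory=list)       # (input, operation, output)
    audit_log: list = field(default_factory=list)

    def record(self, inputs, operation, output):
        for src in inputs:
            self.edges.append((src, operation, output))
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "operation": operation,
            "inputs": list(inputs),
            "output": output,
        })

    def upstream(self, dataset):
        """Trace every dataset that contributed to `dataset`."""
        found, frontier = set(), {dataset}
        while frontier:
            current = frontier.pop()
            for src, _, dst in self.edges:
                if dst == current and src not in found:
                    found.add(src)
                    frontier.add(src)
        return found

store = LineageStore()
store.record(["raw_orders"], "cleanse", "clean_orders")
store.record(["clean_orders", "customers"], "join", "orders_by_customer")
lineage = store.upstream("orders_by_customer")
```

Walking the edge list backwards from any dataset yields its full upstream lineage, which is what powers traceability and impact analysis.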

Security & Operations

NDAP offers sophisticated security: authentication, authorization, and encryption. It provides a robust, portable production runtime environment for secure deployment and management of data lakes and data applications on Hadoop and Spark.

NDAP Features

Rapid development

Developer SDK and APIs with abstractions over common data processing patterns; Sandbox mode with programmatic and UI-driven debugging; In-memory mode and testing framework to simplify testing; Support for cutting-edge cloud, Apache Hadoop, and Apache Spark technologies.

Enterprise Ready

Metadata repository with automatic technical and operational metadata capture; Business metadata annotations; Data discovery through search based on metadata; Data governance with dataset and field level lineage and auditing; Integration with enterprise security systems.

Seamless operations

REST APIs for every interaction; Time and process-based scheduling; Standardized logs and metrics for all execution environments.

Portable Runtime Environments

Build once, run anywhere through portability across runtime environments such as Apache Hadoop YARN and Docker.

Extensible and Reusable

Templates and blueprints for common use-cases; Hub for sharing pre-built plugins, applications and solutions; Extensible APIs for security, metadata, runtimes and storage.

Hybrid and multi-cloud

Interoperability across on-premises and cloud environments; Support for all major public cloud providers, including Amazon Web Services, Microsoft Azure, and Google Cloud Platform.


Integrated With All Data

Pipelines provide connectors to relational databases, flat files, mainframes, cloud services, NoSQL, and more.

Increased flexibility

Through portability across on-premises and public cloud environments.

Reduced complexity

Pipelines reduce complexity through a graphical interface, code-free transformations, and reusable templates.
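The "reusable templates" idea can be illustrated with a small Python sketch: a pipeline template is just a named sequence of stages, each a plain function, applied record by record. This is a conceptual illustration only, not NDAP's pipeline API; all function names are hypothetical.

```python
# Hypothetical stages for a reusable pipeline template.
def parse_csv_line(record):
    return dict(zip(["id", "amount"], record.split(",")))

def cast_amount(record):
    record["amount"] = float(record["amount"])
    return record

def drop_negative(record):
    # Returning None filters the record out of the pipeline.
    return record if record["amount"] >= 0 else None

def run_pipeline(stages, records):
    out = []
    for rec in records:
        for stage in stages:
            rec = stage(rec)
            if rec is None:
                break
        else:
            out.append(rec)
    return out

template = [parse_csv_line, cast_amount, drop_negative]
rows = run_pipeline(template, ["1,10.5", "2,-3.0"])
```

Because the template is data (a list of stages), it can be shared, versioned, and reused across pipelines, which is the essence of the code-free, template-driven approach described above.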

Improved Data Trustworthiness

Through data quality libraries, metadata and lineage capture, and audit logging.


Wrangler allows you to visually and interactively cleanse and prepare raw data, with the aim of making it consumable for further processing. It provides a standardized, UI-driven interactive flow that takes the pain out of preprocessing for data engineering, data science, and data analysis.
Code Free Transformations

Interactive, code-free transformations with feedback at each step using a powerful graphical UI
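The step-by-step, feedback-driven flow can be sketched in a few lines of Python: each transformation is a small composable directive, and the runner surfaces a preview after every step, which is what the graphical UI would render. This is an illustrative sketch, not Wrangler's actual directive language; every name is hypothetical.

```python
# Hypothetical transformation directives, each returning a new sample.
def trim(col):
    return lambda rows: [{**r, col: r[col].strip()} for r in rows]

def uppercase(col):
    return lambda rows: [{**r, col: r[col].upper()} for r in rows]

def drop(col):
    return lambda rows: [{k: v for k, v in r.items() if k != col} for r in rows]

def preview(rows, directives):
    for step, directive in enumerate(directives, 1):
        rows = directive(rows)
        # In an interactive UI, this is where per-step feedback would render.
        print(f"after step {step}: {rows[0]}")
    return rows

sample = [{"name": "  ada ", "note": "x"}]
result = preview(sample, [trim("name"), uppercase("name"), drop("note")])
```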

Extensible, comprehensive transformation library

Comprehensive library of more than 1,000 built-in transformations; Extensible API for adding more.

Comprehensive Data Source Support

Built-in connections to popular cloud and on-premises data sources, such as relational databases, file systems, object stores (AWS S3, Cloud Storage), Kafka, and NoSQL stores.

Operationalization Using Pipelines

One-click creation of scalable, reliable pipelines for mission-critical environments.

Automatic data quality and profiling

Data quality indicators to assess quality; a data quality library to improve trust; profiling after every transformation to understand data distribution and column relationships.
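A minimal sketch of the kind of per-column profiling a wrangling tool might run after each transformation: null fraction, distinct count, and a numeric summary. This is an assumption about what such profiling computes, not NDAP's actual implementation.

```python
# Hypothetical per-column profiler using only the standard library.
def profile_column(values):
    non_null = [v for v in values if v is not None]
    stats = {
        "null_fraction": (len(values) - len(non_null)) / len(values),
        "distinct": len(set(non_null)),
    }
    numeric = [v for v in non_null if isinstance(v, (int, float))]
    if numeric:
        stats["min"], stats["max"] = min(numeric), max(numeric)
        stats["mean"] = sum(numeric) / len(numeric)
    return stats

stats = profile_column([34, 41, None, 29, 41])
```

Surfacing these numbers after every step lets the analyst spot a bad join or a lossy cast immediately, rather than downstream.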


Analytics provides a simple, interactive, UI-driven approach to machine learning. It provides a seamless, automated interface for users to easily develop, train, test, evaluate and deploy their machine learning models. It reduces the need for ad-hoc custom tooling and promotes reusability and collaboration.
UI-Driven Data Wrangling and Cleansing

Seamless, integrated experience from data preparation and cleansing to model development, evaluation and deployment.

Support for Popular ML Libraries

Out of the box support for common ML libraries such as SparkML.

Scoring Plugins for Running Predictions

Built-in scoring plugins take you from model development to running predictions on data in a few seconds.

Model Evaluation

Integrated metrics and visualizations provide rich summaries and graphs for evaluating model performance.

Automated Training and Test Data Split

Automated splitting into training and test datasets reduces the need for custom tooling.
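One common way to automate a reproducible split, shown here as a hedged sketch rather than NDAP's actual method, is to hash a stable record key: every record lands in the same bucket on every run, with no stored random seed or custom tooling.

```python
import hashlib

# Hypothetical deterministic train/test split keyed on a stable column.
def split(records, key, test_ratio=0.2):
    train, test = [], []
    for rec in records:
        digest = hashlib.sha256(str(rec[key]).encode()).digest()
        # Map the first 4 bytes of the hash to a bucket in [0, 1).
        bucket = int.from_bytes(digest[:4], "big") / 2**32
        (test if bucket < test_ratio else train).append(rec)
    return train, test

rows = [{"id": i, "label": i % 2} for i in range(100)]
train, test = split(rows, key="id")
```

Hashing the key (rather than shuffling) also means new records can be added later without reshuffling earlier ones between the two sets.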

Hyperparameter Tuning

Switches and knobs for advanced users to tune model performance using hyperparameters.
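The simplest form those "switches and knobs" can take is an exhaustive grid search: try every combination of hyperparameter values and keep the best-scoring one. The sketch below is illustrative, with a toy scoring function standing in for real model training; none of it reflects NDAP's actual tuning interface.

```python
from itertools import product

# Hypothetical grid search over a hyperparameter space.
def grid_search(score_fn, grid):
    names = sorted(grid)
    best, best_score = None, float("-inf")
    for combo in product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        score = score_fn(**params)
        if score > best_score:
            best, best_score = params, score
    return best, best_score

# Toy stand-in for train-and-evaluate; peaks at depth=5, learning_rate=0.1.
def toy_score(depth, learning_rate):
    return -abs(depth - 5) - abs(learning_rate - 0.1)

grid = {"depth": [3, 5, 7], "learning_rate": [0.01, 0.1, 0.5]}
best_params, best_score = grid_search(toy_score, grid)
```

Real tuners refine this with random or Bayesian search, but the contract is the same: a parameter space in, the best-scoring combination out.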

Rules Engine

Rules Engine lets business analysts create and manage a knowledge base of data transformation rules that are automatically applied to data. It provides an intuitive UI for business analysts to set up business rules that can then be executed in a data pipeline.
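Conceptually, a rulebook is a named collection of condition/action pairs applied to each record in order. The Python sketch below illustrates that idea under assumed names; it is not NDAP's rules API.

```python
# Hypothetical rulebook: named rules, each a (condition, action) pair.
class Rulebook:
    def __init__(self, name):
        self.name, self.rules = name, []

    def rule(self, name, condition, action):
        self.rules.append((name, condition, action))
        return self  # allow chaining

    def apply(self, record):
        for _, condition, action in self.rules:
            if condition(record):
                record = action(record)
        return record

pricing = (
    Rulebook("pricing")
    .rule("normalize-currency",
          lambda r: r["currency"] == "EUR",
          lambda r: {**r, "amount": r["amount"] * 1.1, "currency": "USD"})
    .rule("flag-large-orders",
          lambda r: r["amount"] > 1000,
          lambda r: {**r, "review": True})
)

order = pricing.apply({"amount": 950, "currency": "EUR"})
```

Because rules are data, analysts can reorder, group, and version them in a central repository while the pipeline simply calls `apply` on each record.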
Code-free Business Rules UI

Intuitive, code-free UI for business analysts to build and manage transformation rules

Fully Integrated

Available as a library to integrate with JBoss, Spring, WebLogic and SQL tools.

Easy Governance

Centralized repository of policies and transformation rules.

Flexible and Intuitive Rules Management

Easy organization and grouping of rules using rulebooks; Intuitive business UI for creation and management of rules and rulebooks.


The Rules Engine integrates with NDAP data pipelines for horizontal scalability.

© 2019 Netstratum. All rights reserved.