Overview

A high-level overview of FLOW.

FLOW is a modern, scalable, real-time data pipeline. Unlike cloud-only solutions, FLOW can be deployed on-premises or in a private cloud, which means your data stays within your own systems and network.

Use FLOW to stream data from a variety of inputs: relational databases such as Oracle, SQL Server, PostgreSQL, and MySQL; SaaS systems such as Salesforce and Workday, or any system that exposes a REST API or database access; cloud storage such as Amazon S3, Google Drive, and Azure Blob Storage; and traditional sources such as SFTP and the file system.

Data can be pulled on a periodic schedule (for example, every 5 minutes or every day at 3 a.m.), captured in real time using Change Data Capture (CDC) for supported inputs, consumed from event sources such as Apache Kafka, or pushed from the source systems using webhooks.
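To make the two schedule examples concrete, here is a minimal Python sketch of how the next pull time could be computed for an interval schedule and for a daily schedule. It is illustrative only: the function names are hypothetical, and it does not show how FLOW itself is configured or implemented.

```python
from datetime import datetime, time, timedelta


def next_interval_run(last_run: datetime, every: timedelta) -> datetime:
    """Next run for an interval schedule, e.g. every 5 minutes (hypothetical helper)."""
    return last_run + every


def next_daily_run(now: datetime, at: time) -> datetime:
    """Next run for a daily schedule, e.g. every day at 3 a.m. (hypothetical helper)."""
    candidate = datetime.combine(now.date(), at)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    print(next_interval_run(now, timedelta(minutes=5)))
    print(next_daily_run(now, time(3, 0)))
```

The sketch uses naive datetimes for brevity; a real scheduler would also need to account for time zones and daylight saving changes.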

Once data is imported into the pipeline, it is streamed through a series of stages, such as masking of sensitive data, validation, transformation, actions, and mapping to the output schema. It is then automatically loaded into the output (a data lake or data warehouse such as Redshift, Azure SQL, Snowflake, or S3, or any database) for analysis and reporting.
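The staged flow described above can be pictured as a chain of functions applied to each record. The following Python sketch is a conceptual illustration under stated assumptions, not FLOW's API: the stage functions, the field names (id, name, email), and the output schema are all hypothetical, and the "actions" stage is omitted for brevity.

```python
from typing import Callable, Iterable, Iterator, Optional

Record = dict
# A stage returns the (possibly modified) record, or None to drop it.
Stage = Callable[[Record], Optional[Record]]


def mask_email(record: Record) -> Optional[Record]:
    """Data masking: hide most of the local part of the email address."""
    out = dict(record)
    email = out.get("email")
    if email and "@" in email:
        local, domain = email.split("@", 1)
        out["email"] = local[:1] + "***@" + domain
    return out


def validate(record: Record) -> Optional[Record]:
    """Validation: drop records that have no id."""
    return record if record.get("id") is not None else None


def transform(record: Record) -> Optional[Record]:
    """Transformation: normalize whitespace and casing of the name."""
    out = dict(record)
    out["name"] = str(out.get("name", "")).strip().title()
    return out


def map_to_output(record: Record) -> Optional[Record]:
    """Mapping: rename fields to a hypothetical output schema."""
    return {
        "customer_id": record["id"],
        "customer_name": record["name"],
        "contact_email": record.get("email"),
    }


# Hypothetical stage order, following the order named in the text.
STAGES = [mask_email, validate, transform, map_to_output]


def run_pipeline(records: Iterable[Record], stages: Iterable[Stage]) -> Iterator[Record]:
    """Stream each record through every stage; skip records a stage drops."""
    stage_list = list(stages)
    for record in records:
        current: Optional[Record] = record
        for stage in stage_list:
            current = stage(current)
            if current is None:
                break
        if current is not None:
            yield current


if __name__ == "__main__":
    rows = [
        {"id": 1, "name": " ada lovelace ", "email": "ada@example.com"},
        {"id": None, "name": "missing id"},
    ]
    for row in run_pipeline(rows, STAGES):
        print(row)
```

Because run_pipeline is a generator, records are processed one at a time rather than collected in memory first, which mirrors the streaming behavior the overview describes. Loading into the output system would be a final step that consumes the yielded records.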
