
Creating a Flow

Defining the pipeline between Salesforce & S3 with copy to Redshift


Steps to create a new Flow

Follow the steps below to create a flow with Salesforce as the Input and S3 as the Output, along with a copy option to Redshift.

Step 1: Navigate to Home Page

Step 2: Click on "New Flow" button

Step 3: Choose an Output

Step 4: Choose an Output Connection

Step 5: Choose an Input

Step 6: Enter Input & Output Settings

  1. Give a friendly name to the flow

  2. Accept the default flow code, which is generated based on the name

  3. Uncheck "Publish Transformation" if you want to perform some transformations on the flow events

  4. Uncheck "Publish Mapping" if you want to manually perform some complex mapping between input & output

  5. Uncheck if you want to allow events that don't comply with the input schema

  6. Choose the Input connection

  7. Check to copy the file uploaded to S3 into Redshift

  8. Choose the Redshift connection to be used for the copy command (a sample copy command is sketched after this list)

  9. Enter the Salesforce object name to be replicated from Input to Output

  10. Specify the fetch size to be used while performing the JDBC query

  11. Specify the batch size used to determine the topic partition. Specify a large number if you want the data to load in the order in which it was queried.

  12. Specify a partition size greater than 1 if you want data to be imported in parallel across nodes. You must run the Input service on multiple nodes for this to be effective.

  13. Specify a comma-separated list of columns to be included in the query. If left blank, all columns will be fetched.

  14. Specify a comma-separated list of columns to be excluded. This is ignored if an include list is provided.

  15. Specify a local directory path to be used to temporarily store the Salesforce bulk export files for the initial run.

  16. Choose an Incremental Policy to be used for this flow. Refer to Step 7

  17. Choose the creation date column if available. It will be used to determine if the event is newly inserted or updated

  18. Specify the last update date column to be used to track the updates
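
When the copy option is checked, the file that the flow uploads to S3 is loaded into Redshift with a standard COPY command. The sketch below only illustrates the general shape of such a command; the table name, bucket path, IAM role, and format options are placeholders, not the exact statement FLOW generates.

```sql
-- Illustrative only: load the S3 file produced by the flow into a Redshift table.
-- Table name, S3 path, IAM role, and format options are placeholders.
COPY salesforce.account
FROM 's3://my-flow-bucket/salesforce/account/'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
FORMAT AS CSV
GZIP
TIMEFORMAT 'auto';
```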

Step 7: Choose an Incremental Policy

  1. Full dump and load - will delete all rows from the target table and reload the full data set on every run

  2. Incremental Using Numeric ID Column - will use a numeric ID column to incrementally load newly added rows with an ID higher than the max ID of the previous run. For example, inventory or ledger transactions where there are only inserts (no updates) and a running sequential ID is used for the transaction ID column.

  3. Incremental Using Last Update Date Column - will use a timestamp column to fetch incremental data with the timestamp value greater than the max value of the previous run (see the sketch after this list)

  4. One Time Load - will load data only once, and the flow will not be scheduled again. The flow status will be changed to Complete after the one-time load.
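
To make option 3 concrete, an incremental run conceptually boils down to a query like the one below, filtered on the last update date column using the high-water mark recorded at the end of the previous run. The object, column names, and timestamp are examples, not the exact query FLOW issues.

```sql
-- Conceptual shape of an incremental fetch using a last-update-date column.
-- 'Account', the column list, and the timestamp literal are examples only.
SELECT Id, Name, CreatedDate, LastModifiedDate
FROM Account
WHERE LastModifiedDate > '2020-01-15 00:00:00'   -- max value from the previous run
ORDER BY LastModifiedDate;
```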

Step 8: Choose a schedule policy

  1. Cron Expression - use this to schedule at a specific time of day, or on a specific day of the week/month, using a cron expression (see the examples after this list)

  2. Fixed Interval - use this to schedule every x minutes

  3. After Parent Flow - use this to define a dependency between flows. You can choose to run this flow soon after the parent flow runs, irrespective of its status.

  4. After Parent Flow Success - similar to the above, but only runs if the parent flow completed without any errors.

  5. After Parent Flow Failure - similar to the above, but only runs when the parent flow completed with errors.
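
For reference, a few common cron expressions are shown below using the standard five-field layout (minute, hour, day of month, month, day of week). The exact format FLOW accepts (for example, whether a seconds field is required) is described on the Cron Expression page.

```
30 2 * * *      # every day at 02:30
0 */4 * * *     # every 4 hours, on the hour
0 6 * * MON     # every Monday at 06:00
```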

Step 9: Click Submit

Congratulations, you have just created your first flow!