CSV File from Amazon S3
How to pull data from CSV files on Amazon S3
Last updated
Specify the root folder path under the given bucket. Make sure this path is static and dedicated to the files that need to be imported by this flow. Do not mix files from different flows under the same root folder path. The root path is used as a prefix when polling for new files on the S3 bucket.
Specify the file name pattern as a Java regular expression. If the CSV files are generated by another flow or by any other Kafka Sink Connector, you may want to use the #{partition} placeholder, which is replaced with the partition number of the run process.
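As an illustration of how the placeholder works (the `orders_…` pattern below is hypothetical, not part of this connector), the #{partition} token is substituted with a partition number before the regular expression is matched against file names. A minimal sketch in Python:

```python
import re

# Hypothetical file name pattern using the #{partition} placeholder.
pattern_template = r"orders_#{partition}_\d{8}"

# The connector substitutes the partition number before matching;
# this mimics that substitution for partition 1.
pattern = pattern_template.replace("#{partition}", "1")

candidates = ["orders_1_20240131", "orders_2_20240131", "orders_1_notadate"]
matches = [name for name in candidates if re.fullmatch(pattern, name)]
print(matches)  # ['orders_1_20240131']
```

Note that files written for other partitions (here, `orders_2_…`) are picked up only when their own partition number is polled.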
Provide the file extension, which is used as a suffix when polling for files from the S3 bucket.
Specify the number of partitions. Use a value of 1 when no partitions are used. If a value greater than 1 is entered, the files are polled in multiple requests, with #{partition} set to 1, 2, 3, and so on up to the specified number of partitions. Use this only when the CSV files are written with different partition numbers in their file names.
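The settings above can be sketched as one file pattern per partition, combining the root folder path (prefix), the file name pattern, and the extension (suffix). This is a hypothetical illustration of the polling inputs, not the connector's actual implementation; the function name and the `imports/orders` path are assumptions:

```python
def build_partition_patterns(root_path, pattern_template, extension, partitions):
    """Expand #{partition} into one file pattern per partition (1..N).

    Hypothetical helper: the root path acts as the S3 key prefix, the
    extension as the suffix, mirroring the settings described above.
    """
    patterns = []
    for p in range(1, partitions + 1):
        name = pattern_template.replace("#{partition}", str(p))
        patterns.append(f"{root_path.rstrip('/')}/{name}{extension}")
    return patterns

# Illustrative settings only.
for pat in build_partition_patterns("imports/orders", r"orders_#{partition}_\d{8}", ".csv", 3):
    print(pat)
```

With 3 partitions this yields three patterns, one per polling request, differing only in the substituted partition number.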
Refer to OTBI Input's Schema Setup for more details on how to set up the schema for the CSV files.
Complete the remaining setup and click Submit to start pulling data from the CSV files on the S3 bucket.