Installation
Steps to install all prerequisite components including FLOW
Using Docker
Install Docker Desktop: https://www.docker.com/products/docker-desktop
Make sure the following ports are not in use:
2181 (ZooKeeper)
3306 (MySQL)
8090 (FLOW UI)
9092 (Kafka broker)
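If you are not sure whether a port is free, a quick check is sketched below (assuming lsof is available on your machine); no output means the port is unused.
# check each required port; any output means the port is already taken
for p in 2181 3306 8090 9092; do lsof -i :"$p"; done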
You must allocate a minimum of 8 GB of memory to Docker Desktop.

Download flow-docker.zip and unzip it
unzip flow-docker.zip
cd flow-docker
docker-compose up --build
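Optionally, you can run the stack in the background and check its status with standard docker-compose commands (a sketch, not specific to FLOW):
# start in detached mode
docker-compose up -d --build
# list container status
docker-compose ps
# follow the logs of all services
docker-compose logs -f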
Launch FLOW in any modern browser at http://localhost:8090
Sign in as demo / demo
The above Docker setup is NOT meant for production use!

Manual Installation
Prepare /usr/local/cloudio directory
sudo mkdir /usr/local/cloudio
sudo chown $USER:admin /usr/local/cloudio
Install Apache Kafka
Download Kafka 2.5.0 from https://www.apache.org/dyn/closer.cgi?path=/kafka/2.5.0/kafka_2.12-2.5.0.tgz (see also https://kafka.apache.org/quickstart)
tar -xzf kafka_2.12-2.5.0.tgz
mv kafka_2.12-2.5.0 /usr/local/cloudio/kafka
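FLOW needs a running ZooKeeper and Kafka broker. A minimal single-node sketch, assuming the default configuration files that ship with Kafka 2.5.0 (ZooKeeper on 2181, the broker on 9092):
cd /usr/local/cloudio/kafka
# start ZooKeeper in the background
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
# start the Kafka broker in the background
bin/kafka-server-start.sh -daemon config/server.properties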
Install MySQL or an Oracle database
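If you choose MySQL, the database, user, and password referenced in the sample io.properties below (kafka / kafka / kafka) must exist. A minimal sketch, assuming a local MySQL server with a root account; adjust names and passwords for your environment:
# create the database and user used by the sample io.properties
mysql -u root -p <<'SQL'
CREATE DATABASE kafka;
CREATE USER 'kafka'@'%' IDENTIFIED BY 'kafka';
GRANT ALL PRIVILEGES ON kafka.* TO 'kafka'@'%';
FLUSH PRIVILEGES;
SQL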
Install FLOW
Once you obtain a license from CloudIO, follow the download instructions to get io-flow.zip. FLOW is made up of the following microservice components:
Input - responsible for running the input schedules and pulling data from the source systems
Transformer - responsible for the transform & mapping process
Counter - a Kafka Streams process responsible for updating event counts at the various stages
WebSocket - a Jetty server that runs the UI service
Output - responsible for loading data into the target data lake
S3Worker - responsible for batch processes such as uploading to Amazon S3, running SQL Loader, pushing to Power BI datasets, etc.
Master - the master node that coordinates all the other microservices
Profiler - responsible for profiling incoming events (optional)
Processed - a Kafka Streams process that copies the output data into a processed topic (optional)
# unpack the downloaded io-flow.zip
unzip io-flow.zip
cd io-flow
# configure template/io.properties
vi template/io.properties
# start all services in one shot
./bin/start-all.sh
# optionally tail all log files
tail -f */io-flow.log
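Once the services are up, you can confirm the UI service is listening on the port configured in io.properties (8090 in the sample below); a quick sketch:
# expect an HTTP response from the FLOW UI
curl -I http://localhost:8090/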
Sample io.properties
# database connection properties
db.driverClassName=com.mysql.cj.jdbc.Driver
db.url=jdbc:mysql://localhost:3306/kafka
db.username=kafka
db.password=kafka
db.type=MYSQL
db.minimumIdle=1
db.maximumPoolSize=100
db.maxLifetime=360000
db.idleTimeout=45000
db.serverTimezone=America/Los_Angeles
db.cachePrepStmts=true
db.prepStmtCacheSize=250
db.prepStmtCacheSqlLimit=2048
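# FLOW server host, port and context path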
sync.host=localhost
sync.port=8090
sync.contextPath=/
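# Kafka cluster connection and stream-processing settings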
bootstrapServers=kafka.cloudio.io:9021,kafka.cloudio.io:9022,kafka.cloudio.io:9023,kafka.cloudio.io:9024,kafka.cloudio.io:9025,kafka.cloudio.io:9026
zookeeper.connect=kafka.cloudio.io:2181
kafka.stateDir=kafka-state
kafka.s3.stateDir=s3-state
kafka.totalWorkers=40
kafka.replication.factor=2
kafka.topic.partitions=20
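# microservice instance types to run (see the component list above)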
kafka.instanceType=Transformer,Counter,WebSocket,Input,Output,Worker,S3Worker,Master,Profiler