
Lead Cloudera Consultant
GURU INFOTECH, INC
Contract
Job Description
Lead Cloudera Consultant
Chicago, IL
You must have hands-on, production-grade experience with ALL of the following:
Cloudera CDP / CDF
- CDP Public Cloud or Private Cloud Base
- Cloudera Flow Management (NiFi + NiFi Registry)
- Cloudera Streams Messaging (Kafka, SMM)
- Cloudera Stream Processing (Flink, SSB)
- Kudu / Impala ecosystem
Apache NiFi (Advanced)
- Building complex flows (not just admin/ops)
- QueryDatabaseTable / GenerateTableFetch / MergeRecord
- Record-based processors & schema registry
- JDBC / DBCP controller services
- Stateful processors & incremental ingestion
- NiFi → Snowflake integration
- NiFi → Kudu ingestion patterns
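The stateful, incremental-ingestion pattern named above (what NiFi's QueryDatabaseTable processor does internally) can be sketched as follows. This is an illustrative stand-in only: the `incremental_fetch` function and in-memory `state` dict are made up for the example; a real flow keeps this state in NiFi's state manager and issues the query over a JDBC/DBCP controller service.

```python
# Sketch of incremental ingestion: remember the highest value of a
# tracking column between runs and fetch only newer rows each time.
# Pure-Python stand-in for NiFi's QueryDatabaseTable state tracking.

def incremental_fetch(rows, state, tracking_col="updated_at"):
    """Return rows newer than the stored watermark and advance the state."""
    last_seen = state.get("max_value")
    new_rows = [r for r in rows
                if last_seen is None or r[tracking_col] > last_seen]
    if new_rows:
        state["max_value"] = max(r[tracking_col] for r in new_rows)
    return new_rows

state = {}
batch1 = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 20}]
first = incremental_fetch(batch1, state)   # both rows; watermark -> 20
batch2 = [{"id": 2, "updated_at": 20}, {"id": 3, "updated_at": 30}]
second = incremental_fetch(batch2, state)  # only id 3; watermark -> 30
```

Note the strict `>` comparison: re-delivered rows at the stored watermark are skipped, which is the same exactly-once-per-row guarantee (per tracking value) the NiFi processor aims for.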
Apache Kafka
- Kafka brokers, partitions, retention, replication, consumer groups
- Schema registry (Avro/JSON)
- Designing topics for high-throughput streaming
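Why topic/partition design matters can be shown with a minimal partitioner sketch: records with the same key always hash to the same partition, which is what preserves per-key ordering and caps per-key parallelism. Kafka's default partitioner actually uses murmur2; `crc32` below is a deterministic stand-in for illustration, and the key/partition-count values are made up.

```python
import zlib

# Key-based partition selection: same key -> same partition, so all
# events for one entity are consumed in order. Kafka's default
# partitioner uses murmur2; crc32 here is an illustrative stand-in.

def partition_for(key: bytes, num_partitions: int) -> int:
    return zlib.crc32(key) % num_partitions

# All events for one account land on one partition; adding partitions
# later rehashes keys, which is why partition count is a design-time
# decision for high-throughput topics.
p1 = partition_for(b"account-42", 12)
p2 = partition_for(b"account-42", 12)
assert p1 == p2  # stable routing for a given key
```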
Apache Flink
- Flink SQL + DataStream API
- Event-time processing, watermarks, windows
- Checkpointing, savepoints, state backends
- Kafka source/sink connectors
- Exactly-once semantics
- Flink CDC a plus
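The event-time concepts listed above (watermarks, windows, late data) can be simulated in a few lines. This is a toy model, not Flink: the 5-second tumbling window and 2-second out-of-orderness bound are assumed values for illustration; a real job declares them through Flink's `WatermarkStrategy` and window assigners.

```python
# Toy simulation of event-time tumbling windows with a watermark.
# Assumptions for illustration only: 5-second windows, 2-second
# bounded out-of-orderness.

WINDOW = 5      # tumbling window size (seconds)
LATENESS = 2    # bounded out-of-orderness

def process(events):
    """events: (event_time, value) pairs in arrival order."""
    windows, watermark, late = {}, float("-inf"), []
    for ts, value in events:
        watermark = max(watermark, ts - LATENESS)  # watermark only advances
        start = (ts // WINDOW) * WINDOW            # tumbling window assignment
        if start + WINDOW <= watermark:            # window already closed
            late.append((ts, value))
        else:
            windows.setdefault(start, []).append(value)
    return windows, late

wins, late = process([(1, "a"), (3, "b"), (12, "c"), (2, "d")])
# (2, "d") arrives after the watermark (10) has passed window [0, 5),
# so it is routed to the late stream instead of the closed window.
```

The same trade-off applies in Flink: a larger lateness bound accepts more out-of-order events but delays window results.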
Apache Kudu
- Table design (PKs, partition strategies)
- Upserts, deletes, merge semantics
- Integration with Impala
SQL Stream Builder (SSB)
- Creating jobs, connectors, materialized views
- Deploying and monitoring Flink SQL jobs in CDP
CDC (Change Data Capture)
- CDC via NiFi or Flink CDC or SSB
- Handling late-arriving events
- Handling deletes, updates, schema evolution
- Incremental key tracking
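The handling requirements above (deletes, updates, key tracking) reduce to applying a change stream against a keyed table, the upsert/delete merge semantics Kudu supports natively. A minimal sketch follows; the `{"op", "key", "row"}` event shape is made up for the example, and real envelopes from Debezium or Flink CDC carry more (before/after images, source offsets, timestamps).

```python
# Sketch of applying a CDC stream (inserts, updates, deletes) to a
# keyed table -- the merge semantics a NiFi/Flink CDC pipeline pushes
# into a Kudu table. Event shape is illustrative, not a real envelope.

def apply_cdc(table: dict, events):
    for e in events:
        if e["op"] in ("insert", "update"):
            table[e["key"]] = e["row"]    # upsert: last write wins per key
        elif e["op"] == "delete":
            table.pop(e["key"], None)     # tolerate deletes of absent keys
    return table

table = {}
apply_cdc(table, [
    {"op": "insert", "key": 1, "row": {"name": "a"}},
    {"op": "update", "key": 1, "row": {"name": "a2"}},
    {"op": "insert", "key": 2, "row": {"name": "b"}},
    {"op": "delete", "key": 2, "row": None},
])
# table is now {1: {"name": "a2"}}
```

Making the apply step idempotent like this (upserts and tolerant deletes) is what lets the pipeline safely replay events after a failure.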
General Requirements
- 11+ years in data engineering / streaming
- 3–5+ years specifically with CDP/CDF streaming
- Strong SQL and distributed system fundamentals
- Experience in financial services, healthcare, telecom, or other high-volume industries preferred
Nice to Have
- Kubernetes experience running NiFi/Kafka/Flink operators
- Snowflake ingestion patterns (staging, COPY INTO)
- Experience with Debezium
- CI/CD for data pipelines
- Security (Kerberos, Ranger, Atlas)