
Closed
Paid on delivery
We are seeking an experienced Data Engineer to build and maintain scalable, high-performance data pipelines and infrastructure for our next-generation data platform. The platform ingests and processes real-time and historical data from diverse industrial sources such as airport systems, sensors, cameras, and APIs. You will work closely with AI/ML engineers, data scientists, and DevOps to enable reliable analytics, forecasting, and anomaly detection use cases.

Major Skills: Spark, Flink, Iceberg

Key Responsibilities
· Design and implement real-time (Kafka, Spark/Flink) and batch (Airflow, Spark) pipelines for high-throughput data ingestion, processing, and transformation.
· Develop data models and manage data lakes and warehouses (Delta Lake, Iceberg, etc.) to support both analytical and ML workloads.
· Integrate data from diverse sources: IoT sensors, databases (SQL/NoSQL), REST APIs, and flat files.
· Ensure pipeline scalability, observability, and data quality through monitoring, alerting, validation, and lineage tracking.
· Collaborate with AI/ML teams to provision clean and ML-ready datasets for training and inference.
· Deploy, optimize, and manage pipelines and data infrastructure across on-premise and hybrid environments.
· Participate in architectural decisions to ensure resilient, cost-effective, and secure data flows.
· Contribute to infrastructure-as-code and automation for data deployment using Terraform, Ansible, or similar tools.

Qualifications & Required Skills
· Bachelor’s or Master’s in Computer Science, Engineering, or related field.
· 5+ years in data engineering roles, with at least 2 years handling real-time or streaming pipelines.
· Strong programming skills in Python/Java and SQL.
· Experience with Apache Kafka, Apache Spark, or Apache Flink for real-time and batch processing.
· Hands-on with Airflow, dbt, or other orchestration tools.
· Familiarity with data modeling (OLAP/OLTP), schema evolution, and format handling (Parquet, Avro, ORC).
· Experience with hybrid/on-prem and cloud platforms (AWS/GCP/Azure) deployments.
· Proficient in working with data lakes/warehouses like Snowflake, BigQuery, Redshift, or Delta Lake.
· Knowledge of DevOps practices, Docker/Kubernetes, Terraform or Ansible.
· Exposure to data observability, data cataloging, and quality tools (e.g., Great Expectations, OpenMetadata).

Good-to-Have
· Experience with time-series databases (e.g., InfluxDB, TimescaleDB) and sensor data.
· Prior experience in domains such as aviation, manufacturing, or logistics is a plus.
Project ID: 40381335
2 proposals
Remote project
Active 24 days ago
2 freelancers are bidding on average ₹2,999,995 INR for this job

Hi, I’m Karthik with 15+ years of experience in data engineering, real-time streaming systems, and large-scale analytics platforms. I have strong hands-on expertise building high-throughput pipelines using Apache Kafka, Apache Spark, and Apache Flink for industrial, IoT, and operational datasets.

My experience includes:
* Real-time and batch ingestion pipelines
* Data lake architectures using Apache Iceberg, Delta Lake, and Parquet
* Orchestration with Apache Airflow and dbt
* SQL/NoSQL integrations, APIs, sensor feeds, flat files, and time-series data
* Schema evolution, lineage, observability, and data quality checks
* Hybrid cloud deployments across Amazon Web Services, GCP, and on-prem environments
* Infrastructure automation with Docker, Kubernetes, Terraform, and Ansible
* ML-ready dataset preparation for forecasting and anomaly detection workflows

I have also worked with warehousing platforms such as Snowflake, Redshift, BigQuery, and time-series databases including InfluxDB and TimescaleDB. I focus on scalable, resilient, and cost-efficient architectures with strong monitoring, validation, and operational visibility.

Warm Regards,
Karthik B
Resonite Tech
₹3,999,990 INR in 7 days
4.1

Dear Client,

I read the project description and understand your problem. You need a scalable, high-performance data platform that can handle real-time and batch data from diverse industrial sources. I can help you by designing and implementing reliable data pipelines and infrastructure that support analytics, forecasting, and AI/ML workloads.

Why Choose Me?
1. I have strong experience in Python, data processing, and building efficient pipelines, with a solid understanding of real-time and batch architectures using tools like Spark and Kafka.
2. I am skilled in designing clean, scalable systems and working with APIs, databases, and structured/unstructured data sources.
3. I focus on performance, reliability, and clean architecture to ensure your pipelines are maintainable and production-ready.

What I Will Deliver:
· Scalable real-time and batch data pipelines
· Clean, well-structured data models for analytics and ML
· Integration with multiple data sources (APIs, databases, sensors)
· Monitoring, validation, and optimization for performance and reliability
· Support for deployment in hybrid or cloud environments

A Few Questions:
· What is your current data stack (if any)?
· Are you prioritizing real-time processing, batch, or both equally?
· Do you have preferred tools (Spark, Flink, Iceberg) already in use?

I’m confident I can contribute effectively to building a robust and scalable data platform. Let’s discuss further.

Best regards,
Adam Gaafar.
₹2,000,000 INR in 7 days
0.0

Bengaluru, India
Member since Apr 18, 2026