
Mohammad Asif Shaik
About Candidate
Designed and developed batch and streaming data pipelines using PySpark, Databricks, and Delta Lake. Implemented a Bronze–Silver–Gold lakehouse architecture, processed streaming data with AWS Kinesis, optimized Spark transformations for better performance, applied data quality validations, and transformed large datasets using SQL and BigQuery to support analytics and reporting.
Education
Graduated with a CGPA of 8.16. Completed a comprehensive program focused on computer science fundamentals, programming, data engineering concepts, and software development methodologies.
Work & Experience
Designed and developed 5+ batch and streaming data pipelines using PySpark, Databricks, and Delta Lake, following a Bronze–Silver–Gold architecture. Ingested and processed 10K+ streaming JSON records per run from AWS Kinesis using micro-batch processing. Optimized Spark transformations (joins, aggregations, partitioning), improving pipeline performance by 25%. Implemented data quality checks, schema validation, and error handling, reducing data inconsistencies by 30%. Modeled and transformed 100K+ records using SQL and BigQuery to support analytics and reporting use cases.
