Cloud Data Engineer

Key Points



The Ambit Group

  • Designed and implemented data infrastructure, data pipelines, and data products in cloud environments with a focus on Amazon Web Services (AWS).
  • Wrote CloudFormation templates to deploy AWS resources through automated tooling.
  • Provided cloud engineering expertise to help drive technical and architectural decisions.
  • Migrated existing on-premise workloads and applications to the cloud, redesigning as necessary to adapt to cloud-native technologies.
  • Adapted database workloads and infrastructure to a cloud setting.
  • Created an R program that analyzed millions of records and flagged non-compliant accounts; applied current data science and AI techniques to accelerate performance and efficiency, yielding a solution roughly 98% faster than the previous program.
  • Deployed both the legacy PVC and SaaS versions of Databricks in AWS through CloudFormation and API calls to dev, stage, test, and production environments; used parent CloudFormation templates to integrate all resources required by the Databricks stacks.
  • Created an AWS pipeline with CodeDeploy to fully automate CI/CD for Databricks deployments.
  • Designed the flow and logic of data storage in S3 buckets across extraction, transformation, and delivery.
  • Created Bash scripts to automate AWS tasks and code updates.
  • Deployed Airflow to an AWS EKS cluster through AWS CloudFormation, using the AWS CLI, kubectl, and Helm.
  • Provided leadership in data and application migration methodologies and techniques.
  • Used agile methodologies and DevOps best practices; maintained code and documentation in both enterprise and public GitHub repositories.
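The parent-template pattern used for the Databricks stacks above can be sketched as a minimal CloudFormation fragment. The template URLs, stack names, and output names here are hypothetical placeholders, not the actual templates:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Parent stack wiring child stacks together (illustrative sketch)
Parameters:
  Environment:
    Type: String
    AllowedValues: [dev, stage, test, prod]
Resources:
  # Child stack for shared networking (hypothetical template path)
  NetworkStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/example-templates/network.yaml
      Parameters:
        Environment: !Ref Environment
  # Child stack for Databricks workspace resources, consuming the network outputs
  DatabricksStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/example-templates/databricks.yaml
      Parameters:
        Environment: !Ref Environment
        VpcId: !GetAtt NetworkStack.Outputs.VpcId
```

Nesting child stacks under one parent lets a single deploy (or one CodeDeploy step) create, update, or roll back all Databricks resources per environment.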


CaliberMind

  • Helped design and build data engineering solutions that deliver pipeline patterns using Google Cloud Platform (GCP) services: BigQuery, Dataflow, Pub/Sub, Bigtable, Data Fusion, and Dataproc.
  • Supported existing GCP Data Management implementations.
  • Converted jobs written as SQL queries to PySpark.
  • Built and updated Spark-based jobs customizing customers' marketing data models.
  • Optimized poorly performing pipelines written in PySpark.
  • Wrote PySpark unit tests for the company's pipeline jobs.
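One common shape for the unit-testing work above is keeping transformation logic in pure functions that a PySpark job wraps in a UDF, so the logic is testable without a Spark cluster. The function name and channel mapping below are hypothetical illustrations, not the company's actual code:

```python
def canonical_channel(raw_source):
    """Map a raw marketing-source string to a canonical channel name.

    Hypothetical pure function that a PySpark job could register as a UDF;
    because it has no Spark dependency, it unit-tests instantly.
    """
    mapping = {
        "fb": "facebook",
        "facebook": "facebook",
        "adwords": "google_ads",
        "google": "google_ads",
    }
    if raw_source is None:
        return "unknown"
    return mapping.get(raw_source.strip().lower(), "other")


def test_canonical_channel():
    # pytest-style assertions covering normalization, nulls, and the fallback
    assert canonical_channel("FB") == "facebook"
    assert canonical_channel(" AdWords ") == "google_ads"
    assert canonical_channel(None) == "unknown"
    assert canonical_channel("bing") == "other"
```

In a pipeline job this function would be applied per row (for example via `pyspark.sql.functions.udf`), while CI runs the test without provisioning Spark.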