Data Engineer

Job Description

Job Role: Sr. Data Engineer

Candidates must be able to use a personal workstation; rental options are available. The company is a Microsoft shop.

Requirements

10+ years of experience in data engineering or a related role.

Proficiency in AWS services, specifically S3, Glue, Athena, and Lambda.

Experience with Hadoop ecosystem, including HDFS, MapReduce, and Hive.

Hands-on experience with Apache Hudi for real-time data management.

Strong programming skills in Python and PySpark.

Familiarity with SQL and NoSQL databases.

Knowledge of data governance and data security best practices.

Experience with workflow scheduling tools such as AWS Step Functions or Control-M.

Responsibilities

Design, build, and manage ETL data pipelines using AWS services such as S3, Glue, Athena, and Lambda.

Implement and manage real-time data streaming and batch processing solutions using Apache Hudi, PySpark, and Hadoop.

Use AWS Athena to run complex SQL queries over large datasets.

Use Python and PySpark to perform data transformation and data cleansing.

Ensure data quality and integrity by implementing proper data governance strategies.

Collaborate with cross-functional teams to meet business objectives.

Monitor performance and advise on any necessary infrastructure changes.

Develop technical documentation including data dictionaries, metadata, and pipeline architecture.

Troubleshoot data issues and provide ongoing operational support.