Cloud Data Developer/Engineer
Day-one onsite role in Nutley, NJ
12+ month contract
Job Description:
Required:
- Proven track record of developing, deploying, and supporting data analytics tools and solutions.
- Experience managing and coordinating with IT teams to maintain secure and compliant tools and applications.
- Experience developing and deploying cloud-based tools in a distributed computing environment using Spark.
- Excellent communication and presentation skills.
- Experience managing multiple workstreams and coordinating tasks with internal teams and outside consultants.
- Minimum 8-10 years of experience.
- Bachelor's degree required; Master's degree preferred, or equivalent industry experience.
Responsibilities:
- The Cloud Data Developer/Engineer will work on a team building data stores, data lakes, warehouses, pipelines, and machine learning models used to drive drug development through predictive modeling of disease and drug response.
- Will collaborate closely with data scientists and biostatisticians on statistical methodology and machine learning to support projects at various stages of development across multiple business groups.
- Will help deliver actionable insights while building mission-critical data science projects for the business.
Technical Skills Required:
- Experience with AWS and the range of services needed to deliver data, big data, and machine learning solutions.
- Experience with big data platforms such as Databricks and AWS services.
- Some experience cataloging data in big data scenarios such as data lakes, using tools like Databricks Unity Catalog, Informatica metadata tools, and the AWS Glue Data Catalog.
- Experience creating data pipelines for big data ingestion and processing using AWS services and Informatica Cloud.
- Experience with Python (including PySpark), R, and SQL.
- Ability to work with imaging files such as DICOM, including extracting metadata from the images and cataloging the data.
- Experience with data cleansing and dataset processing.
- Familiarity with deep learning frameworks such as Keras, PyTorch, and TensorFlow.
- Experience with analytics tools such as Tableau and Amazon QuickSight (visualization), Posit (formerly RStudio), and Databricks.
- Deep expertise in ETL/ELT, including the creation of destination files in formats such as Parquet and ORC.
- Experience implementing data access controls and privacy and security policies using AWS and other tools such as Immuta and Databricks Unity Catalog.
- Some experience with clinical data, image viewers, and SAS.