Sr. Data Engineer – Clinical Data Hub
India - Hyderabad
ABOUT THE ROLE
Clinical Data Hub
Amgen’s Clinical Data Hub (CDH) is an Information Technology product team chartered to identify, design, and implement technology that powers Amgen’s end‑to‑end drug development lifecycle. We are at an inflection point, accelerating through rapid, AI‑driven modernization to build clinical data products that enable both drug regulatory submissions and drug discovery and development. If you’re passionate about turning complex clinical data into resilient, scalable products that help speed life‑changing medicines to patients worldwide, this is a once‑in‑a‑decade opportunity to do your career‑best work on a global stage.
Role Description:
The role is responsible for designing, building, maintaining, analyzing, and interpreting data to provide actionable insights that drive business decisions. This role involves working with large datasets, developing reports, supporting and executing data governance initiatives, and visualizing data to ensure data is accessible, reliable, and efficiently managed. The ideal candidate has strong technical skills, experience with big data technologies, and a deep understanding of data architecture and ETL processes.
Roles & Responsibilities:
Design, develop, and maintain data solutions for data generation, collection, and processing
Be a key team member that assists in design and development of the data pipeline
Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems
Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions
Take ownership of data pipeline projects from inception to deployment, managing scope, timelines, and risks
Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs
Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency
Implement data security and privacy measures to protect sensitive data
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions
Collaborate with Data Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines that meet fast-paced business needs across geographic regions
Identify and resolve complex data-related challenges
Adhere to best practices for coding, testing, and designing reusable code/components
Explore new tools and technologies that will help to improve ETL platform performance
Participate in sprint planning meetings and provide estimations on technical implementation
Collaborate and communicate effectively with product teams
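For illustration, the pipeline and data-quality work described above can be sketched with Python's standard library. This is a minimal sketch only: the table names (`raw_labs`, `clean_labs`) and the quality rule are hypothetical stand-ins for real clinical sources, and production pipelines at this scale would typically use Spark or Databricks rather than SQLite.

```python
import sqlite3

# Minimal extract-transform-load sketch with a data-quality gate.
# Table names and the quality rule are hypothetical, for illustration only.
def run_etl(conn: sqlite3.Connection) -> int:
    conn.execute("CREATE TABLE IF NOT EXISTS clean_labs (subject_id TEXT, value REAL)")
    # Extract: read records from the source table
    rows = conn.execute("SELECT subject_id, value FROM raw_labs").fetchall()
    # Transform: keep only records passing a simple quality rule
    # (non-null subject, non-null positive value)
    clean = [(s, v) for s, v in rows if s is not None and v is not None and v > 0]
    # Load: write validated records to the target table
    conn.executemany("INSERT INTO clean_labs VALUES (?, ?)", clean)
    conn.commit()
    return len(clean)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_labs (subject_id TEXT, value REAL)")
    conn.executemany("INSERT INTO raw_labs VALUES (?, ?)",
                     [("S001", 4.2), ("S002", -1.0), (None, 3.3)])
    print(run_etl(conn))  # 1 record survives the quality gate
```

The same extract/validate/load shape carries over directly to PySpark DataFrames, where the quality rule becomes a `filter` and the load a write to a governed table.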
Basic Qualifications and Experience:
Master's or Bachelor's degree in Computer Science, IT, or a related field with 8–12 years of experience
Functional Skills:
Must-Have Skills:
Hands-on experience with big data technologies and platforms such as Databricks and Apache Spark (PySpark, SparkSQL), including workflow orchestration and performance tuning of big data processing
Hands-on experience with various Python/R packages for EDA, feature engineering, and machine learning model training
Proficiency in data analysis tools (e.g., SQL) and experience with data visualization tools
Excellent problem-solving skills and the ability to work with large, complex datasets
Strong understanding of data governance frameworks, tools, and best practices
Knowledge of data protection regulations and compliance requirements (e.g., GDPR, CCPA)
Good-to-Have Skills:
Experience with ETL tools such as Apache Spark and with various Python packages for data processing and machine learning model development
Strong understanding of data modeling, data warehousing, and data integration concepts
Knowledge of Python/R, Databricks, SageMaker, cloud data platforms
Clinical development domain knowledge is a plus
Professional Certifications:
Certified Data Engineer / Data Analyst (preferred on Databricks or cloud environments)
Certified Data Scientist (preferred on Databricks or cloud environments)
Machine Learning Certification (preferred on Databricks or cloud environments)
SAFe for Teams certification (preferred)
Soft Skills:
Excellent critical-thinking and problem-solving skills
Strong communication and collaboration skills
Demonstrated ability to work effectively in a team setting
Demonstrated presentation skills
Shift Information:
This position requires working a later shift and may be assigned a second- or third-shift schedule. Candidates must be willing and able to work evening or night shifts as business needs require.