Data Engineer
India - Hyderabad
As a Data Engineer supporting Law data strategy, you will design, build, and maintain scalable data pipelines that integrate data from legal systems into Amgen’s enterprise data fabric.
You will enable high-quality, governed datasets that support analytics, reporting, and emerging AI/ML use cases for Legal and Compliance teams.
This role requires strong hands-on engineering skills, familiarity with modern data platforms (e.g., Databricks), and the ability to work closely with Legal stakeholders, Data Architects, and AI/Analytics teams.
Key Responsibilities
Data Engineering & Pipeline Development
- Design, develop, and maintain data pipelines to ingest data from legal systems, third-party tools, and enterprise platforms
- Build and optimize ETL/ELT pipelines using modern frameworks (Databricks, Spark)
- Implement reliable, scalable, and production-ready data pipelines using engineering best practices, monitoring, and automated validation frameworks
- Integrate structured and unstructured legal data into the enterprise data fabric
- Ensure reliability, scalability, and performance of data pipelines
Databricks & Modern Data Platform
- Develop pipelines using Databricks (Delta Lake, Spark, notebooks)
- Implement data transformation and orchestration workflows
- Support migration and modernization of legacy data solutions to cloud-native platforms
- Contribute to reusable data engineering patterns and components
- Optimize Delta Lake and Spark workloads for scalable, cost-efficient, and high-performance enterprise data processing
Data Quality, Governance & Compliance
- Implement data quality checks, validation rules, and monitoring
- Implement governance, lineage, and security controls for sensitive legal and compliance datasets
- Ensure compliance with data governance, privacy, and legal/regulatory requirements (e.g., sensitive legal data handling)
- Maintain metadata, lineage, and documentation for legal datasets
AI & Advanced Analytics Enablement
- Build curated datasets that support AI/ML models and GenAI use cases
- Prepare structured and unstructured datasets for AI/ML and GenAI use cases including document intelligence and semantic search applications
- Enable feature engineering and data preparation for AI applications in Legal (e.g., document analysis, contract insights)
- Collaborate with data scientists and AI teams to ensure data readiness and accessibility
Collaboration & Delivery
- Work with Legal stakeholders to understand their data needs and translate them into technical solutions
- Partner with Data Architects to align with enterprise data fabric strategy
- Participate in Agile development processes (sprint planning, estimation, delivery)
- Document pipelines, models, and technical decisions
Basic Qualifications
- Master’s or Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field
- 5–8 years of experience in data engineering or a related technical role
Must-Have Technical Skills
- Strong experience with SQL and relational databases
- Programming experience in Python (required), PySpark preferred
- Hands-on experience with Databricks / Apache Spark
- Experience building ETL/ELT pipelines for large-scale datasets
- Familiarity with cloud platforms (AWS, Azure, or GCP)
- Understanding of data modeling and data warehousing concepts
Preferred / Strategic Skills (Aligned to Future Data Strategy)
- Certification:
  - Relevant certifications in Databricks, cloud platforms (AWS/Azure/GCP), or modern data engineering technologies are a plus
- Experience with:
  - Delta Lake / Lakehouse architectures
  - Data Fabric / Data Mesh concepts
  - Snowflake, Redshift, or enterprise data warehouse platforms
- Familiarity with:
  - Streaming data (Kafka, event-driven pipelines)
  - Data orchestration tools (Airflow, Databricks Workflows)
- Exposure to:
  - AI/ML data pipelines and feature engineering
  - Unstructured data processing (documents, legal text)
- Understanding of:
  - Data governance frameworks and cataloging tools
  - Security and privacy controls for sensitive data (legal/compliance)
Functional Skills
- Strong problem-solving and analytical thinking
- Ability to work with large, complex datasets
- Effective communication with both technical and non-technical stakeholders
- Ability to operate in a fast-paced Agile environment