Data Engineer
India - Hyderabad
JOB ID:
R-239629
COUNTRY:
India - Hyderabad
STATUS:
On Site
DATE POSTED:
Mar. 26, 2026
JOB CATEGORY:
Information Systems
Role Summary
- Build and operate large-scale healthcare data pipelines across batch workflows, metadata-driven ingestion, and data service publishing.
- Own end-to-end engineering from source ingestion to conformed data products, with strong focus on reliability, data quality, and operational observability.
- Partner with analytics, business, and platform teams to deliver trusted datasets for sales, claims, activity, patient, and rare disease use cases.
Key Responsibilities
- Design and maintain PySpark/SQL pipelines in Databricks for landing, unified, unstitched, and published data layers.
- Build and support Airflow DAGs for scheduling, dependencies, retries, and production operations.
- Implement metadata/config-driven frameworks for ingestion, transformation, and rule-based processing.
- Develop robust data quality controls, DQ summaries, failure handling, and alerting workflows.
- Manage batch/process audit logs, run status tracking, release flags, and operational reporting.
- Integrate multi-source data (files, APIs, cloud storage, and relational systems) into governed Delta/Spark tables.
- Optimize pipeline performance using partitioning, parallelization, and query tuning.
- Collaborate on schema evolution, business-rule onboarding, and production support.
Required Skills
- Bachelor’s degree in Computer Science, Information Technology, or a related field, plus 5-9 years of relevant experience.
- Advanced Python, PySpark, and SQL (window functions, complex joins, MERGE patterns, optimization).
- Hands-on Databricks and Airflow experience in enterprise environments.
- Experience with cloud data platforms (AWS), object storage, and secure secret handling.
- Strong data quality engineering, monitoring, and troubleshooting in regulated data contexts.
- Solid understanding of ETL orchestration, dependency management, and SLA-driven delivery.