Senior Data Scientist – Surfaceome Biology
India - Hyderabad
What you will do
Let’s do this. Let’s change the world. We are seeking a highly qualified and motivated Senior Data Scientist with a strong background in computational biology to join the Bioinformatics Technologies team within Amgen’s Automation, Research Data Systems, Informatics, and AI (ARIA) organization. ARIA is a multidisciplinary group embedded within Amgen’s discovery engine, leveraging advancements in digital technologies for disease modeling and digital modality engineering to accelerate the pipeline from target inception through drug development. Within ARIA, Bioinformatics Technologies serves as an innovation hub for developing, deploying, and applying emerging digital technologies in computational biology to drive the next generation of therapeutic discovery.
Cell surface proteins represent a critical interface between cells and their environment, and the systematic characterization of the surfaceome is essential for therapeutic discovery. By integrating diverse experimental and computational approaches, surfaceome analysis enables the identification of novel antigens, quantification of protein density and heterogeneity, assessment of internalization potential, and mapping of cell type–specific signatures. These insights are pivotal for advancing the next generation of targeted and multispecific drug modalities. In this role, you will develop and apply advanced computational and AI/ML methods to analyze and model surfaceome data, build scalable workflows and atlases, and collaborate with experimental teams to ensure computational predictions translate into actionable insights for therapeutic discovery. The successful candidate will possess strong analytical aptitude, deep technical expertise in computational sciences and AI/ML, and a solid understanding of molecular and cellular biology, proven by a track record of innovative and collaborative research.
RESPONSIBILITIES:
- Develop novel AI/ML, computational, and data science methods – including agentic AI systems – for processing, integrating, and analyzing multi-modal datasets (e.g., transcriptomics, proteomics, imaging, and single-cell multi-omics) to identify and prioritize novel and target-tissue specific cell surface antigens.
- Build workflows to harmonize and integrate public and internal datasets for constructing comprehensive multi-modal surfaceome atlases, and model protein expression heterogeneity across cell types, states, and spatial contexts.
- Design computational tools and predictive models for protein localization, topology, and internalization behavior to prioritize candidate surface proteins or antigen–ligand pairs for the discovery of multispecific antibody modalities and therapeutic payload delivery.
- Apply generative modeling and representation learning to propose novel antigen candidates or antigen combinations with desirable therapeutic attributes, such as accessibility, specificity, and internalization potential.
- Collaborate with experimental and translational teams to design and guide validation experiments, ensuring computational predictions translate into actionable biological insights.
What we expect of you
We are all different, yet we all use our unique contributions to serve patients. The dynamic professional we seek is a senior data scientist with these qualifications.
Basic Qualifications:
- Any degree and 8-13 years of directly related experience
Preferred Qualifications:
- Demonstrated expertise in computational and data science method development for short- and long-read transcriptomics, polysome sequencing, proteomics, multi-omics integration, or surfaceome characterization.
- Strong background in machine learning and AI, including deep learning and generative modeling; experience with pre-training, fine-tuning, and few-/zero-shot learning; ideally, experience with in silico modeling of cell surface protein biology.
- Proven track record of applying computational methods to drive biological insights, target validation, biomarker discovery, or therapeutic hypothesis generation.
- Proficiency in scientific programming and tool development using Python, R, or similar languages, and familiarity with relevant libraries and frameworks.
- Experience with large-scale data processing using cloud computing, workflow development, and software best practices (e.g., version control, continuous integration, test-driven development).
- Familiarity with agentic AI, digital innovation approaches, and FAIR data principles for building robust and scalable analytical workflows.
- Familiarity with molecular and disease biology, with the ability to contextualize computational findings in therapeutic discovery.
- Excellent analytical and communication skills, with the ability to extract and clearly present insights from complex data to diverse audiences with rigor and accuracy.
- Strong interpersonal and collaborative skills with demonstrated ability to thrive in cross-functional teams and effectively present results to diverse audiences.
- Creative, open-minded, and passionate about research, with a proven record of innovative algorithm and model development demonstrated through impactful publications, patents, or widely adopted tools.