Staff Data Scientist (Core Platform)
San Jose, CA
Full Time
Mid Level
Prealize Health
About Prealize Health
Prealize Health is a predictive analytics company that leverages machine learning and clinical expertise to help patients obtain better care, sooner. Most healthcare today is reactive – care is delivered when someone is already ill. We believe healthcare should be about keeping individuals and their families well to prevent them from ever becoming sick.
Building on years of published research from our founders at Stanford University, our mission is to provide patients, providers, and payers with the insights they need to improve health outcomes and prevent adverse medical events. We are committed to helping healthcare organizations "see around the corner" to take a proactive approach to wellness.
The Role
As a Staff Data Scientist focusing on Foundation Models, you will lead the development of our next-generation patient trajectory and risk prediction systems. You will serve as one of the technical leads for our custom transformer-based architectures, bridging the gap between state-of-the-art research in self-supervised learning and real-world healthcare applications.
This is a strategic, high-impact role where you will drive the evolution of our custom healthcare foundation model, shaping how we process millions of claims, lab results, and EHR records to influence the health trajectory of millions of patients.
Key Responsibilities
- Domain Expertise: Drive the end-to-end building and execution of our custom healthcare foundation models, translating high-level clinical use cases into concrete deep learning architectures and training objectives.
- Strategic Vision: Set the technical roadmap for patient risk prediction and health trajectory modeling, pioneering the use of transformer-based architectures in the healthcare domain.
- Methodological Excellence: Establish best practices for deep learning pipelines, including self-supervised pre-training, fine-tuning paradigms, and rigorous evaluation of longitudinal healthcare data.
- Technical Leadership: Own the full ML lifecycle—from data processing and research prototyping to production deployment—while mentoring junior data scientists in modern engineering practices.
- Cross-Functional Collaboration: Partner with clinicians to encode medical domain knowledge into model architectures and work with Engineering to productionize models with high reliability and low latency.
- Platform Innovation: Experiment with novel architectures and representation learning strategies to ensure our platform remains at the forefront of AI-driven healthcare insights.
- External Evangelism: Contribute to research initiatives and represent Prealize Health’s technical expertise in the broader machine learning and healthcare data science community.
Required Qualifications
- Education: PhD and/or MS in Computer Science, Machine Learning, Statistics, or a related quantitative field.
- Experience: 6–8+ years of experience (with 4+ years specifically building and deploying ML systems in production) with a proven track record of technical leadership.
- Deep Learning Expertise: Mastery of transformer architectures, attention mechanisms, and pre-training/fine-tuning paradigms. Hands-on experience with PyTorch or TensorFlow is mandatory.
- Programming & AI Tooling: Expert proficiency in Python and distributed computing (PySpark/Spark/SQL) for large-scale data processing. Proficiency in leveraging AI-assisted coding tools (e.g., Claude Code, Cursor, Codex) to accelerate development cycles and enhance code quality.
- Software Engineering Rigor: Strong skills in software design patterns, testing frameworks, CI/CD, and code quality practices.
- Strategic Mindset: Demonstrated ability to conduct independent research and translate complex findings into production systems that solve high-ambiguity problems.
- Communication: Exceptional ability to distill complex technical strategies and research findings for executive stakeholders and cross-functional teams.
Preferred Qualifications
- Healthcare Domain: Experience with large-scale structured healthcare data (Claims, ICD/CPT codes, EHR systems like Epic/Cerner).
- Advanced MLOps: Experience with MLOps tooling such as MLflow, Weights & Biases, and cloud platforms (AWS preferred).
- Specialized Modeling: Familiarity with causal inference, longitudinal modeling, or self-supervised representation learning.
We Offer:
- Flexible work environment
- Competitive base salary plus a generous bonus and equity plan
- Paid time off including holidays
- Medical, dental, vision
- 401k
- Wellness and home office benefits, and more
Pay Transparency:
The target salary range is $180,000 to $220,000 annually. Base pay offered may vary within the posted range based on several factors, including but not limited to education, job-related knowledge, skills, experience, and location.
Diversity, Equity & Inclusion:
Prealize embraces diversity and equal opportunity in a serious way. We are committed to building a team that unites a variety of backgrounds, perspectives, and skills. The more inclusive we are, the greater our impact will be.