Senior Staff Software Engineer, Data Platform
at Valo Health
Boston, Massachusetts, United States
Location: San Francisco/Boston
Valo Health is a biotechnology company that was created with the belief that drug discovery and development should be faster and less expensive, with a much higher probability of success. To achieve this goal, we are pioneering a novel, fully integrated approach that combines data and machine learning insights at every step of the process. We are a multi-disciplinary team that brings together experts at every phase of software and drug development to create a cohesive platform. Our end goal is to create life-changing medical treatments by combining expertise in technology and life sciences with a comprehensive view of the entire drug discovery and development process.

Valo is committed to hiring a world-class team that brings together a wide variety of different skills and experiences. We are committed to inclusion across race, gender, age, religion, identity, and experience, and believe that diversity makes us stronger by bringing in new ideas and perspectives. We strive to create a workplace that cultivates bold innovation through collaboration and empowers our people to unleash their full potential.

About The Role

Valo is looking for an experienced Sr. Staff Software Engineer to build out our integrated data platform for Opal to deliver regulatory-grade analysis, better train our algorithms and models, and identify unique insights by enabling data fusion across disparate data sets. We are taking on hard engineering problems that are found in few other places throughout industry, so we are looking for engineers with the flexibility and ingenuity to match.

Opal is a fully integrated, AI-powered, cloud-native platform that leverages human-centric data to create new approaches to drug discovery and development, enabling researchers to minimize the cost and time associated with the discovery, development, and delivery of novel therapeutics.
The predictive insights produced by Opal rely on high-quality, high-density human-centric data that is sourced from multiple data sets and processed both remotely and on site through a highly complex process. You will be responsible for designing and implementing features as well as onboarding users to Valo’s data platform. This is a multi-faceted role that includes understanding, designing, coordinating, and implementing platform features related to ingestion, normalization, processing, ontologies, data cataloging, security, data isolation, ML model building, discoverability, data lineage, and reproducibility.

Our platform must support all human-centric data relevant to drug discovery and development: everything from preclinical assays to publicly available ’omics data to real-world data (RWD) to clinical trial data. All of it needs to be harmonized, joinable, and reusable to enable our cutting-edge Data Science and Machine Learning.

The role will require you to exhibit strong technical judgement and mentorship skills to help shape direction and grow other engineers. Successful candidates will exhibit the following leadership traits:
- Customer Obsession: Mission- and vision-oriented product evangelist. Comfortable with being the face and voice of the product and mission in the market and across the company. Builder mentality. Sees ambiguity as opportunity, and obstacles as chances to build.
- Think Big: Innovative and creative, with a vision that transcends what is visible today.
- Earn Trust: Ability to build credibility and rapport with the executive team to drive collaboration and coordination with key stakeholders. High Emotional Quotient (EQ) with strong communication and influencing skills, to create corporate-wide alignment around product vision.
- Dealing with ambiguity: Comfortable charting new territory, navigating with imperfect information, and weighing trade-offs when making decisions.
- Bias for Action: High energy, low ego, and a focus on finding data-driven solutions.

What You’ll Do…
- Lead the definition of platform architecture, including storage designs, pipelines, data APIs and self-service tooling for diverse types of data (structured and unstructured), and diverse workloads/dataflows (transactional, analytics, ML pipelines, research data science)
- Design individual components of that architecture across the same range of data types and workloads
- Practice engineering excellence daily, from establishing requirements and design processes to code development and robust testing strategies for data pipelines and software features.
- Champion the self-service infrastructure and features of the data platform by writing clear documentation and partnering with data owners as they learn the platform
- Integrate data governance, isolation, and security into all facets of the platform
- Mentor and develop other engineers

What You Bring...
- 8+ years of software engineering experience with at least 5 years focused on the data discipline (data systems, data engineering, data governance, and data pipelining)
- Experience building data infrastructure using modern cloud technologies and working with data ecosystem tools: Data Processing (e.g. Spark), Workflow Orchestration (e.g. Airflow), Cloud Data Warehouse/Data Lake technologies (e.g. Snowflake, Databricks), Data Observability and Cataloging
- Product mindset for internal customers to champion adoption of data platform infrastructure and technologies
- Development experience in a professional setting using Python/Java, SQL/Spark
- A strong understanding of key AWS technologies: S3, Glue, EC2, EMR, etc.
- B.S. or M.S. in Computer Science or a similar technical field, or equivalent experience

You May Also Bring...
- Experience in drug discovery and development and working with laboratory systems, real-world medical or genomic data
- Experience implementing controls for regulatory compliance and data governance applications such as HIPAA, GDPR, 21 CFR Part 11, FDA data submissions using Real World Data, data isolation and use restrictions, etc.
- Experience translating disparate datasets into unified, coherent data models (entity-relationship models, hierarchical ontologies, and/or graph models)

More on Valo

Valo Health, LLC (“Valo”) is a technology company built to transform the drug discovery and development process using human-centric data and artificial intelligence-driven computation. As a digitally native company, Valo aims to fully integrate human-centric data across the entire drug development life cycle into a single unified architecture, thereby accelerating the discovery and development of life-changing drugs while simultaneously reducing costs, time, and failure rates. The company’s Opal Computational Platform™ is an integrated set of capabilities designed to transform data into valuable insights that may accelerate discoveries and enable Valo to advance a robust pipeline of programs across cardiovascular metabolic renal, oncology, and neurodegenerative disease. Founded by Flagship Pioneering and headquartered in Boston, MA, Valo also has offices in Lexington, MA, San Francisco, CA, Princeton, NJ, and Branford, CT. To learn more, visit .

Voluntary Self-Identification

For government reporting purposes, we ask candidates to respond to the below self-identification survey. Completion of the form is entirely voluntary. Whatever your decision, it will not be considered in the hiring process or thereafter. Any information that you do provide will be recorded and ma