At H1, we believe access to the best healthcare information is a basic human right. Our mission is to provide a platform that can optimally inform every doctor interaction globally. This promotes health equity and builds needed trust in healthcare systems. To accomplish this, our teams harness the power of data and AI technology to unlock groundbreaking medical insights and convert those insights into actions that result in optimal patient outcomes and accelerate an equitable and inclusive drug development lifecycle. Visit h1.co to learn more about us.
Data Engineering has teams which are responsible for collecting, curating, normalizing and matching data from hundreds of disparate sources from around the globe. Data sources include scientific publications, clinical trials, conference presentations and claims among others. In addition to developing the necessary data pipelines to keep every piece of information updated in real-time and provide the users with relevant insights, the teams are also building automated, scalable and low-latency systems for the recognition and linking of various types of entities, such as linking researchers and physicians to their scholarly research and clinical trials. As we rapidly expand the markets we serve and the breadth and depth of data we want to collect for our customers, the team must grow and scale to meet that demand.
WHAT YOU’LL DO AT H1:
As a Senior Engineer on the Data Engineering team, you will work alongside a multi-disciplinary team of software engineers, machine learning engineers, product managers, front-end engineers, and designers. You will use and adapt various types of algorithms to solve challenging business problems in areas including entity recognition and resolution, natural language understanding, knowledge graphs and information systems. You will also design novel experiments and create implementations to enable model integration into the production stack. Much needs to be built, and quickly, so you will need a good understanding of system design and the ability to build fast and iterate.
You will:
- Build data-based software products that use large amounts of data of various types – structured and unstructured; numeric, text, and graph. These software products will be responsible for ingesting, cleaning, transforming and efficiently storing data.
- Build relevant data processing capabilities and optimization workflows to support large-scale learning from such multi-modal data.
- Develop scalable distributed pipelines for processing large amounts of data quickly and efficiently.
- Develop scalable, distributed microservices to serve a large volume of queries and data.
- Develop efficient distributed algorithms for processing and joining large datasets.
- Develop signals models for various analytical tasks such as synonyms, topic modeling, similarity, recommendations, search ranking, etc.
- Write simple, modular code that is easy to understand and easy to maintain.
- Build quick Proofs of Concept (POCs) and take ownership of projects to demonstrate utilization and value, and drive them to production-ready solutions.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
ABOUT YOU
You have strong hands-on technical skills including conventional ETL and SQL skills, experience with multiple programming languages like Python, Java or Scala, as well as streaming or other data processing techniques. You are a self-starter with the ability to manage projects through all stages (requirements, design, coding, testing, implementation, and support).
REQUIREMENTS
- 5+ years of experience working with strong engineering teams and deploying products.
- Strong coding skills in Python, Java, Scala or another language in which you are proficient, and in stacks supporting large-scale data processing and/or machine learning.
- Experience with Docker/Kubernetes.
- Strong grasp of computer science fundamentals: data structures, algorithmic trade-offs, etc.
- Strong knowledge and understanding of concepts in machine learning is desirable.
- Experience with ML and deep learning frameworks (e.g., TensorFlow), AutoML techniques (e.g., hyperparameter optimization) and large-scale distributed training and optimization approaches is a plus.
- Willingness to manage projects through all stages (requirements, design, coding, testing, implementation, and support).
Not meeting all the requirements but still feel like you’d be a great fit? Tell us how you can contribute to our team in a cover letter!