Senior Machine Learning Ops Engineers
Job Description
Remote is seeking Senior Machine Learning Ops Engineers to join the team at ASD. Senior ML Ops Engineers help set the organisation's overall ML Ops strategy and manage complex projects. They work closely with cross-functional teams, including data scientists, engineers and business stakeholders, to ensure that ML Ops initiatives are aligned with industry standards and business goals. The role is responsible for building and maintaining ASD’s ML Operations platform. This position is well suited to a candidate with strong software engineering or data engineering expertise who has had exposure to contemporary Machine Learning practices and technologies.
The ideal candidate has experience with all parts of the MLOps lifecycle, including the registration, deployment, and monitoring of operational capabilities. You will be expected to deliver key platforms and integrations that give Data Scientists self-service capabilities for continuous integration, continuous deployment, continuous training, and continuous monitoring. (LH-02325)
Role Description
The Senior Machine Learning Ops Engineer will perform the following duties and responsibilities:
- Design, develop, and maintain production MLOps platforms specific to ASD.
- Deploy, monitor, and troubleshoot ML models in production environments.
- Design and implement MLOps pipelines for deploying ML models to production.
- Review and optimise production ML code.
- Work with open-source technology and modern computing infrastructure.
- Work with other engineers to ensure successful integration into enterprise software.
- Work with data scientists to ensure that ML models are well tested and reliable.
Essential:
- DENG 4 (Data Engineering)
- MLNG 5 (Machine Learning)
- RELM 4 (Release Management)
- SINT 4 (Systems Integration and Build)
Desirable:
Experience in one or more of the following areas:
- Software development with Python.
- MLOps tooling such as MLflow, Ray, Flyte, Kubeflow, KServe or an enterprise ML platform.
- Building DevOps pipelines with GitLab CI/CD.
- Kubernetes, OpenShift, Docker.
- Data engineering and data wrangling.
- Optimisation of models with one of the following high-performance frameworks: Hugging Face Candle, OpenVINO, TensorRT, FasterTransformer.
- Working with C++, Rust, or Go, specifically for production ML systems.