Added 4 days ago

Location: Ho Chi Minh City

Job type: Permanent

Salary: 95,000,000-110,000,000 (VND)

Category: Information Technology

Experience: 3-5 Years

Industry: Information Technology



Job summary

The Day-to-Day Activities:

  • Deliver high-quality AI infrastructure solutions: You will work with the Machine Learning Platform team to design and develop the infrastructure that supports distributed data processing and model training. You will use GitOps to ensure the system's cloud infrastructure is reproducible across different Kubernetes clusters.
  • Develop observability solutions for Machine Learning pipelines: You will be responsible for developing and integrating monitoring and alerting within Grab's monitoring stack, powered by Datadog, Prometheus, and Grafana. You will also contribute to the creation of runbooks and DevOps guides.

The Must-Haves:

  • Understanding of Terraform and popular modules, such as those for EKS.
  • Ability to understand complex code bases and analyze dependencies.
  • Understanding of Kubernetes and experience in managing large clusters.
  • Understanding of core AWS cloud concepts such as EC2, Auto Scaling groups, launch templates, and subnets.
  • Understanding of core cluster components such as CoreDNS, the autoscaler, CSI drivers, load balancer controllers, and service mesh.
  • Ability to perform zero-downtime cluster upgrades for clusters serving critical online traffic and batch jobs with tight SLAs.

Job Responsibilities

Get to Know the Team:

The ML Platform team empowers teams across the company to harness the power of machine learning. We're building cutting-edge tools and infrastructure to drive innovation and automation throughout the company.

Get to Know the Role:

As a DevOps Engineer in the ML Platform team at Adecco's client, you will contribute to the creation and maintenance of machine learning infrastructure. This is a role focused on Infra/SRE, embedded within the ML Platform team. You will be responsible for supporting the maintenance, upgrades, and improvements of our infrastructure, as well as providing ongoing support.

The Day-to-Day Activities:

  • Deliver high-quality AI infrastructure solutions: You will work with the Machine Learning Platform team to design and develop the infrastructure that supports distributed data processing and model training. You will use GitOps to ensure the system's cloud infrastructure is reproducible across different Kubernetes clusters.
  • Develop observability solutions for Machine Learning pipelines: You will be responsible for developing and integrating monitoring and alerting within the monitoring stack, powered by Datadog, Prometheus, and Grafana. You will also contribute to the creation of runbooks and DevOps guides.

Experience requirements

  • 5+ years of experience as a DevOps Engineer.
  • Prior experience in MLOps or related fields is a plus.
  • Understanding of Terraform and popular modules, such as those for EKS.
  • Ability to understand complex code bases and analyze dependencies.
  • Understanding of Kubernetes and experience in managing large clusters.
  • Understanding of core AWS cloud concepts such as EC2, Auto Scaling groups, launch templates, and subnets.
  • Understanding of core cluster components such as CoreDNS, the autoscaler, CSI drivers, load balancer controllers, and service mesh.
  • Ability to perform zero-downtime cluster upgrades for clusters serving critical online traffic and batch jobs with tight SLAs.
  • Fluent communication in English (you will work with regional teams).

Education requirements

A relevant degree in Computer Science, Software Engineering, or a related field.


For more information, contact the person in charge (PIC) at hieungan.nguyen@adecco.com or (+84) 389 910 169 (Phone/Zalo/WhatsApp).

Contact Person

  • Hieu Ngan Nguyen
  • Adecco
  • Tel.