Thrive Career Wellness Platform

DevOps/Machine Learning Engineer

Remote
Toronto, Ontario, Canada
Full-time · March 19, 2025

Description

Dev / ML Ops

Base Salary: $120k-$150k
Remote within Canada

Job Overview:
We are seeking a highly skilled DevOps Engineer to manage and optimize our AWS cloud infrastructure while supporting ML Ops initiatives. This role will focus on ensuring our cloud systems are secure, scalable, and efficient, while also enabling seamless deployment and operation of machine learning workflows.

Key Responsibilities:
  • Cloud Infrastructure Management: Design, implement, and maintain robust, scalable, and cost-efficient cloud solutions on AWS.
  • Automation & CI/CD: Build and maintain CI/CD pipelines to automate infrastructure provisioning, application deployments, and system monitoring.
  • Monitoring & Optimization: Develop monitoring solutions to ensure performance, reliability, and cost-effectiveness of cloud infrastructure.
  • Security: Implement cloud security best practices, including IAM, network configurations, and encryption strategies.
  • ML Ops Support: Collaborate with the AI team and engineers to operationalize machine learning models, ensuring smooth integration into production systems.
  • Containerization & Orchestration: Use tools like Docker to containerize applications and orchestration tools like Kubernetes to manage clusters effectively.
  • Collaboration: Partner with software developers, data engineers, and other stakeholders to streamline workflows and ensure infrastructure aligns with business needs.
  • Documentation: Maintain comprehensive documentation of infrastructure, processes, and best practices for internal use and onboarding.

Qualifications:

  • Experience: 3+ years in DevOps or a related role, with exposure to ML Ops workflows.
  • Technical Skills:
      • Expertise in AWS services (e.g., EC2, S3, Lambda, EKS, SageMaker).
      • Proficiency in Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
      • Hands-on experience with CI/CD tools like GitHub Actions or GitLab CI/CD.
      • Strong skills in containerization (Docker) and orchestration (Kubernetes).
      • Proficiency in scripting languages such as Python, Bash, or PowerShell.
  • ML Ops Knowledge: Familiarity with SageMaker, Kubeflow, MLflow, or equivalent tools for machine learning operations.
  • Monitoring Tools: Experience with observability tools like CloudWatch, Prometheus, Grafana, or similar.
  • Problem-Solving: Strong troubleshooting skills for cloud and system-related issues.
  • Communication: Clear and effective communication skills to collaborate across technical and non-technical teams.

Nice-to-Have:
  • Experience managing data engineering workflows or working with platforms like Databricks.
  • Knowledge of serverless architecture and event-driven systems.
  • Familiarity with cloud cost management tools and strategies.
  • Exposure to advanced security compliance frameworks and practices.
  • Familiarity with Ruby on Rails.
 
Apply Now:

Join a team that values diversity and inclusivity. Thrive Career Wellness is proud to be an Equal Opportunity Employer. Should you require accommodation during the hiring process, please let us know. Applicants must be legally entitled to work in Canada.

Step into a role that empowers you to be at the forefront of career wellness innovation. Apply today and join us in making careers thrive!

Compensation

$120,000.00 - $149,969.00 per year
