[Remote] Senior Cloud Operations Engineer

Work from home | Full-time role | Hiring

Note: This is a remote position open to candidates in the USA. The Linux Foundation is a driving force in fostering open source collaboration and supporting communities across a range of projects, including PyTorch. The foundation is seeking a Senior Cloud Operations Engineer to focus on infrastructure operations for the PyTorch project: automating processes, optimizing cloud-native tools, and ensuring a robust and scalable cloud environment.

Responsibilities

  • Manage multi-cloud environments, primarily focusing on AWS services (EKS, EC2, S3, IAM, ELB)
  • Contribute to architectural exercises with the open source community and technical leads to validate new cloud infrastructure
  • Implement and maintain infrastructure-as-code using Terraform via pytorch/ci-infra and pytorch/test-infra
  • Optimize cloud resource utilization and implement FinOps practices for cost management and reporting
  • Design, implement, and maintain CI/CD pipelines using GitHub Actions and Actions Runner Controller (ARC), including runner configurations and other elements of the CI ecosystem
  • Debug and triage issues in build and test pipelines, drawing on experience with unit testing
  • Develop monitoring and alerting solutions for CI/CD workflows and critical infrastructure
  • Manage and optimize Cloudflare CDN deployments for PyTorch assets (R2/S3)
  • Implement best practices for CDN and overall infrastructure security
  • Develop comprehensive monitoring and observability solutions using Datadog, AWS CloudWatch, and other telemetry data collection and processing tools
  • Review and recommend monitoring solutions as project and community needs evolve
  • Participate in on-call rotations supporting operations and incident response using incident.io
  • Establish and maintain escalation procedures and resolution processes
  • Participate in ci-infra and multi-cloud working groups and support architecture decisions
  • Collaborate with external contributors and promote DevOps best practices
  • Manage GitHub repositories, including user onboarding and access control
  • Attend and contribute to technical meetings, including Infrastructure, CI Workflow, and Technical Advisory Council sessions
  • Develop and maintain technical documentation for infrastructure and processes
  • Provide guidance on developer best practices and tooling
  • Create and update runbooks for common operational tasks and incident response

Skills

  • Ability to work with communities made up of industry specialists and collaborate outside of the Linux Foundation
  • Bachelor's degree in Computer Science, Engineering, or related field
  • 7+ years of experience in cloud operations with significant AWS expertise
  • Strong knowledge of infrastructure-as-code principles and tools, particularly Terraform
  • Proficiency in scripting languages (Python, TypeScript, Bash) and containerization technologies (Docker, Kubernetes)
  • Experience with Cloudflare CDN management and optimization
  • Expertise in implementing and managing monitoring solutions, specifically Datadog and AWS CloudWatch
  • Familiarity with incident management tools and processes, particularly incident.io
  • Demonstrated experience in CI/CD pipeline design and implementation
  • Strong problem-solving skills and ability to troubleshoot complex systems
  • Excellent communication skills and experience collaborating with open source communities
  • Experience with PyTorch or other open source communities
  • Multi-cloud expertise across AWS, GCP, and Azure
  • GitHub ARC experience
  • Knowledge of FinOps principles and cloud cost optimization strategies
  • Contributions to open source projects, especially in infrastructure management roles
  • Familiarity with the Linux Foundation or similar open source foundations
  • Experience mentoring other engineers and fostering a collaborative team environment

Benefits

  • The Linux Foundation maintains a predominantly remote workforce
  • The organization is committed to hiring top-notch talent
  • It provides a flexible and supportive work culture
  • Collaboration is embedded in the organization's DNA
  • Teams work closely together without being confined to a traditional office space

Company Overview

  • The Linux Foundation is the organization of choice for the world's top developers and companies to build ecosystems that accelerate open technology development and commercial adoption. It was founded in 2000 and is headquartered in San Francisco, California, USA, with a workforce of 201-500 employees. Its website is http://www.linuxfoundation.org.