
[Remote] Manager, Site Reliability Engineering

Work from home · Full-time role · Hiring

Note: This is a remote position open to candidates in the USA. Paradigm is a software company transforming the residential, construction, and building products industries. It is seeking a Manager of Site Reliability Engineering to lead a high-performing team, promote modern SRE practices, and enhance reliability across its Azure-based platform.

Responsibilities

  • Lead and grow a team of site reliability engineers. Provide guidance, mentorship, and career development
  • Contribute to and mature SRE practices across production services: SLOs, SLIs, error budgets, toil reduction, and blameless post-mortems that turn incidents into lasting improvements
  • Oversee the incident management lifecycle end-to-end including detection, response, resolution, post-incident review, and systemic improvement
  • Design on-call rotations, runbooks, and escalation procedures that balance service reliability with engineer well-being and sustainable work practices
  • Drive measurable reductions in MTTR and MTTD through improved observability, intelligent automation, and predictive monitoring
  • Build automation to eliminate manual operational work including provisioning, deployment, scaling, self-healing, and reporting
  • Implement chaos engineering practices to validate system resilience and surface weaknesses before they cause outages
  • Partner with engineering and product teams to embed reliability requirements into the development lifecycle, from design through deployment
  • Collaborate with the observability team to ensure comprehensive instrumentation, smart alerting, and actionable dashboards across all critical services
  • Measure, report, and advocate for reliability improvements with both technical and executive stakeholders using data to drive investment decisions

Skills

  • Bachelor's degree in Engineering or a related field, or equivalent experience
  • 7+ years in site reliability engineering, DevOps, or infrastructure engineering, with at least 1 year in people management (or demonstrated tech lead experience with direct influence over team processes and career growth)
  • Hands-on experience running production systems on Azure (including proficiency with key services such as AKS, App Services, Service Bus, Event Grid, and Azure Monitor) or comparable cloud platforms
  • Proven track record implementing SRE practices with measurable reliability improvements and familiarity with modern observability platforms (Datadog, Prometheus/Grafana, or equivalent)
  • Experience leading incident response for high-severity production issues and running effective post-mortems
  • Strong background in automation, infrastructure as code (Terraform, Bicep, or similar), and systematically eliminating manual operational work
  • Production-grade operational experience with Kubernetes container orchestration
  • Ability to automate workflows and build scripts using Python, Bash, PowerShell, or Go
  • Strong communication with the ability to make complex technical issues clear for both engineers and executives
  • Data-driven approach. You use metrics and telemetry to guide decisions, not gut feel
  • You collaborate cross-functionally and build trust and alignment naturally
  • AI-enhanced observability experience is preferred
  • Experience with AI coding assistants and CI/CD systems (GitHub Actions, Azure DevOps, ArgoCD) with automation capabilities is preferred
  • Knowledge of distributed systems patterns is preferred
  • Exposure to AIOps platforms or using LLMs for operational automation is preferred

Company Overview

  • Paradigm provides a software platform that focuses on the building products industry. It was founded in 1999, and is headquartered in Middleton, Wisconsin, USA, with a workforce of 501-1000 employees. Its website is http://myparadigm.com/.

Company H1B Sponsorship

  • Paradigm has a track record of offering H1B sponsorships: 1 in 2026, 1 in 2025, 4 in 2024, 1 in 2023, 1 in 2022, 4 in 2021, and 1 in 2020. Please note that this does not guarantee sponsorship for this specific role.