Application - Cloud Engineer
Job title: Application - Cloud Engineer in Dallas, TX at Genuent
Company: Genuent
Job description:
Title: Application - Cloud Engineer
Location: Remote
Duration: Direct Hire
Salary: $9,000 - $10,200 monthly
Work Requirements: US Citizenship is Required - Ability to pass enhanced background screen (criminal, financial, drug) for Public Trust clearance.
Seeking a skilled and motivated Application Cloud Engineer to join our dynamic team. The ideal candidate will be responsible for maintaining cloud-based applications and infrastructure on AWS. You will work closely with development, operations and security teams to ensure the scalability, performance and security of cloud applications.
Responsibilities:
- Provision and manage AWS infrastructure as code (IaC) with tools such as Terraform and CloudFormation
- Monitor and troubleshoot production systems using AWS CloudWatch and other observability tools
- Collaborate with developers to containerize and deploy applications using ECS and Lambda
- Deploy applications across multiple environments (dev, staging, prod) and ensure consistency and stability
- Monitor deployments and system health using CloudWatch and other tools
- Implement rollback strategies and manage version control during deployments
- Troubleshoot and resolve deployment issues and improve pipeline performance and reliability
- Proficient with Python, Bash, YAML/JSON, Node.js, Lambda functions
- Perform daily health checks using the AWS CLI or scheduled Lambda scripts, and log and report the results
- Document deployment processes and infrastructure architecture
- Familiarity with image registries like Amazon ECR and CI/CD pipelines for container deployment
- Collaborate with development and DevOps teams to ensure applications are stateless and fault-tolerant
- Implement enhancements to containerized environments on ECS, focusing on scalability, performance and observability
- Enhance container orchestration strategies, including auto-scaling, rolling deployments and upgrades
- Support feature branch testing, merge request validation and artifact promotion workflows
- Ensure pipeline security and compliance through automated code scanning and approval gates
- Remediate OS-level, container and dependency vulnerabilities
- Orchestrate failover and restoration of ECS/EKS services, Lambda functions, databases and other infrastructure components
- Test and document regional failover playbooks and recovery runbooks
- Ensure compliance with RTO (Recovery Time Objective) and RPO (Recovery Point Objective) requirements
- Participate in on-call rotations to support 24/7 production systems and respond to incidents as they arise
- Diagnose and resolve production issues related to cloud services, container orchestration, databases and CI/CD pipelines
- Follow and improve incident response playbooks, escalation procedures and communication workflows
- Automate common operational tasks and improve alert accuracy to reduce on-call fatigue
- Log incidents, changes and operational metrics in the tracking system
Qualifications:
- BA/BS in IT, Computer Science or related field (equivalent work experience may be accepted in lieu of the degree)
- 5+ years of IT experience. 2+ years of hands-on experience with AWS and cloud-based deployment strategies
- Proficient in scripting languages like Python, Bash and Node.js
- Hands-on experience with CI/CD tools (GitHub, GitLab, Kubernetes, DevOps, CI)
- Knowledge of disaster recovery planning and implementation
- AWS or other relevant cloud certifications (e.g., AWS DevOps Engineer, Solutions Architect Associate)
- Solid understanding of cloud architecture principles, autoscaling strategies and load balancing
- Proficient with monitoring, alerting and logging tools
- Strong written and verbal communication skills for technical and non-technical stakeholders
- Excellent analytical and problem-solving skills
- Must be a US citizen
- Must be able to obtain and maintain a Public Trust clearance
- Familiarity with container orchestration (Docker, ECS, Kubernetes)
- Knowledge of ITIL practices or incident management frameworks