Site Reliability Engineer
Job Description
REQUIREMENTS
- Bachelor’s degree in Computer Science, Engineering, or a related technical field.
- 5+ years of experience in SRE, DevOps, or infrastructure engineering.
- Strong proficiency with UNIX/Linux operating systems and command-line administration.
- Experience with cloud platforms (GCP, AWS, or Azure) and Infrastructure-as-Code (Terraform, CloudFormation).
- Hands-on experience with configuration management (Ansible, Chef, or Puppet).
- Solid understanding of networking fundamentals (TCP/IP, HTTP, DNS).
- Experience with containerization and orchestration (Docker, Kubernetes).
- Proficiency with monitoring and observability tools (Datadog, Prometheus, Grafana).
PREFERRED
- Experience managing bare-metal server infrastructure and datacenter operations.
- Proficiency with scripting languages such as Python, Go, or Bash.
- Background in high-availability architectures and disaster recovery planning.
- Experience with game server infrastructure or real-time application hosting.
- Previous experience working in a fully remote, distributed environment.
RESPONSIBILITIES
- Design and implement resilient multi-region infrastructure that handles millions of concurrent sessions.
- Lead hybrid cloud migration strategies, integrating bare-metal resources with cloud services.
- Own on-call rotations and incident response procedures to maintain high-availability SLAs.
- Architect monitoring and alerting systems to proactively identify performance bottlenecks.
- Collaborate with development teams to implement infrastructure-as-code and CI/CD pipelines.
- Optimize system performance through capacity planning, load testing, and resource allocation.
- Drive automation initiatives to reduce manual operational overhead.