Job Description
Job Overview
Cloudflare is seeking talented Systems Reliability Engineers (SREs) to enhance and operate its Edge platform, which serves a global network spanning 320 cities in over 120 countries. Ideal candidates share a passion for automation, scalability, and operational excellence, and work well in a collaborative environment that supports continuous learning and diversity. The role involves building tools that improve service availability and performance while managing a portfolio of applications and services.
Technical Requirements
Required Skills
- Linux systems experience
- Software development skills in Go or Python
- Understanding of distributed software systems and large-scale system design tradeoffs
- Intermediate knowledge of common network protocols such as DNS and HTTP
Preferred Skills
- Experience with the Linux kernel and software packaging
- Performance analysis and debugging
- Configuration management systems such as SaltStack, Chef, Puppet or Ansible
- Load balancers and reverse proxies such as Nginx, Varnish, HAProxy, Squid or Apache
- SQL databases
- Time series databases and monitoring tools such as OpenTSDB, Graphite, Prometheus or Grafana
Experience Level
3 years of experience in an SRE role or a similar function
Responsibilities
- Build and operate the Edge platform
- Support services in a follow-the-sun model
- Focus on the immediate state and functionality of the Cloudflare platform
- Leverage monitoring, alerting and diagnostic tools while developing platform capabilities
- Nurture an automate-everything approach to improve system resilience and scalability
Additional Information
- Location: London, UK
- Type: Hybrid
- Compensation: Not specified