Site Reliability Engineer

Last updated one month ago
Location: Redmond, Washington
Job Type: Full Time

As the world’s largest software company, Microsoft holds itself to high standards to ensure that our customers’ expectations are met. The role of the Site Reliability Engineer (SRE) is to help the Microsoft cloud security group provide the high level of availability, performance, cost efficiency, and supportability that our customers should and do expect across their Azure cloud environments. You will be expected to confront real-world, large-scale challenges across some of the world’s most complex cloud deployments. We are passionate about enabling customers and team members to deliver agile, reliable, high-performance solutions at scale.
We are looking for a team player to help us optimize and protect the software and systems behind our internal and customer offerings, keeping an ever-watchful eye on their reliability, latency, performance, and capacity.

Responsibilities

  • Deploy and maintain our production infrastructure hosted on Azure
  • Analyze complex system behavior, performance, and application issues
  • Analyze and plan capacity for our cloud services
  • Apply modern engineering practices to drive down operational overhead through automation and system design
  • Promote security excellence across a broad set of internal and external customers
  • Define and create standard operating procedures for support teams
  • Ensure all infrastructure and application alerts are actionable or trigger self-healing automation
  • Work closely with the service development team, offering education and guidance on integration, support, and monitoring across the toolset
  • Serve as the Tier 3 escalation point for support, responsible for troubleshooting as well as mentoring and coaching others
  • Demonstrate complex troubleshooting skills and deep knowledge of the services running on the infrastructure, and work with engineers and vendors to resolve issues
  • Live Site Management: as an SRE, you will play a crucial role in a global team driving huge-scale live sites 24/7 and gaining a deep understanding of availability, performance, and security
  • Automate processes
  • Conduct periodic on-call duties
  • Work cross-team with Azure

Qualifications

Required Qualifications:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
  • Experience in cloud environments (Azure/AWS/GCP)
  • Proven technical troubleshooting and performance tuning experience
  • Experience with distributed systems, networking, hardware, logistics and operations, or capacity planning
  • Strong written and oral communication skills required
  • Ability to contribute to multiple projects/demands simultaneously
  • 2+ years of experience with Linux system administration
  • 3+ years of experience handling critical production incidents
  • 2+ years of software development experience, or a BS degree in Computer Science, Electrical & Computer Engineering, or Mathematics, or equivalent experience

Preferred Qualifications:

  • 3+ years of experience with a monitoring system (Pingdom, Datadog, Splunk, Grafana, Azure Monitoring)
  • Experience defining and measuring internal/customer-facing OLAs/SLAs
  • 3+ years of service automation experience using scripting tools: Python/PowerShell/Bash (PowerShell preferred)


Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.