⭐ Featured

Jobright.ai

Director, Site Reliability Engineering

This role is for a Director of Site Reliability Engineering responsible for e-commerce platform reliability and for managing the technology roadmap and budget. It requires 10 years of SRE experience in the e-commerce industry and expertise in AWS, Azure, or Google Cloud.
🌎 Country
United States
🏝️ Location
Unknown
📄 Contract
Full-time
🪜 Seniority
Director
💰 Range
Unknown
💱 Currency
$ USD
💸 Pay
Unknown
🗓️ Discovered
August 18, 2025
📍 Location detailed
San Francisco, CA
🧠 Skills
#Unknown
Role description
Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.

Job Summary:
Sephora is a leading beauty retailer, and they are seeking a Director of Site Reliability Engineering. In this role, you will lead the SRE team to ensure the reliability and performance of the e-commerce platform while managing the technology roadmap and team budget.

Responsibilities:
• Develop and manage the technology roadmap for the Site Reliability Engineering (SRE) team.
• Lead a team of SRE engineers responsible for ensuring the reliability, performance, and scalability of our e-commerce platform.
• Drive continuous improvement efforts to enhance the availability, efficiency, and resiliency of our systems.
• Manage the SRE team budget.
• Define and enforce best practices and standards for monitoring, incident response, change management, and capacity planning.
• Provide team mentorship, guidance, and career development opportunities.
• Define and track key performance indicators (KPIs) and metrics to measure the reliability and performance of systems, and report progress to executive management.
• Collaborate with security teams to ensure the implementation of robust security measures and practices within the SRE domain.
• Establish disaster recovery plans and procedures, including regular testing and validation of backup and recovery mechanisms.
• Oversee the design and implementation of monitoring and alerting systems to proactively identify and address performance bottlenecks and issues.
• Lead incident post-mortems and root cause analysis.

Qualifications:
Required:
• Bachelor's or foreign equivalent degree in Computer Science, Engineering or Information Systems.
• Ten (10) years of progressively responsible post-baccalaureate experience in the offered position, as a Principal SRE, or in a closely related SRE or Production Application Support leadership role in the ecommerce industry.
• Managing highly scalable systems on AWS, Azure, or Google Cloud;
• Managing technology budgets and optimizing resource allocation;
• DevOps best practices including CI/CD pipelines and configuration management;
• Incident management frameworks, response management, and post-mortem analysis;
• SRE team building and oversight.

Company:
Sephora is an online shopping site that offers a range of beauty products such as cosmetics and skincare items. Founded in 1969, headquartered in Neuilly-sur-seine, Ile-de-France, FRA, team size 10001+ employees, currently Late Stage.