Site Reliability Engineer Job at EVONA, San Francisco, CA

Job Description

Site Reliability Engineer (SRE)

Location: San Francisco Bay Area

Role Overview:

We are seeking a highly skilled Site Reliability Engineer (SRE) to join a dynamic team at a rapidly growing technology company. As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of mission-critical systems, while implementing automation and optimizing cloud infrastructure. This role offers the opportunity to work with cutting-edge AI/ML technologies, leveraging them to solve complex challenges in cloud infrastructure management and performance optimization.

Key Responsibilities:

  • System Reliability & Performance: Design, implement, and maintain scalable systems, ensuring high availability, performance, and disaster recovery across production environments.
  • Automation & Tool Development: Develop automation tools to streamline operations, improve system reliability, and reduce manual interventions.
  • Cloud Infrastructure Management: Create and manage cloud instances (e.g., dev, staging, production) using AWS, GCP, or Azure, optimizing infrastructure performance and cost.
  • Integration of AI/ML Models: Collaborate with engineering teams to integrate machine learning models into production environments, ensuring that these models scale efficiently and perform optimally.
  • Incident Management: Respond to and resolve incidents, minimizing downtime and ensuring quick recovery. Lead post-incident reviews and implement preventive measures.
  • Continuous Improvement: Identify areas of improvement and drive initiatives to enhance system reliability, performance, and security.
  • Security & Compliance: Ensure that infrastructure and applications adhere to security best practices and compliance standards.

Qualifications:

  • Educational Background: Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Experience: Proven experience as a Site Reliability Engineer or in a similar role within a SaaS environment, including managing and optimizing cloud infrastructure (preferably AWS, GCP, or Azure), plus familiarity with integrating AI and machine learning technologies.
  • Technical Skills:
      • Proficiency in programming and scripting languages such as Python, Go, or Bash.
      • Experience with containerization and orchestration tools like Docker and Kubernetes.
      • Solid understanding of networking, security, and performance optimization practices.
      • Knowledge of CI/CD pipelines and DevOps practices to ensure smooth development and deployment cycles.
  • Problem-Solving: Strong analytical and problem-solving skills with attention to detail.
  • Collaboration & Communication: Excellent interpersonal skills, with the ability to work collaboratively in cross-functional teams and communicate technical concepts clearly.

Benefits:

  • Competitive Salary: Attractive compensation package, including equity options.
  • Health & Wellness: Comprehensive health, dental, and vision insurance, along with other benefits.
  • Work Environment: A collaborative and innovative work environment within a growing company.
  • Growth Opportunities: Opportunities for career growth, professional development, and a chance to shape the future of the company’s technology and infrastructure.
