SRE + Automation
Job Description:
Responsibilities:
This list is intended to reflect the current job, but there may be additional essential functions (and certainly non-essential job functions) that are not referenced. Management will modify the job or require that other tasks be performed whenever it is deemed appropriate to do so, observing, of course, any legal obligations, including any collective bargaining obligations.
Ensure key stakeholders, product owners, and platform owners are informed of reliability concerns and their potential impact on the customer's experience.
Design, code, test, and deliver solutions to automate manual operations (i.e., toil).
Participate in operations support and on-call rotation shifts (could include weekends and holidays) for SRE-supported systems and products, with a focus on implementing long-term solutions for any problems identified.
Collaborate with stakeholders such as product and platform owners to define service-level objectives (SLOs) and service-level indicators (SLIs) for system operations, focused on the critical features of the customer's journey and experience.
Track and manage reliability performance against agreed SLOs, in partnership with IT monitoring teams or other stakeholders, and ensure systems continue to meet SLOs over time.
Provide expert knowledge on reliability approaches to ensure our organization achieves its goals and roadmap for reliability.
Champion reliability being treated as a feature in products and platforms and promote the concept across all phases of the software development life cycle.
Create dashboards and reports to communicate key metrics to product owners and key stakeholders.
Contribute to documentation and runbooks for owned applications based on operational experience, user feedback, and application changes.
Qualifications:
Minimum Qualifications Education & Prior Job Experience
Bachelor's degree in Computer Science, Computer Engineering, Technology, Information Systems (CIS/MIS), Engineering, or a related technical discipline, or equivalent experience/training
5 years of experience designing, developing, and implementing large-scale solutions in production environments
Preferred Qualifications Education & Prior Job Experience
Master's degree in Computer Science, Computer Engineering, Technology, Information Systems (CIS/MIS), Engineering, or a related technical discipline, or equivalent experience/training
Airline Industry experience
Top 3 required skills:
1. Able to design, develop, and maintain automated test frameworks using Cypress/Playwright for web and API testing.
2. Experience testing applications built with JavaScript, TypeScript, and GraphQL.
3. Able to build and maintain integrations with CI/CD pipelines, using Azure DevOps and GitHub Actions, to ensure reliable automated testing.
We will consider junior developers who can demonstrate passion for development and processes.
Nice to Have Skills and Experience:
Dynatrace (APM/monitoring)
Mezmo (LogDNA) (log aggregation)
BigPanda (incident intelligence)
Nucleus (security & vulnerability management)
Understanding of production observability and incident management concepts.
Excited to learn, grow their SRE skills, and take ownership across both testing and reliability domains.
A passion for improving processes and building reliable systems.
Proven ability to work independently and take initiative with minimal guidance.
Strong background in quality engineering with a solid understanding of automation best practices.
Familiarity with SRE concepts such as monitoring, alerting, and incident response.
Design, develop, and maintain automated test frameworks using Cypress/Playwright for web and API testing.
Experience testing applications built with JavaScript, TypeScript, and GraphQL.
Familiarity with navigating and managing resources in Azure cloud and Kubernetes environments.
Define and execute comprehensive test strategies for new features and services.
Implement and manage CI/CD workflows using GitHub Actions or other GitHub-integrated tools.
Build and maintain integrations with CI/CD pipelines, using Azure DevOps and GitHub Actions, to ensure reliable automated testing.
Conduct regression, performance, and security testing.
Collaborate with developers and product managers to ensure high-quality releases.
Participate in the SRE team's daily operations, including system monitoring, alerting, and incident response.
Implement post-deployment validation, health checks, and release safety mechanisms.
Help define and monitor SLAs, SLOs, and error budgets.
Contribute to reliability tooling, observability improvements, and performance diagnostics.
Participate in blameless postmortems and propose solutions to improve system stability.