What is the opportunity?
We are looking to expand our Ventures team at RBC. If you are looking for an exciting, high-growth opportunity with a leading financial institution that is accelerating cloud-native development, this could be the job for you. Are you looking for a chance to make a difference? Are you someone who embraces change?
We are looking for a Site Reliability Engineering (SRE) Lead who exemplifies the attributes of a leader, mentor and decision maker. We are currently building out our SRE team, with the goal of providing mentorship and tooling to the Ventures application teams to manage the health, security, and availability of their applications in production. We work with other teams to provide guidance throughout the lifecycle of building, deploying, and operating the application.
What will you do?
- Help design and drive the implementation of application health monitoring and alerting.
- Manage SRE initiatives and drive them to completion while clearly communicating status to the rest of the organization.
- Provide leadership and prioritization to the SRE team and help make the key trade-offs required to keep the team working most effectively.
- Take a critical leadership role in our incident management process, ensuring a timely resolution and that proper actions are taken in response.
- Give guidance to application teams to improve their application monitoring and alerting, to stop issues before they impact customers.
- Raise the bar on incident response management.
- Define and implement change management procedures.
- Help design the CI/CD pipeline, using best practices around automation and pushing changes that improve reliability and velocity.
- Provide mentorship and training to the Ventures application teams on the CI/CD pipeline and processes; drive education and knowledge transfer of design patterns, technical practices, and relevant technologies and tooling.
- Work with security experts to ensure security best practices are built into any scripts, pipelines, and tooling created.
What do you need to succeed?
Must-have
- 4+ years of experience as an SRE supporting multiple applications.
- 5+ years of experience as a manager in Operations, DevOps, SRE, or Software Engineering.
- Willingness to be a hands-on contributor in day-to-day activities.
- Experience in defining incident and change management processes and procedures.
- Experience with the operational aspects of software systems such as monitoring, centralized logging, and alerting.
- Demonstrable experience of developing a high-performance culture and team. Experience of defining KPIs and SLAs and managing teams to excel at these.
- Experience in public cloud (we have a large presence in AWS).
- You have used Ansible, Puppet, Chef or another config management suite, know where it's broken, and are open to trying new alternatives.
Nice-to-have
- AWS certification
- Experience in Sumo Logic, New Relic and PagerDuty
- Background in or knowledge of n-tier application security practices
What's in it for you?
We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.
- A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable
- Leaders who support your development through coaching and managing opportunities
- Ability to make a difference and lasting impact
- Work in a dynamic, collaborative, progressive, and high-performing team
- A world-class training program in financial services