Job Title: Manager of Site Reliability Engineering
Company Overview:
Join a leading global payment processing firm revolutionizing the industry through cutting-edge technology and a commitment to excellence. We are dedicated to providing seamless, secure, and efficient payment solutions to clients worldwide. As we continue to expand and innovate, we are seeking a talented Manager of Site Reliability Engineering to lead our team and drive our technical operations forward.
Position Overview:
As the Manager of Site Reliability Engineering, you will play a pivotal role in ensuring the reliability, scalability, and performance of our high-transaction payment processing systems. You will initially lead a team of three experienced engineers, with the goal of growing it to as many as ten, and you will oversee incident management, disaster recovery, and technical guidance for our evolving infrastructure. Our environment is transitioning from a monolithic PHP architecture to a .NET microservices framework hosted on AWS (we are open to other cloud platforms), so your expertise in cloud-native solutions and hands-on leadership will be instrumental in our success. The role is remote, with a preference for candidates located on the East Coast to facilitate communication with our offshore teams.
Responsibilities:
- Lead and mentor a team of Site Reliability Engineers, fostering a culture of collaboration, innovation, and continuous learning.
- Oversee incident management processes, ensuring rapid resolution and minimal disruption to operations.
- Develop and implement disaster recovery plans to safeguard against potential disruptions and minimize downtime.
- Provide technical leadership and guidance in the transition from a monolithic PHP environment to a .NET microservices architecture.
- Collaborate closely with development teams to optimize system performance, scalability, and reliability.
- Evaluate and implement cloud-native solutions on AWS, driving efficiency and cost-effectiveness.
- Conduct interviews and actively participate in the hiring process to build and grow a talented engineering team.
- Continuously monitor system health, performance metrics, and security protocols, implementing improvements as needed.
- Stay abreast of industry trends, emerging technologies, and best practices in Site Reliability Engineering.
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field; advanced degree preferred.
- Proven experience in a leadership role within Site Reliability Engineering, preferably in a high-transaction environment.
- Extensive hands-on experience with cloud platforms, particularly AWS; certification(s) a plus.
- Strong proficiency in .NET development and microservices architecture; familiarity with PHP is advantageous.
- Expertise in incident management, disaster recovery planning, and technical troubleshooting.
- Excellent communication skills, with the ability to collaborate effectively with cross-functional and offshore teams.
- Demonstrated experience in hiring, mentoring, and developing engineering talent.
- A passion for innovation, problem-solving, and driving continuous improvement.
- Highly organized, detail-oriented, and able to thrive in a fast-paced, dynamic environment.
Location: Remote (East Coast highly preferred)
Compensation: Up to $245,000 annually + bonus
Join us in shaping the future of global payment processing. If you're a dynamic leader with a passion for technology and a drive for excellence, we want to hear from you! Apply now to embark on an exciting journey with a forward-thinking company at the forefront of innovation.