Site Reliability Engineering Manager

The Cypress Group

Location: New York, NY, USA

Date: 2024-06-01T05:31:53Z

Job Description:

Job Title: Manager of Site Reliability Engineering

Company Overview:

Join a leading global payment processing firm revolutionizing the industry through cutting-edge technology and a commitment to excellence. We are dedicated to providing seamless, secure, and efficient payment solutions to clients worldwide. As we continue to expand and innovate, we are seeking a talented Manager of Site Reliability Engineering to lead our team and drive our technical operations forward.

Position Overview:

As the Manager of Site Reliability Engineering, you will play a pivotal role in ensuring the reliability, scalability, and performance of our high-transaction payment processing systems. You will initially lead a team of three experienced engineers, with the goal of growing it to as many as ten, and you will oversee incident management, disaster recovery, and technical guidance for our evolving infrastructure. Our environment is transitioning from a monolithic PHP architecture to .NET microservices hosted on AWS (we are open to other cloud platforms), so your expertise in cloud-native solutions and hands-on leadership will be instrumental in our success. The role can be performed remotely; candidates located on the East Coast are preferred to facilitate communication with our offshore teams.

Responsibilities:

  • Lead and mentor a team of Site Reliability Engineers, fostering a culture of collaboration, innovation, and continuous learning.
  • Oversee incident management processes, ensuring rapid resolution and minimal disruption to operations.
  • Develop and implement disaster recovery plans to safeguard against potential disruptions and minimize downtime.
  • Provide technical leadership and guidance in the transition from a monolithic PHP environment to a .NET microservices architecture.
  • Collaborate closely with development teams to optimize system performance, scalability, and reliability.
  • Evaluate and implement cloud-native solutions on AWS, driving efficiency and cost-effectiveness.
  • Conduct interviews and actively participate in the hiring process to build and grow a talented engineering team.
  • Continuously monitor system health, performance metrics, and security protocols, implementing improvements as needed.
  • Stay abreast of industry trends, emerging technologies, and best practices in Site Reliability Engineering.

Requirements:

  • Bachelor's degree in Computer Science, Engineering, or a related field; advanced degree preferred.
  • Proven experience in a leadership role within Site Reliability Engineering, preferably in a high-transaction environment.
  • Extensive hands-on experience with cloud platforms, particularly AWS; certification(s) a plus.
  • Strong proficiency in .NET development and microservices architecture; familiarity with PHP advantageous.
  • Expertise in incident management, disaster recovery planning, and technical troubleshooting.
  • Excellent communication skills, with the ability to collaborate effectively with cross-functional and offshore teams.
  • Demonstrated experience in hiring, mentoring, and developing engineering talent.
  • A passion for innovation, problem-solving, and driving continuous improvement.
  • Highly organized, detail-oriented, and able to thrive in a fast-paced, dynamic environment.

Location: Remote (East Coast highly preferred)

Compensation: Up to $245,000 annually + bonus

Join us in shaping the future of global payment processing. If you're a dynamic leader with a passion for technology and a drive for excellence, we want to hear from you! Apply now to embark on an exciting journey with a forward-thinking company at the forefront of innovation.
