The Senior Performance and Scalability Engineer is responsible for ensuring that production systems are scaled to support new products and traffic growth.
This is a direct hire role and our client is open to having candidates sit in either Minneapolis, MN or Raleigh, NC.
* Define best practices related to dependency handling, scalability, and performance monitoring. Collaborate with the Software Development teams to ensure best practices are part of the design.
* Develop and maintain the test framework to support post-deployment and performance testing in the production environment. Collaborate with SREs and Software Engineers to develop tests for all products.
* Design and implement dashboards and alerts to assist with systems scaling and performance evaluation.
* Proactively address performance issues and scalability concerns.
* Serve as a subject matter expert on all matters related to service operations and as a first level of escalation for any issues. Troubleshoot and provide root cause analysis for issues spanning code, network, database, and systems components.
* Assist SREs in the development of product-specific scalability requirements.
* Define infrastructure requirements and architecture. Ensure the infrastructure meets performance and capacity requirements.
* Understand application dependencies, review dependency handling and health checks. Evaluate whether the dependency reliability is adequate to meet SLOs.
* Provide technical leadership and mentoring to other members of the Operations Services team
* Participate in on-call rotation
* Bachelor's degree in computer science, information sciences, or a related field, or equivalent experience
* Ability to analyze network traces and troubleshoot application performance problems
* Ability to conceptualize a distributed service, its dependencies, and the transactional flow
* Experience with Unix/Linux and Windows operating system administration and networking architecture
* Experience providing technical leadership and architectural guidance to Software Development teams.
* 7+ years of proven development skills in one or more programming languages: Python, Java, Go, Ruby, shell scripting, or similar
* 7+ years of software development, automation, or infrastructure as code experience
* Cloud infrastructure as code experience, e.g., Terraform, CloudFormation
* Experience with configuration management tools such as Ansible, Chef, Puppet, and Salt, and application schedulers such as Kubernetes, Nomad, and Docker Swarm.
* Experience monitoring/supporting Kafka, IBM MQ.
* Experience querying SQL and NoSQL databases. Familiarity with Oracle, Hadoop, or Cassandra database architecture.
* Experience building CI/CD tools (Jenkins, TeamCity) for a production application in an enterprise environment
* Demonstrated ability to triage processing bottlenecks
* Experience with monitoring systems: Influx, Splunk, Zenoss, AppDynamics or similar
* Experience troubleshooting certificate and PKI issues
* Experience evaluating and implementing new technical solutions
We are an equal opportunity employer and make hiring decisions based on merit. Recruitment, hiring, training, and job assignments are made without regard to race, color, national origin, age, ancestry, religion, sex, sexual orientation, gender identity, gender expression, marital status, disability, or any other protected classification. We consider all qualified applicants, including those with criminal histories, in a manner consistent with state and local laws, including the City of Los Angeles' Fair Chance Initiative for Hiring Ordinance.