Pro2Serve is an Equal Opportunity Employer (Minorities/Females/Disabled/Veterans). To read more about this, view the "EEO is the Law" poster and the "EEO is the Law" Poster Supplement.

Senior Site Reliability Engineer
Job Code: 2021-P2S-060
Location: Oak Ridge, TN
Division: National Security - 1.4.3
Duration: Direct Hire
  

Job Responsibilities:

Position: Senior Site Reliability Engineer

Division: National Security Program

Duration: Direct Hire

Location: Oak Ridge, TN

 

Company Description: 

Professional Project Services, Inc. has upcoming opportunities for Senior Site Reliability Engineer (Direct Hire) positions at our offices in Oak Ridge, TN. Please submit resumes via the web page link below.

 

Professional Project Services, Inc. (Pro2Serve®) is a nationally recognized technical and engineering services firm dedicated to providing critical infrastructure engineering services in support of our Nation’s security. Using a disciplined systems engineering approach supported by an innovative software toolset, Pro2Serve provides solutions that improve the effectiveness and efficiency of our government and private clients. We support the defense, energy, and science markets through responsive, cost-effective execution of critical security, facilities and infrastructure, nuclear defense and nonproliferation, and environmental projects.

 

Job Description

Pro2Serve is seeking highly qualified individuals to play a key role in improving the security, performance, and reliability of the National Center for Computational Sciences (NCCS) computing infrastructure. The NCCS is a leadership computing facility that provides high-performance computing (HPC) resources for tackling scientific grand challenges.

 

The Team

 

The Platforms group architects and runs Slate, our Kubernetes platform, which lets NCCS users and staff develop, manage, and deliver their own applications that integrate with NCCS HPC resources.

 

We strive to provide the best Kubernetes service for both our internal staff and our scientific users. We achieve this goal in part by dogfooding: we use Kubernetes to run all of the internal services we support. We have great opportunities to help other staff develop their applications on the platform and to work with our outstanding scientific community as we bring Kubernetes to the HPC world.

 

We are at the intersection of container orchestration and HPC. Come help us build the bridge.

 

About you

 

We are looking for an experienced systems engineer who can code and who focuses on customer success. You manage infrastructure with code because automation lets you focus on the more difficult and rewarding problems. You love collaborating with others to find the best solution to a problem. You enjoy learning new technologies and pick them up quickly. You love CI/CD and GitOps. You probably have production experience with Kubernetes and Golang. You may have a GitHub account with cool projects. You may have technical leadership experience.

 

Tools we use: Kubernetes, OpenShift, Helm, Prometheus, RHEL, GitLab CI, Terraform, Puppet, Python, Golang

 

Responsibilities

  • Participate in an on-call rotation for off-hours support
  • Keep the Kubernetes platform reliable, available, and fast
  • Architect solutions that improve the reliability, scalability, performance, and efficiency of our services
  • Respond to, investigate, and fix service issues all the way from bare metal through the OS to the application code
  • Design, build, and maintain the infrastructure we need to support the NCCS
  • Work with our users to help them use Kubernetes
  • Write awesome documentation

 

Job Requirements

A bachelor’s degree in a scientific field and 8+ years of relevant experience, or equivalent experience.

At least five years of experience as an SRE, sysadmin, or systems engineer.

 

Preferred Qualifications

Experience with Kubernetes, OpenShift, Helm, Prometheus, RHEL, Puppet, Python, Golang

 

 

 

Federal Government Clearance:

This position may require the ability to obtain a government clearance. It may also require reviews and tests for the absence of any illegal drugs, along with a background investigation by the Federal government, in order to obtain an access authorization prior to employment. Subsequent reinvestigations may be required.

 

EEO Employer:

Affirmative Action Employer—M/F/Vet/Disab/LGBT

 

Benefits

Pro2Serve’s benefits package was carefully designed to meet the needs of our employees and their families.

 

These benefits include:

  • Major Medical Plan with Prescription Card, Dental Plan, Vision, and Disability Insurance
  • Retirement Plan 401(k)
  • Employee Stock Ownership Program (ESOP)
  • Comprehensive Leave
  • Holidays

 

 

Pay Rate:
Please submit salary or hourly rate requirements along with your resume or in a cover letter.

 

Job location:

Oak Ridge, TN area

Please submit resumes via the web page link.

 

If you meet the above requirements and qualifications, please click the Apply Now button to submit your resume. You will be considered for this position and added to our national database. We look forward to talking with candidates who have the requisite skills and experience level.