This job posting is no longer available
Senior Linux HPC Systems Administrator
- Boston, Massachusetts, United States
About
Role: Senior Linux HPC Systems Administrator/Engineer
Location: Boston, MA
Overview
We are seeking an experienced Senior Linux HPC Systems Administrator/Engineer with a minimum of 10 years of enterprise IT experience to manage and support our critical Linux-based infrastructure.
This role is critical for managing and supporting our advanced computing environments, which are pivotal to scientific research and high-performance computing (HPC) initiatives.
The position requires hands-on expertise with high-end workstation hardware and scientific applications, as well as a strong background in HPC techniques, including clustering and workload management with tools like Slurm.
The ideal candidate will be proficient in RedHat Enterprise Linux (RHEL 8 & 9) and have experience with scientific and high-performance computing environments. They will also have excellent stakeholder relationship skills and the ability to communicate complex technical concepts effectively to a range of stakeholders, ensuring our scientists receive top-tier in-person support onsite.
Key Responsibilities
- Enterprise Linux Administration:
  - Administer, configure, and maintain RHEL environments (specifically RHEL 8 & 9), ensuring stability, performance, and security.
  - Provide hands-on support with high-end workstation hardware for scientists, promptly addressing hardware and software issues.
- Scientific and HPC Support:
  - Offer technical support to scientific users, bridging the gap between research demands and IT infrastructure.
  - Draw on scientific computing experience to optimize system performance and manage specialized applications.
  - Assist with the management of high-performance compute resources, including Slurm, clustering, and related HPC technologies.
- Collaboration and Stakeholder Management:
  - Work closely with other technical teams and stakeholders to align IT services with organizational needs.
  - Build and maintain strong stakeholder relationships, communicating complex technical concepts clearly.
  - Provide in-person support onsite to ensure effective resolution of issues and a high level of customer satisfaction.
- Service Management and Process Improvement:
  - Use ServiceNow for tracking incidents, managing change requests, and ensuring timely resolution of service tickets.
  - Implement and follow IT best practices for incident management, performance monitoring, and network troubleshooting.
- Additional Technical Duties:
  - Manage SSL certificates and configure web servers as needed.
  - Monitor and troubleshoot system performance issues, including understanding the impact of GPUs, networking, and other hardware components.
  - Handle vendor relationships effectively, coordinating with external partners to resolve issues and optimize service delivery.
  - Maintain familiarity with MacOS systems to provide assistance when necessary.
Required Qualifications
- Technical Expertise:
  - Minimum 10 years of enterprise IT experience with extensive hands-on expertise in RedHat Enterprise Linux (RHEL), specifically RHEL 8 & 9.
  - Proven experience with high-end workstation hardware setups and scientific application support.
  - Demonstrated knowledge of scientific computing and experience in high-performance compute environments, including experience with Slurm and clustering, are highly desirable.
  - Strong troubleshooting skills for both hardware and software issues.
- Interpersonal Skills:
  - Excellent communication skills with a proven ability to engage and build relationships with stakeholders at various levels.
  - Experience working collaboratively with other technical teams to resolve complex problems and drive operational improvements.
  - Strong stakeholder relationship-building skills and the ability to manage vendor relationships effectively.
- Additional Desirable Skills:
  - Working knowledge of ServiceNow and its application in incident and service management.
  - Familiarity with networking concepts, performance monitoring tools, and GPU technologies.
  - Any experience with scientific applications will be a significant advantage.
  - Exposure to MacOS environments is useful but not essential.
- Onsite Requirement:
  - Must be able to work onsite to provide in-person technical support to scientists and ensure optimal system performance.
Mandatory Skills
| Skill (Top 5 Keywords or skills) | Years Of Experience | Skill Proficiency (Basic Knowledge / Medium / Expert) |
| --- | --- | --- |
| RedHat Enterprise Linux (RHEL 8 & 9) Administration | 10 | Expert |
| High-Performance Computing (HPC) Techniques | 5 | Expert |
| Clustering | 3 | Medium |
| Troubleshooting and Performance Optimization | 10 | Expert |
Salary And Other Compensation
Applications will be accepted until 11/14/25.
- Please note, this role is not able to offer visa
Language skills
- English