We're looking for someone to join our Engineering team at Ferrum. Are you interested in delivering robust services to help hospitals detect and eliminate the millions of medical errors that occur every year? If so, read on.
About You
You are experienced:
- At least 2 years in a relevant support or service-oriented, customer-facing position;
- Managing cases through the entire support lifecycle, from triage to resolution and writing support content;
- Periodic on-call schedule, where you are clearly communicating complex situations to stakeholders;
- Scripting languages such as BASH and Python, or strongly typed programming languages such as Go;
- Linux operating system internals, filesystems, disk/storage technologies, and networking fundamentals;
- Docker container management and debugging.
Preferred experience, but not required:
- Working in Kubernetes environments;
- Advanced networking and Linux networking expertise;
- Advanced CentOS and RHEL system administration;
- LFCE or RHCE equivalent certifications.
You have a deep understanding of Linux environments.
Your experience with Linux distributions gives you the direction to know what questions to ask and what questions to answer before starting down a root cause analysis session. You understand the storage and networking layers of a Linux environment and know how to resolve anomalous problems.
You can handle complexity.
You know how to approach and understand the abstract and large system designs found in production. When a piece of software presents problems, you know how to efficiently cross-reference the design diagrams with the logs to triage and root cause any unexpected behavior.
You love efficiency and automation.
Your work at Ferrum will have a huge impact on the business. You take pride in developing scalable software that enables the customer to rely on Ferrum for mission-critical tasks. Your work allows team members to perform at a high level and scale to new challenges. You should be proficient with BASH and familiar with Python, but Go is even better.
You strive to take ownership of a project from ideation to delivery and maintenance.
Your experience provides a framework for you to work independently on several initiatives that provide significant and measurable value to our customers. Seek opportunities to develop horizontal technologies that benefit multiple teams. Build developer tools to increase the efficiency of all teams at Ferrum.
You communicate and document everything.
At Ferrum, we are a distributed team. A culture of sharing and documentation allows everyone to work together. We are looking for a practitioner of clear, well-maintained documentation.
And most importantly, you want to make a positive and lasting impact on the world.
You understand that technology has the power to improve people's lives and enrich our society.
What You'll Be Doing
Implementation management:
- Apply updates and patches, when applicable, for customer systems;
- Monitoring of site health for deployments in healthcare systems;
- Apply your Linux knowledge to triage and root cause issues discovered in the field;
- Development of software tools to improve internal processes, infrastructure, and system observability;
- Providing reports to developers and stakeholders on product performance and stability metrics.
Incident response management:
- Provide the first line of defense for support escalation and remediation for customer sites;
- Work and communicate with IT teams during the incident response lifecycle;
- Continuously document and update incident response SOPs.