Site Reliability Engineer (SRE) - Observability

Job Summary

This position will be part of the IT Process and Tools Team and will focus on designing, implementing, enhancing, and administering enterprise Observability (monitoring, logging, and metrics) tools. The candidate will be heavily involved in product selection (both commercial and open source); writing automation scripts (Python, Ansible, etc.); designing resilient and performant systems; and ensuring our users have better visibility and insight into the health of their systems.


Architecture, Design, and Development (40%)

  • Architects and implements monitoring and logging solutions, ensuring resiliency and high availability
  • Makes tooling recommendations based on internal needs and industry trends
  • Develops integrations between observability tools

Operations and Administration (40%)

  • Administers Observability tools to ensure they are available and performant (e.g. Splunk or InfluxDB Administration)
  • Writes scripts to automate repeatable tasks
  • Quickly responds to help tickets submitted by users
  • Provides on-call support to reduce Mean-Time-To-Restoration (MTTR) of services

Internal Consulting (20%)

  • Works with others within the department to help them instrument Observability within their processes and tools

Minimum Qualifications

  • A bachelor's degree and 5 years of professional work experience (or a master's degree, or equivalent experience) are required.

Additional Qualifications

  • 2+ years of experience managing monitoring & logging tools (Prometheus, Splunk, InfluxDB, Nagios)
  • 2+ years of experience with one or more programming/scripting languages (Python, JavaScript, Ansible)
  • 1+ years of experience with Event Pipeline tools (Flink, Spark, Kafka)
  • 1+ years of experience creating dashboards & reports (Grafana, Splunk)
  • Experience with CI/CD systems (GitLab, GitHub, Jenkins)
  • Experience with Automation tools (ActiveBatch)
  • Experience with Container technologies (Kubernetes, Docker)
  • Proven design experience as well as experience collaborating with others on requirements and design
  • Working knowledge of both Unix/Linux and Windows operating systems as well as application deployments
  • Experience with Agile development processes


Why MathWorks?

It’s the chance to collaborate with bright, passionate people. It’s contributing to software products that make a difference in the world. And it’s being part of a company with an incredible commitment to doing the right thing – for each individual, our customers, and the local community.

MathWorks develops MATLAB and Simulink, the leading technical computing software used by engineers and scientists. The company employs 5000 people in 16 countries, with headquarters in Natick, Massachusetts, U.S.A. MathWorks is privately held and has been profitable every year since its founding in 1984.

Contact us if you need reasonable accommodation because of a disability in order to apply for a position.

The MathWorks, Inc. is an equal opportunity employer. We evaluate qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, and other protected characteristics. View The EEO is the Law poster and its supplement.

The pay transparency policy is available here.

MathWorks participates in E-Verify. View the E-Verify posters here.
