DevOps Engineer

Remedy Robotics

Software Engineering, Operations

San Francisco, CA, USA

Posted on May 20, 2026

Location

San Francisco

Employment Type

Full time

Department

Engineering

About Remedy Robotics

Cardiovascular disease is the #1 cause of morbidity and mortality in the world. Much of this could be prevented with better access to specialist care. Take stroke as an example: any delay in treatment can lead to permanent disability or death. However, due to a lack of specialist surgeons, the most effective intervention can only be performed in 2% of US hospitals. For patients who present to one of the 98% of hospitals that do not offer the surgery, treatment is either significantly delayed or not offered at all because timely transfer is not feasible.

Our mission is to bring state-of-the-art vascular intervention to anyone, anytime, regardless of their location. Our team of medical clinicians, roboticists, and machine learning experts is working to bridge this gap by building the world’s first remotely operated, semi-autonomous endovascular surgical robot.

We’ve already done what nobody else could: using our system, doctors from around the world remotely performed this procedure from as far as 8,000 miles away. We now need your help to bring this technology out of the laboratory and into hospitals everywhere.

The Role

You'll own the developer platform for a small, multi-disciplinary engineering team building a semi-autonomous surgical robot. The stack is primarily Python (ML, orchestration, much of the application code) with some C++ for performance-critical robot control and TypeScript for surgical UIs, running across on-prem lab compute, GPU workstations, and the cloud. You'll work directly with our software, ML, hardware, and data teams to make the development cycle fast and the deployments boring.

This is a team-of-one role. You'll set the platform direction, build it, and operate it.

You Will

  • Build and operate CI/CD covering our Python codebase (the bulk of the work), C++ robot control code, ML training pipelines, and TypeScript UIs — each with different testing and deploy patterns

  • Own our lab compute infrastructure: the server-room PCs running Ubuntu, GPU workstations, and the supporting network

  • Improve developer experience across the org: local dev environments, package management, build times, test reliability

  • Integrate hardware-in-the-loop testing into the CI flow where it makes sense (the robot lives in the lab and needs to participate in regression testing)

  • Standardize and harden security across on-prem and cloud

  • Work with the ML team on GPU pipelines, experiment tracking, and model deployment

  • Manage cloud infrastructure for training, data, and remote services

  • Collaborate with the engineers to unlock the best tools and processes for the team

You Have

  • 5+ years of DevOps, platform, or infrastructure engineering on non-trivial systems

  • Operated CI/CD for a polyglot codebase — you've debugged GitHub Actions runners, written nontrivial workflows, and understand the tradeoffs of self-hosted vs hosted runners

  • Strong Linux administration skills and comfort with infrastructure-as-code

  • Strong Python fluency (most of our code is Python and you'll be living in it daily), plus the ability to read and contribute to C++ and TypeScript when needed

  • Cloud experience (AWS or equivalent)

  • Advanced fluency with coding agents (Claude Code, Cursor, or equivalents) — you use them as a daily force multiplier

  • Clear communication and a service mindset — your job is to make other engineers faster

Nice to Haves

  • Robotics, embedded, or scientific computing background — you've dealt with hardware that needs to be on for tests to pass

  • ML pipeline tooling experience (SkyPilot, MetaFlow, Ray, or similar)

  • Self-hosted GitHub Actions runners at scale

  • Python monorepo tooling (uv, Poetry, Bazel) and C++ packaging (Conan, vcpkg)

  • Real-time Linux experience

  • Audit-friendly build infrastructure — signed builds, traceable artifacts, reproducible builds (relevant for IEC 62304 down the road)

  • Prior medical device or regulated industry experience

  • Docker and Kubernetes familiarity (we don't run K8s today but may grow into it)