upboundext
Senior Software Engineer [REMOTE]
At a Glance
- Location: Dallas, Texas, United States
- Work Regime: remote
- Posted: 2026-03-09T19:18:38-04:00
Key Requirements
Certifications
- SAFe
Requirements
Have experience operating production cloud services at scale: monitoring, alerting, incident response, post-mortems, and continuous improvement of service reliability.
Have strong debugging skills across distributed systems, including experience with observability tools (Prometheus, Grafana, OpenTelemetry, distributed tracing) and techniques for diagnosing issues in production environments.
Have experience building and operating controllers that interact with the Kubernetes API server, including troubleshooting reconciliation loops, managing API rate limits, and optimizing controller performance.
Are comfortable working directly with customers to understand, reproduce, and resolve complex technical issues in their environments.
Take responsibility and ownership for solving problems even if they are outside your lane, especially during incidents affecting customer workloads.
Demonstrate excellence in your work, constantly trying to improve your skills and the operational posture of the systems you build.
Responsibilities
Actively build and operate Upbound Spaces in production, troubleshooting and resolving issues across multi-tenant SaaS environments, as well as contributing to Upbound's open-source projects, including Crossplane.
Take ownership of building features in high demand by Upbound's customers and deliver new functionality that will delight and amaze our users.
Investigate and debug complex issues in customer environments, including multi-control-plane scenarios, resource reconciliation problems, and performance bottlenecks.
Communicate through thoughtful and thorough design documents for new initiatives and detailed post-incident reviews that drive system improvements.
Support the full project lifecycle for highly scalable and reliable services running in a cloud environment – discovery, analysis, architecture, design, review, documentation, building, migration, automation, deployment, production-readiness, and ongoing operational support.
Write and maintain Go code that interfaces with the Kubernetes API, such as operators, controllers, and add-ons, with a focus on observability, debuggability, and operational excellence.