Jun 12, 2026

AI Infrastructure Engineer

Country

United States

Location type

Remote

Salary

$170,000 – $210,000 base · Annual

About the Company

Fast-growing AI company supported by NVIDIA, enabling data centers to dynamically manage power and maximize compute capacity from existing energy infrastructure.

About the Role

The AI Infrastructure Engineer designs, builds, and owns the end-to-end infrastructure that serves the company's AI and ML models across edge deployments, cloud environments, and data center integrations, as well as the integration of power data with AI inference software. As the first dedicated role of its kind, it will be the foundation for how the company deploys and operates AI capabilities in production. The role requires deep technical expertise in ML model serving, distributed systems, and GPU infrastructure, with a strong emphasis on reliability, performance, and scalability. The position works cross-functionally with product, engineering, and data science teams and is open to fully remote candidates, with periodic travel expected for company retreats and key on-site engagements.

Responsibilities

Lead the design and build of the company's AI inference platform, establishing architecture patterns, deployment standards, and operational practices that will scale with the company
Own end-to-end model serving infrastructure across on-prem and datacenter environments
Build and maintain fault-tolerant, high-performance systems for serving AI models at scale, with a focus on low latency, reliability, and cost efficiency
Collaborate closely with algorithms engineers to integrate AI inference data and configuration with power optimization algorithms
Optimize GPU utilization and inference performance across our hardware fleet, including NVIDIA accelerators central to the company's edge AI platform
Establish MLOps best practices including CI/CD pipelines for model deployment, monitoring, and rollback across environments
Contribute to infrastructure roadmap decisions, including build vs. buy tradeoffs, tooling selection, and platform evolution as the team grows

Requirements

5+ years of software engineering experience with a strong focus on AI infrastructure, backend systems, or distributed systems
Hands-on experience with AI model serving frameworks (e.g., vLLM, SGLang, Triton, TensorRT, TorchServe, or similar)
Understanding of container orchestration and cluster management (Kubernetes, Docker)
Experience deploying and operating infrastructure across both datacenter and on-prem environments
Strong knowledge of GPU workloads and their tradeoffs, including how inference differs from training and why it matters
Proficiency in Python; C++, CUDA, Go, or Rust a plus
Excellent communication skills and comfort working cross-functionally in a lean, fast-moving environment
Willingness to travel up to 10% of the time

Nice to Have

Experience with Dynamo
Experience with edge AI deployments or constrained compute environments
Familiarity with infrastructure as code (Terraform, Helm)
Experience with observability platforms (Datadog, Prometheus, Grafana)
Background in energy, utilities, or industrial IoT
Contributions to open-source ML infrastructure projects

What the Company Offers

A diverse and inclusive workplace that is welcoming, supportive, affirming, and respectful
Empowerment to solve problems and work together to make a difference
Mentorship and growth opportunities as part of a collaborative team
A flexible work environment with flexible paid time off
Competitive compensation and benefits, including health, dental, vision, and a 401(k) with employer match

Follow the Link to Apply:

https://tally.so/r/68AP8B