Department: Engineering
Team: DevOps / Infrastructure
Reports to: Head of Engineering / CTO
Company: Odin AI (https://getodin.ai)
Role Overview
Odin AI is seeking a DevOps Lead to architect, manage, and scale our infrastructure with a strong emphasis on security, automation, and scalability. This role is especially critical as many of our enterprise deployments involve on-premise and air-gapped environments. The ideal candidate will bring deep technical expertise and a proactive, solution-driven mindset to help us deliver reliable AI solutions to our clients.
Must-Have Criteria
Total Experience: Minimum 7 years of hands-on experience in DevOps or infrastructure roles.
Cloud Certification: Must hold at least one active certification from:
○ AWS (preferred)
○ Google Cloud Platform
○ Microsoft Azure
Core Technical Proficiencies:
○ Linux system administration
○ Computer networking (in-depth): TCP/IP, DNS, routing, VPNs, and firewalls
○ AWS Services: VPC, IAM, EC2, ECS/EKS, S3, etc.
○ Kubernetes: Deployments, Helm/Kustomize, Operator patterns
○ Docker: Image creation, optimization, and container security
○ Terraform: Modular IaC, remote state management, multi-environment setups
○ CI/CD Pipelines: GitHub Actions, GitLab CI, ArgoCD (or similar)
○ Scripting: Proficiency in Bash and Python for automation
Preferred (Bonus) Experience
Experience with air-gapped or fully on-premise infrastructure
Familiarity with GitOps tools (e.g., ArgoCD, FluxCD)
Exposure to Secrets Management (e.g., HashiCorp Vault, SOPS)
GPU provisioning and model serving experience, especially for LLM/MLOps workloads
Familiarity with Service Mesh tools (e.g., Istio, Linkerd)
Strong understanding of security best practices for both cloud and on-prem deployments
Exposure to LLM pipelines or MLOps tooling
Key Responsibilities
Design, implement, and maintain secure and scalable infrastructure across cloud and on-prem environments
Lead CI/CD pipeline development and infrastructure automation
Own infrastructure-as-code using Terraform and Git-based workflows
Ensure high standards of network, system, and application security, especially in enterprise environments
Manage monitoring, observability, and rollback strategies for reliable deployments
Provide technical mentorship to junior engineers or DevOps team members as needed
Soft Skills & Cultural Fit
Self-motivated, accountable, and proactive
Strong written and verbal communication skills across technical and non-technical teams
Comfortable working in a fast-paced startup environment
Strong operational discipline with a focus on security and compliance
Able to collaborate effectively with client-side IT and security teams