Infrastructure / Site Reliability Engineer (SRE)
Solvd
- Location: Remote (Georgia)
- Employment: Full-time
- Level: Mid Level
About the Role
Solvd Inc. is a growing AI-native consulting firm focused on cloud, data, software engineering, and AI. They are seeking an Infrastructure/Site Reliability Engineer to enhance cloud infrastructure scalability, reliability, and deployment automation.
Perks
- Remote OK
Full job details
Solvd Inc. is a rapidly growing AI-native consulting and technology services firm delivering enterprise transformation across cloud, data, software engineering, and artificial intelligence. We work with industry-leading organizations to design, build, and operationalize technology solutions that drive measurable business outcomes.
Following the acquisition of Tooploox, a premier AI and product development company, Solvd now offers true end-to-end delivery—from strategic advisory and solution design to custom AI development and enterprise-scale implementation. Our capability centers combine deep technical expertise, proven delivery methodologies, and sector-specific knowledge to address complex business challenges quickly and effectively.
We are looking for a talented Infrastructure / Site Reliability Engineer (SRE) to join our engineering team. In this role, you will be the driving force behind our cloud infrastructure scalability, reliability, and deployment automation.
You are an engineer who views infrastructure as a software problem. Instead of manually configuring servers, you build automated pipelines, treat Infrastructure as Code (IaC) as a religion, and architect self-healing cloud deployments. You will collaborate closely with development teams to bridge the gap between code generation and production stability.
What you'll do
Cloud Architecture & Infrastructure as Code (IaC)
Cloud Management: Design, provision, and maintain secure, scalable, and highly available cloud infrastructure (primarily AWS, GCP, or Azure).
Immutable Infrastructure: Write and maintain clean, modular Terraform or OpenTofu configurations to ensure all infrastructure is fully auditable and reproducible.
Container Orchestration: Manage and optimize containerized environments using Docker and Kubernetes (EKS/GKE), focusing on resource allocation and scaling policies.
Automation & CI/CD Pipelines
Deployment Automation: Build, maintain, and secure robust CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins) to support zero-downtime deployments.
GitOps & Tooling: Implement modern GitOps workflows (e.g., ArgoCD, Flux) to automate application delivery and configuration management.
Scripting: Develop custom internal tools and automation scripts using Python, Go, or Bash to eliminate toil and repetitive manual tasks.
Observability & Reliability Engineering
Monitoring & Alerting: Design and implement comprehensive observability stacks using tools like Prometheus, Grafana, Datadog, or New Relic.
Performance Tuning: Conduct chaos engineering, load testing, and bottleneck analysis to ensure system resilience under heavy traffic.
On-Call & Incident Response: Participate in an engineering on-call rotation, driving root-cause analysis (Blameless Post-Mortems) to prevent incident recurrence.
What you bring
Experience: 3+ years of experience in an SRE, DevOps, or Cloud Infrastructure role.
Cloud Proficiency: Deep production experience with at least one major cloud provider (AWS, GCP, or Azure).
IaC & Containers: Strong proficiency with Terraform and hands-on experience managing production Kubernetes clusters.
Linux Systems: Solid understanding of Linux networking, internals, storage, and security fundamentals.
Preferred Technical Stack
Languages: Strong coding skills in Go or Python.
Networking: Good grasp of VPC architecture, DNS, load balancers (ALB/NLB), and Content Delivery Networks (CDNs).
Data Layer: Familiarity with managing cloud-native databases (PostgreSQL, RDS) and caching layers (Redis, Memcached).
When you join Solvd, you'll…
Shape real-world AI-driven projects across key industries, working with clients from startup innovation to enterprise transformation.
Be part of a global team with equal opportunities for collaboration across continents and cultures.
Thrive in an inclusive environment that prioritizes continuous learning, innovation, and ethical AI standards.
Ready to make an impact?
If you're excited to build things that matter, champion responsible AI, and grow with some of the industry’s sharpest minds, apply today and let’s innovate together.
Solvd is an equal opportunity employer.
I agree to the processing of my personal data given in the recruitment process by Solvd Inc., with its principal place of business at 1646 N California Blvd, Suite 515, Walnut Creek, CA 94596, United States, for the purpose of future recruitment processes.
You can withdraw your consent at any time; however, doing so will not affect the lawfulness of processing performed on this basis before the withdrawal.
The controller of your personal data is Solvd Inc., with its principal place of business at 1646 N California Blvd, Suite 515, Walnut Creek, CA 94596, United States. You can find more information on the processing of your personal data in the Privacy Policy.