Sr Cloud Platform Engineer HYBRID - Cedar Park, TX (Austin area)

James Avery Artisan Jewelry

Location: Hybrid (Cedar Park, TX)
Employment: Full-time
Level: Senior Level
Posted 1 day ago

About the Role

James Avery Artisan Jewelry is seeking a Senior Cloud Platform Engineer to lead the design and operation of enterprise platforms across on-premises and cloud environments. This role is crucial for driving platform reliability, scalability, and operational excellence through modern DevOps practices and automation.

Skills

Cloud Platform Engineering, Infrastructure as Code, AWS, Oracle Cloud Infrastructure, Kubernetes, Docker, Terraform, CI/CD, Kafka, Ansible, Python, Bash, Dynatrace, Grafana, Disaster Recovery, Serverless Architecture

Full job details

Job Summary

The Senior Cloud Platform Engineer leads the strategic design, development, and operation of scalable, secure and resilient enterprise platforms across on-premises and cloud environments. Partnering with Architecture, Engineering, Infrastructure and Security teams, this role drives platform reliability, scalability and operational excellence through modern DevOps pipelines, container orchestration, backup and disaster recovery strategies and infrastructure automation. Responsibilities include capacity planning, incident response, root cause analysis and continuous improvement initiatives that enable high-performing engineering teams and a secure, highly available, and future-ready technology ecosystem.

WHAT YOU WILL BE DOING:

  • Own and Operate the Enterprise Platform: Maintain technical ownership of enterprise platforms across on-premises and cloud environments (AWS and Oracle Cloud), including container orchestration, middleware, platform services, lambdas, and event processors. Partner with the IT Infrastructure team to ensure a reliable, secure, and scalable platform foundation.
  • Lead Low-Level and Domain-Centric Design: Translate high-level architectural direction from the Enterprise Architecture team into detailed, domain-specific platform designs. Own low-level design decisions across platform services, container networking, storage abstractions, and cloud-native components, ensuring solutions are practical, scalable, and aligned with EA standards.
  • Integrate Cloud-Native and Platform Security: Partner with Information Security to implement and maintain security controls across cloud-native and SaaS platform components, including container security, serverless functions (Lambda), CI/CD pipeline integrity, event streaming, and access management for modern cloud services and database platforms. Ensure compliance with organizational and regulatory standards across all platform-layer services.
  • Lead Incident Response: Lead identification, triage, and resolution of platform-related incidents, coordinating with IT Infrastructure, Help Desk, and Information Security to minimize downtime, restore services swiftly, and implement preventive measures to enhance resilience.
  • Provide Technical Leadership and Mentorship: Guide platform team technical decisions, mentor team members, and drive a culture of excellence through knowledge sharing, best practices, and strategic direction.
  • Monitor and Optimize Platform Performance: Proactively monitor and optimize platform-layer performance, including container workloads, cloud services, event pipelines, and application-facing infrastructure, diagnosing and resolving complex issues to maintain high availability and efficiency. Define and meet availability SLAs/SLOs, RTO/RPO targets, and uptime commitments for platform services and supported databases, reporting against these targets on a regular cadence.
  • Drive Automation and Infrastructure as Code: Champion platform automation through infrastructure as code (IaC), scripting, configuration management (e.g., Ansible), and tooling to streamline operations, reduce manual effort, and improve infrastructure efficiency and consistency.
  • Build and Manage CI/CD Pipelines: Design, implement, and maintain CI/CD pipelines and automation frameworks that enable reliable, repeatable software delivery across engineering teams.
  • Steward Event-Driven Platform Capabilities: Own the design, operation, and reliability of event-driven and streaming platform components, including Kafka-based architectures, ensuring scalability and performance for dependent applications.
  • Enable Developer Self-Service: Build and maintain internal platform tooling, paved paths, and self-service capabilities that improve developer productivity and reduce friction for software and data engineering teams.
  • Define Capacity Requirements: Forecast platform capacity needs and define infrastructure requirements in partnership with IT Infrastructure, which executes provisioning. Analyze performance metrics to ensure current and future business demands are met while optimizing costs.
  • Lead Disaster Recovery and Continuity Planning: Design and lead disaster recovery and business continuity strategies for platform operations in partnership with IT Infrastructure, ensuring resilient backup, storage, failover, and recovery capabilities. Oversee DevOps pipeline architecture, capacity planning, incident response, and root cause analysis to minimize operational disruption and strengthen platform reliability.
  • Maintain Platform Documentation: Ensure comprehensive, current documentation of platform configurations, processes, and procedures to support knowledge transfer, compliance, and operational continuity.

WHAT IS REQUIRED:

  • Bachelor's Degree in Information Technology, Computer Science, or a related field; or equivalent combination of education and/or experience. 
  • 6 years of progressive experience in platform engineering, cloud infrastructure, or distributed systems architecture. 
  • Expert-level understanding of modern platform engineering principles, including infrastructure as code (IaC), automation, scalability, reliability engineering, performance optimization, and disaster recovery design. 
  • Demonstrated expertise with public cloud platforms. 
  • Experience designing and operating serverless architectures and event-driven platforms using technologies such as Confluent Kafka. 
  • Proven experience building and managing CI/CD pipelines and automation frameworks using tools such as GitLab and Jenkins. 
  • Hands-on experience implementing observability and monitoring solutions using tools such as Dynatrace, CloudWatch and Grafana. 
  • Experience designing and implementing storage, backup, disaster recovery (DR), and failover strategies to ensure high availability and business continuity. 
  • Experience with containerization and orchestration technologies and modern platform tooling (e.g. Docker, Fargate).
  • Strong hands-on experience with infrastructure as code tools such as Terraform, AWS CloudFormation, OpenTofu, or equivalent.

PREFERRED QUALIFICATIONS:

  • Hands-on experience with AWS and Oracle Cloud Infrastructure (OCI), as well as experience with Oracle Exascale or comparable high-performance infrastructure environments.
  • Deep knowledge of networking concepts as they relate to cloud and platform environments, including security, encryption, identity and access management, and service connectivity across hybrid architectures. 
  • Proven expertise in database platforms and data technologies, including relational and NoSQL databases and cloud-native database services (e.g., Oracle, AWS RDS, Aurora), along with experience supporting database platform operations, performance, and availability within a cloud or hybrid environment.
  • Proficiency in Linux-based systems and scripting or programming languages commonly used for automation, including Python and Bash. 
