AI Platform Engineer - TS/SCI with CI Poly
PGTEK
Location: Onsite (McLean, Virginia)
Compensation: $125k - $185k/yr
Employment: Full-time
Level: Senior Level
About the Role
PGTEK is a consulting organization focused on helping clients achieve their technology objectives. This role is critical in building, maintaining, and optimizing infrastructure for advanced AI workloads, working closely with engineering, operations, and security teams.
Benefits
- Medical Coverage
- Vision Plan
- Dental Insurance
- Life Insurance
- Disability
- 401(k) Matching
- PTO
- Holidays
Perks
- HSA option
- Pet Insurance Discount
- Employee Assistance Program
- Education Assistance
Full job details
Location: McLean, VA (Onsite 5 Days per Week)
Employment Type: Full-Time
Salary Range: $125,000 - $185,000
Clearance Required: Active TS/SCI with Counterintelligence Polygraph
Certification Requirement: Current IAM Level II certification meeting DoD 8570 IAT requirements
We are seeking an experienced AI Platform Engineer to play a critical role in building, maintaining, securing, and optimizing the infrastructure that supports advanced Artificial Intelligence (AI) workloads. This individual will be responsible for designing and managing scalable Kubernetes environments, implementing automated deployment pipelines, and ensuring platform reliability, security, and performance.
The ideal candidate combines deep expertise in cloud-native technologies, Kubernetes administration, DevOps practices, and automation with strong problem-solving and collaboration skills. This role will work closely with engineering, operations, and security teams to deliver highly available AI platform solutions in a mission-critical environment.
Key Responsibilities
Kubernetes & Platform Engineering
- Design, deploy, secure, maintain, and upgrade highly available Kubernetes clusters across cloud and on-premises environments.
- Manage Kubernetes control plane components, worker nodes, and supporting infrastructure.
- Implement and maintain containerized workloads using Docker and Kubernetes best practices.
- Configure and manage Kubernetes resources including Pods, Deployments, StatefulSets, Services, Ingress, ConfigMaps, Secrets, Persistent Volumes, and Namespaces.
- Support advanced networking configurations, including CNI plugins, network policies, service meshes, and DNS services.
Security & Compliance
- Implement security best practices across Kubernetes environments.
- Manage RBAC, admission controllers, vulnerability scanning, secret management, and network security controls.
- Ensure platform compliance with government and organizational security requirements.
- Support secure deployment practices and infrastructure hardening initiatives.
CI/CD & Automation
- Design, implement, and maintain CI/CD pipelines for containerized applications.
- Utilize GitOps methodologies and tools to automate application deployment and platform management.
- Develop infrastructure as code (IaC) solutions using Terraform, Pulumi, CloudFormation, or similar tools.
- Create automation scripts and tooling using Python, Go, Bash, or related languages.
Monitoring, Observability & Performance
- Implement monitoring, logging, alerting, and observability solutions across platform environments.
- Diagnose and resolve complex performance issues affecting Kubernetes clusters and applications.
- Optimize resource utilization and platform scalability.
- Support distributed tracing, centralized logging, and operational analytics initiatives.
- Apply DevOps and Site Reliability Engineering (SRE) principles to improve platform resilience and operational excellence.
Collaboration & Leadership
- Collaborate with development, operations, security, and infrastructure teams.
- Lead technical initiatives and mentor junior engineers.
- Drive continuous improvement efforts across platform engineering and deployment practices.
- Communicate effectively with technical and non-technical stakeholders.
Required Qualifications
- Extensive experience designing, deploying, and managing Kubernetes environments (EKS, AKS, GKE, OpenShift, or self-managed clusters).
- Advanced knowledge of Docker and containerization technologies.
- Strong understanding of Kubernetes networking, service meshes, and cluster architecture.
- Expertise in Kubernetes security, access controls, and secret management.
- Experience with CI/CD platforms such as Jenkins, GitLab CI/CD, GitHub Actions, Tekton, Argo Workflows, or similar.
- Proficiency with Infrastructure as Code tools including Terraform, Pulumi, or CloudFormation.
- Strong scripting and automation experience using Python, Go, Bash, or similar languages.
- Experience with GitOps tools such as Argo CD.
- Hands-on experience with monitoring and observability platforms including Prometheus, Grafana, ELK/OpenSearch, Datadog, or Splunk.
- Strong Linux/Unix administration background.
- Solid understanding of networking concepts including TCP/IP, DNS, HTTP, and load balancing.
- Expert-level Git and version control experience.
- Exceptional troubleshooting and analytical problem-solving abilities.
- Strong verbal and written communication skills.
- Ability to work effectively in cross-functional teams.
- Experience mentoring engineers and leading technical efforts.
- Strong sense of ownership and accountability.
- Adaptability and commitment to continuous learning.
Preferred Qualifications
- Certified Kubernetes Administrator (CKA)
- Certified Kubernetes Application Developer (CKAD)
- Certified Kubernetes Security Specialist (CKS)
- AWS Certified DevOps Engineer
- Azure DevOps Engineer Expert
- Experience developing Kubernetes Operators and Custom Resource Definitions (CRDs)
- Experience building Internal Developer Platforms (IDPs)
- Familiarity with testing methodologies including unit, integration, and end-to-end testing
Travel Requirements
- Up to 20% travel as required for on-site installations, maintenance, and troubleshooting activities at customer locations or data centers.
Our comprehensive benefits package for full-time salaried employees takes effect on the start date. It includes PPO medical coverage with a Health Savings Account (HSA) option, a vision plan, and dental insurance, with the base dental plan paid for by PGTEK. Premiums for life insurance, short- and long-term disability insurance, and critical illness insurance are covered. PGTEK also offers a matching 401(k) plan and a discount on pet insurance through ASPCA Pet Insurance. An Employee Assistance Program is available at no cost to all employees. PGTEK offers generous PTO and holidays, and an Education Assistance Program is available after 12 months of employment.
ABOUT PGTEK
PGTEK is a consulting organization dedicated to helping clients achieve their business and technology objectives, drawing on our decades of experience and business relationships. PGTEK invests in the educational advancement of our staff by providing the resources needed to complete professional and business certifications. Our company is our people, and we treat them like family.
EOE, including disability/veterans