Hajed Khlifi

Principal Solution Architect — DevOps, Cloud, AI & HPC Infrastructure

I architect and deploy production-ready, secure, large-scale platforms on air-gapped, sovereign, hybrid, and public clouds using Kubernetes, OpenShift, NVIDIA HGX, AWS, GCP, and enterprise-edition tooling for regulated markets.

NVIDIA NCP-AIO K8s CNCF Kubestronaut ☁ AWS / GCP Architect 🎤 International Speaker
7+ Years in DevOps & Cloud
10+ Professional Certifications
20+ Conference Participations
5+ International Talks
3 Languages Spoken

Consulting Services

End-to-end AI infrastructure delivery — from GPU cluster design to production-grade Kubernetes platforms for regulated European markets.

🔸
GPU Cluster Architecture
Design and deploy high-density GPU clusters (H100/H200) on Cisco C885A and NVIDIA DGX stacks. Integrate the NVIDIA AI stack (GPU Operator, CUDA, MIG) with vLLM for production inference.
NVIDIA H200 DGX Systems Cisco UCS CUDA MIG
Kubernetes & Cloud Native
Architect and automate large-scale Kubernetes platforms with custom operators, Helm charts, and GitOps pipelines. OpenShift, GKE, GDCH, and bare-metal deployments across hybrid environments.
Kubernetes OpenShift Operators Helm ArgoCD
🔒
Sovereign & Air-Gapped Cloud
Build secure, sovereign cloud platforms for European government and regulated clients. Air-gapped deployments, Google Distributed Cloud Hosted (GDCH), PKI, Vault, and full compliance frameworks.
GDCH Air-Gapped Vault PKI Compliance
🤖
AI Platform Deployment
Deploy production-grade AI suites including Mistral AI, RAG systems, Agentic AI, and LLM orchestration. Triton Inference Server, RAPIDS, vector databases, and AI Studio integration.
Mistral AI RAG Triton vLLM Vector DB
DevOps & Automation
Infrastructure as Code with Terraform and Ansible. CI/CD pipelines with GitLab CI, ArgoCD, and Tekton. Full observability stacks with Prometheus, Grafana, and ELK.
Terraform Ansible GitLab CI Prometheus Grafana
🎓
Training & Technical Advisory
Technical workshops, team coaching, demos, and architecture reviews. Trusted advisor to government and enterprise clients, translating regulatory requirements into secure technical architectures.
Workshops Demos Advisory Architecture Review

Professional Experience

From deployment analyst to principal architect — a trajectory built on deep hands-on infrastructure expertise.

Proximus Luxembourg Present
Principal Solution Architect — Cloud & AI
Luxembourg, Luxembourg
  • Promoted to lead the infrastructure team. Driving end-to-end deployment of a high-density H200 GPU cluster on Cisco C885A servers for European clients.
  • Integrated the full NVIDIA AI stack (GPU Operator, CUDA) with Harvester/Rancher to expand GPU capacity and deploy production-grade Mistral AI suites.
  • Architected and automated large-scale, secure Kubernetes platforms using custom operators, enabling scalable RAG, Agentic AI, and AI inference systems for EU clients.
  • Coaching the team through technical workshops and documentation. Full lifecycle solution delivery including on-site support.
Proximus Luxembourg 2025 – 2026
Cloud Architect
Luxembourg, Luxembourg
  • Led the strategic design of secure, sovereign cloud and AI platforms for European government clients as a trusted technical advisor.
  • Architected and deployed production-grade AI infrastructure with Mistral AI on air-gapped platforms with NVIDIA GPUs for large-scale inference and RAG systems.
  • Designed and automated Kubernetes environments on Google Distributed Cloud Hosted (GDCH) and OpenShift with custom operators.
  • Applied cloud-native best practices as a Google Cloud Partner consultant for high-availability enterprise architectures.
Sfeir Luxembourg 2024 – 2025
Senior DevOps & Cloud Consultant
Luxembourg, Luxembourg
  • Conducted demos for OpenShift AI, deploying and managing LLMs to validate scalable AI workload orchestration.
  • Upgraded production OpenShift clusters from v4.8 to v4.14 on VMware infrastructure.
  • Architected and automated infrastructure solutions with autoscaling, reducing manual operations by 40% while meeting business SLAs.
  • Mentored development teams in designing highly available, large-scale microservices architectures on OpenShift.
VERMEG — Banking & Insurance 2021 – 2023
DevOps Lead Engineer
Tunis, Tunisia
  • Architected and deployed scalable AWS infrastructure (VPC, EC2, EKS, RDS) for data-intensive banking microservices.
  • Managed and optimized Kubernetes clusters on AWS EKS for containerized data-driven applications.
  • Enhanced GitLab CI/CD pipelines and implemented ArgoCD for GitOps deployment workflows.
VERMEG — Banking & Insurance 2019 – 2021
Deployment Analyst — DevOps
Tunis, Tunisia
  • Architected and managed foundational Linux infrastructure for high-availability systems.
  • Led a SaaS migration POC using Docker and Kubernetes for containerized high-availability solutions.
  • Developed automation scripts to streamline deployments and managed production releases for major clients (BNP, Cardif, MAIF).

Certifications

Industry-recognized certifications across NVIDIA AI, Kubernetes, and major cloud platforms.

NVIDIA AI Infrastructure
NVIDIA-Certified Professional: AI Infrastructure (NCP-AII)
NVIDIA
In Progress
DGX Systems InfiniBand Base Command Networking Storage
NCP-AIO
NVIDIA-Certified Professional: AI Operations (NCP-AIO)
NVIDIA
2026
CUDA Fabric Manager GPU Operator Run:ai Slurm MIG
Kubestronaut
Kubestronaut (Kubernetes Champion)
CNCF — Cloud Native Computing Foundation
2025
CKA CKS CKAD KCNA KCSA
Google Cloud Architect
Google Professional Cloud Architect
Google Cloud
2025
GKE Cloud Run IAM VPC BigQuery
Prometheus Certified Associate
Prometheus Certified Associate (PCA)
Linux Foundation
2024
PromQL Alertmanager Grafana Exporters Service Discovery
AWS Solutions Architect
AWS Solutions Architect Associate
Amazon Web Services
2023
EC2 EKS VPC RDS S3 IAM

Technical Skills

A deep, hands-on stack spanning GPU infrastructure, AI platforms, cloud native, and security.

🔸
NVIDIA Platform
GPU Operator DGX Systems CUDA DOCA MIG vLLM Triton Server RAPIDS NCP-AIO
🤖
AI Infrastructure
Mistral AI Claude API RAG Systems AI Studio LLM Orchestration Vector Databases Agentic AI
Kubernetes & Orchestration
Kubestronaut Helm Custom Operators OpenShift GKE CSI / CNI Harvester Rancher
Cloud Platforms
GDCH / GDCA GCP AWS OpenStack Sovereign Cloud
Infrastructure & Automation
Terraform Ansible Python Bash Cisco UCS/M8
🔒
Security & Observability
HashiCorp Vault Keycloak PKI Prometheus Grafana ELK Stack GitLab CI ArgoCD Tekton

Languages

English Fluent
French Fluent
Arabic Native

Featured Projects

Select engagements showcasing GPU infrastructure, Kubernetes platforms, AI deployments, and cloud-native engineering at scale.

🧠
KubeBrain — Multi-Agent Kubernetes AI Assistant
AI-powered Kubernetes management system built on CrewAI and MCP. Four specialized agents — Teacher, Investigator, Inspector, Executor — collaborate through workflow crews to diagnose cluster issues, validate configurations, explain concepts, and safely execute operations.
CrewAI MCP Python Streamlit Kubernetes Agentic AI
🖥
H200 GPU Cluster — EU Sovereign Cloud
End-to-end design and deployment of a high-density NVIDIA H200 GPU cluster on Cisco C885A servers. Full NVIDIA AI stack (GPU Operator, CUDA, MIG) with Harvester/Rancher running production Mistral AI inference and fine-tuning workloads at scale for European enterprise clients.
NVIDIA H200 Cisco C885A GPU Operator Mistral AI Harvester MIG
🔒
Air-Gapped AI Platform — Government
Architected a full-stack sovereign cloud on Google Distributed Cloud Hosted (GDCH), covering compute, networking, IAM, storage, and Kubernetes with custom operators. Deployed an NVIDIA GPU-powered Mistral AI suite (Mistral Large, Codestral, embeddings) for inference, RAG, and semantic search in fully isolated environments.
GDCH Air-Gapped NVIDIA GPU Mistral Large RAG OpenShift
Insurance Policy ETL — Event-Driven Batch Platform
Kubernetes-native serverless ETL platform for insurance policy extraction and transformation. Spring Batch jobs auto-scaled with KEDA and Knative on event-driven triggers, deployed via Tekton CI/CD pipelines and ArgoCD GitOps. Fully automated infrastructure provisioning with Terraform, Ansible, and Helm.
Spring Batch KEDA Knative Tekton ArgoCD Terraform
🚀
Dapr-Powered HA Microservices Platform
Implemented Dapr (Distributed Application Runtime) as a sidecar-based runtime to simplify service-to-service invocation, state management, pub/sub messaging, and distributed tracing across a large-scale microservices environment — enabling high availability and cloud portability without vendor lock-in.
Dapr Kubernetes Pub/Sub State Management OpenShift Observability
Kubernetes Platform — Enterprise Scale
Architected large-scale Kubernetes platforms with custom operators, integrated observability, and standardized deployment patterns. Powers scalable RAG, Agentic AI, and LLM serving pipelines with reusable platform components adopted across multiple engineering teams.
Kubernetes Custom Operators OpenShift Helm ArgoCD
AWS Banking Microservices — VERMEG
Led the architecture of scalable AWS infrastructure (VPC, EC2, EKS, RDS) for data-intensive banking microservices. GitOps deployment with ArgoCD and optimized EKS clusters serving demos and POCs for major financial clients including BNP, Cardif, and MAIF.
AWS EKS Terraform ArgoCD GitLab CI RDS

International Speaking

Sharing knowledge on cloud-native and AI infrastructure at leading industry conferences across Europe and the US.

Cloud Native Days
Bologna, Italy
2026
Container Days
London, UK
2026
KCD New York
New York, USA
2025
Container Days
Hamburg, Germany
2025
Voxxed Days
Luxembourg, Luxembourg
2025
Cloud Native Days
Luxembourg, Luxembourg
2024
View Full Speaker Profile →
sessionize.com/hajed-khlifi

Academic Background

🎓
Master's Degree in Cloud Architecture Engineering - Evening Courses
Esprit University - Tunis, Tunisia
2019 – 2022
🎓
Bachelor's Degree in Computer Systems & Networking
University of Monastir — Monastir, Tunisia
2016 – 2019
🎓
Bachelor's Degree in Mathematics
Lessouda Superior School
2013 – 2016

Get in Touch

Looking for a GPU infrastructure architect or Kubernetes expert? I'm available for freelance consulting engagements across Europe.

📍
Location
Luxembourg, Luxembourg
💻
Availability
Open to freelance consulting contracts across the EU. Available for remote engagements with on-site support as needed. Engagements range from short-term architecture reviews to long-term platform delivery.

Or email me directly at Kh.hajed@gmail.com