Artem Yushkovskiy's CV

Email: atemate@duck.com
Location: Berlin, Germany
LinkedIn: atemate
GitHub: atemate

Summary

AI Platform Engineer and Tech Lead with 10+ years spanning security research, ML platforms, and high-scale distributed AI systems. Led a squad of 10, driving technical direction for AI infrastructure at global scale: self-hosted image generation (10k+ images nightly in batch, plus live near-realtime streaming), 2x 20+ fraud detection models across 70+ countries (<10ms latency), and an internal ML platform on GCP Vertex AI used by 5+ teams. Designed and open-sourced asya.sh, a Kubernetes-native actor mesh for AI workloads, presented at KubeCon EU 2026. Looking to build global AI platforms that let teams ship, test, and scale AI and ML workloads fast.

Experience

Sr AI Engineer - Distributed AI Orchestration, GenAI (images), Delivery Hero -- Berlin, DE

Aug 2023 – present

Owned architecture decisions for a 10-person squad (5 engineers, 5 data scientists); championed modular AI over monolithic models: smart routing to specialized models per problem type rather than one large black-box model, enabling divide-and-conquer at scale with clear ownership boundaries (engineers own infra/pre-processing, data scientists own model logic and deployment via platform tooling)
Designed and scaled a self-hosted AI image processing platform on AWS/Kubernetes under strict cost constraints; GPU-bound multi-model pipelines per image (production models, scoring, analysis, guardrails); 100K+ images nightly in batch, then migrated to near-realtime streaming (<5 min SLA, shrinking) with zero application code changes; early A/B tests show measurable lift in clicks and purchases at global scale
Designed CI/CD for a monorepo with 15+ model packages building multi-GB GPU images; reduced end-to-end deployment cycle from days to ~1 hour; build time 4h → 20 min (cache miss), 1h → 6 min (cache hit); pod startup time for some workloads 20 min → 3 min; image sizes 3-4x smaller [Docker Buildx, uv, optimized layer caching]
Recognized the actor mesh pattern's general value for AI and agentic workloads beyond image processing; drove the initiative to re-implement and open-source it as asya.sh; developed with a high-velocity AI-assisted workflow over ~3 months (core actors → Crossplane migration → agentic features → DevX); now live and being adopted internally
Authored company-wide RFC on GPU capacity and cost strategy (training + serving): 18-36 month demand forecasting, usage patterns across teams, negotiation baseline for cloud provider agreements; defined SLA targets and cost governance for GPU workloads
Mentored engineers through two promotions (junior → mid, mid → senior); continuously coach data scientists on engineering best practices (testing, CI/CD, clean code) to raise tech culture across the team; active contributor to the ML chapter and weekly operations reviews, driving cross-team knowledge sharing

ML Engineer, ML Platform → Fraud Detection, Delivery Hero -- Berlin, DE

Nov 2021 – Aug 2023

Designed and built an internal ML Platform on GCP Vertex AI, providing shared data and ML pipeline infrastructure across teams (CLV, food ontology, dish normalization, email suspicion, fraud detection)
Owned full lifecycle of 20+ parameterized ML models for voucher fraud detection across 70+ countries (4 continents): built end-to-end CI/CD enabling data scientists to iterate fast against rapidly shifting fraud patterns (nightly batch pipelines on BigQuery/Airflow, per-country model training on Vertex AI, automated evaluation with data drift tracking via Evidently, results reported to GitHub Actions for promotion decisions), saving $100K+/month
Optimized a two-model serving pipeline to <10ms per request on every food order: email suspicion (DynamoDB caching) → fraud scoring consuming multiple signals → rule-based system; FastAPI + XGBoost
Engineered dataset versioning for strict model comparability without data leakage; re-implemented across two teams, then proposed as a company-wide standard (RFC adoption in progress)
On-call SRE for fraud models and the global Customer Data Platform (low-latency feature aggregation, Flink, DynamoDB); monitoring and observability (DataDog, OpsGenie)
Co-authored with Google: "How Delivery Hero connected GitHub with Vertex AI to manage 20+ voucher fraud detection models" (Google Cloud Blog, Nov 2023)

Software Engineer → MLOps Engineer, Neuromation / Neu.ro -- St Petersburg, RU

2018 – 2021

Built core of a multi-cloud Kubernetes-based ML platform ("Docker-like experience on any cloud and bare metal"): REST API microservices, CLI/SDK, user management, distributed storage, resource orchestration
Prototyped and integrated ML tooling into the platform (DVC, Seldon, Feast, Pachyderm); ran internal Kubernetes seminars for the team
Led production computer vision projects for clients (object detection, classification, RabbitMQ-driven inference, MongoDB, Kubernetes)

Security Researcher → Software Engineer, ptsecurity, Aalto University

2015 – 2018

Automated QA of web security scanners and application firewalls (DAST, SAST, WAF), vulnerability search, static code analysis, abstract interpretation, SMT solvers
Portability analysis of C-like programs (MSc thesis project)

Projects

Asya🎭 - Open-source Kubernetes-native Actor Mesh for AI Orchestration

Nov 2025

Distilled from 3+ years of production AI workloads; Crossplane-based, sidecar architecture (Python runtime + Asya sidecar), pluggable transports (SQS, RabbitMQ, Pub/Sub), pluggable state backends (S3, GCS, Redis, NATS KV)
Stateless actors with dynamic routing ("the message knows the way"); flow compiler translates Python control flow into actor graphs; KEDA-based per-actor autoscaling including scale-to-zero
Gateway protocols: A2A agents, MCP tools, SSE streaming, REST; pure Python handlers with zero framework coupling, testable locally
Clean separation of business logic (data scientists) and infrastructure config (platform engineers), solving the ownership handover problem in AI teams ("two files, two owners")
asya.sh | GitHub

Public Speaking

KubeCon EU 2026 Amsterdam (13k+), AI Infra Summit Munich (500+), AI in Production Berlin (150+), Agents in Production virtual
MLOps Community Berlin meetup organizer; led Ask-Me-Anything initiative in MLOps Community Slack (2021-2023)

Education

Aalto University (Helsinki) & ITMO University (St Petersburg), MSc in Computer Science

2016 – 2018

Double-degree programme: Information Security and Cloud Computing, with honors.
Thesis: Automated Analysis of Weak Memory Models.

ITMO University (St Petersburg), BSc in Computer Science

2012 – 2016

Information Security, with honors.
Thesis: Development of the Code Property Graph Construction Module for the Static Analyser ApplicationInspector.Net.

Skills

AI Infrastructure & Inference: self-hosted model serving, inference optimization, GPU orchestration (KEDA, scale-to-zero), batch & streaming inference, model deployment, async actor-based pipelines, AI guardrails, image generation (SDXL), LLM integration, vLLM, Ray Serve, Triton, PyTorch, HuggingFace, ONNX

Platform Engineering & Developer Experience: Kubernetes (CRDs, operators, Helm, Kustomize), GitOps (ArgoCD, Flux), IaC (Terraform), CI/CD (GitHub Actions, Drone CI, Docker Buildx, Kaniko), internal developer platform, multi-tenancy, RBAC, cost optimization, observability (OpenTelemetry, Prometheus, Loki, Grafana, DataDog), alerting (OpsGenie), secret management (Vault)

MLOps & Data: ML lifecycle (Vertex AI, MLflow, W&B), pipeline orchestration (Airflow, KFP, Flyte, Metaflow), feature stores, dataset versioning (DVC), A/B testing, experimentation, data quality (Great Expectations, Evidently), vector search (Redis, Milvus), stream processing (Flink), low-latency model serving (<10ms)

Programming & Cloud Infrastructure: Python (asyncio, FastAPI, uv), Go, Bash, Linux, AWS (EKS, SQS, SNS, Lambda, ECR), GCP (Vertex AI, Pub/Sub, Cloud Run, Cloud Build, GCS, Artifact Registry), PostgreSQL, DynamoDB, Redis, RabbitMQ, NATS, MongoDB

AI-first Workflow & Agent Architecture: built asya.sh (~3 months) using an AI-first workflow (Claude Code CLI, parallel sandboxed agents, git worktree isolation), deep knowledge of agent framework internals (ADK, A2A, MCP), eval pipelines, human-in-the-loop workflows, Langfuse, large-scale codebase automation

Spoken Languages: English, Russian (fluent); French (intermediate); German (learning)

Certifications: GCP Professional Cloud Architect (2025), GCP Professional Machine Learning Engineer (2023, expired), KubeCon + CloudNativeCon Speaker (2026)