petruarakiss

Production AI · full-time, contract, fractional · Madrid · remote EU

I build AI for when it has to be right.

Madrid-based AI engineer. Twenty years in software, ML since 2015. I build retrieval, agent runtimes, guardrails, evals, and operator screens for systems exposed to production traffic, permission checks, slow APIs, bad context, and support escalation.

Currently: AI Engineering Lead · Atlax360
Experience: 20 yrs software · ML since 2015
Focus: Production LLM · RAG · agents · evals · guardrails
Based: Madrid · Remote across EU

01 / background

Banking-system discipline for production AI.

I have worked in software for twenty years and in ML since 2015. Before AI, I shipped software for banks and retailers, where defects hit money movement, customer data, compliance review, and release windows. That is the lens I bring to retrieval quality, runtime determinism, guardrails, audit trails, and operator interfaces.

20 years

Full-stack architecture and platform work behind financial and B2B products, before moving into production AI.

At Atlax360 I lead AI engineering across three systems: BIFROST (document intelligence), ORVIAN (a multi-tenant workflow runtime), and Polaris (an internal knowledge assistant). They run inside live financial operations; I set the retrieval, evaluation, and guardrail standards the three share, and keeping them reliable under load is the job.

I use coding agents and language models heavily in my own work, and I design around malformed input, missing permissions, rate limits, timeouts, and tool failures. The system should fail closed and keep the prompt, retrieved sources, tool calls, permissions, latency, and decision path inspectable.
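As a rough illustration of "fail closed", here is a minimal Python sketch. Every name in it (`guarded_call`, `Decision`, the permission strings) is hypothetical and chosen for the example; it is not code from BIFROST, ORVIAN, or Polaris. The idea is only that a missing permission or a tool failure yields a denial with an inspectable trace, never a partial result.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Decision:
    allowed: bool
    reason: str
    trace: list[str] = field(default_factory=list)  # inspectable decision path


def guarded_call(
    tool: Callable[[dict], Any],
    args: dict,
    granted: set[str],
    required: str,
) -> tuple[Decision, Any]:
    """Run a tool only if the permission is present; deny on any failure."""
    trace = [f"required={required}"]
    if required not in granted:
        trace.append("denied: missing permission")
        return Decision(False, "missing_permission", trace), None
    try:
        result = tool(args)
    except Exception as exc:  # fail closed: no partial result escapes
        trace.append(f"failed: {type(exc).__name__}")
        return Decision(False, "tool_failure", trace), None
    trace.append("ok")
    return Decision(True, "ok", trace), result
```

The caller always receives a `Decision`, so the denial reason and trace can be logged or shown to an operator whether or not the tool ran.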

Worked with
Banks & retailers

Led engineering teams and full rewrites inside regulated financial environments and high-traffic retail.

12–20 engineer teams

Ran cross-functional platform teams through complex shipping schedules and legacy transitions.

IBM & Linux Foundation

IBM Machine Learning Professional and Linux Foundation Node.js developer certifications.

Clients: BBVA · Santander · Bankinter · Decathlon · El Corte Inglés

02 / current work

Three systems I lead at Atlax360.

Most of what I build sits behind enterprise firewalls. These three show the kind of architecture I work on: ingestion pipelines, a multi-tenant runtime, and a guardrailed assistant.

01

BIFROST

document intelligence

Document intelligence for the documents finance actually runs on: ingestion quality gates, semantic and visual chunking, pgvector/HNSW search, caching, source-quality scoring, analytics, and explicit no-answer paths when evidence is weak.

02

ORVIAN

workflow runtime

Multi-tenant AI workflow runtime: context assembly, durable memory, deterministic and cached execution tiers, queue processing, idempotency, run events, and human-review metadata when automation should stop.
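To make the idempotency and run-event ideas concrete, here is a small Python sketch under stated assumptions: an in-memory store, a key derived from tenant, step name, and payload, and a hypothetical `IdempotentRunner` class. It is an illustration of the general pattern, not ORVIAN's implementation, which this page does not describe in detail.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Tuple


class IdempotentRunner:
    """Executes a step at most once per (tenant, step, payload)."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}        # assumption: in-memory store
        self.events: List[Tuple[str, str]] = []   # run events: (kind, key)

    @staticmethod
    def key(tenant: str, step: str, payload: dict) -> str:
        # Canonical JSON so that key order in the payload does not matter.
        body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(f"{tenant}|{step}|{body}".encode()).hexdigest()

    def run(self, tenant: str, step: str, payload: dict,
            fn: Callable[[dict], Any]) -> Any:
        k = self.key(tenant, step, payload)
        if k in self._results:
            self.events.append(("replayed", k))   # duplicate delivery: no re-run
            return self._results[k]
        result = fn(payload)
        self._results[k] = result
        self.events.append(("executed", k))
        return result
```

Including the tenant in the key keeps identical payloads from different tenants isolated; a durable store would replace the dictionary in any real deployment.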

03

Polaris

internal assistant

Internal assistant that combines BIFROST retrieval with guardrails, citations, streaming UX, suggestion revalidation, and operator analytics. One surface for support, sales, and product teams.

03 / open source

Deterministic utilities, shared publicly.

The patterns behind my production work, extracted into Rust and TypeScript libraries: permission boundaries, trace evidence, governed memory, and structured observability.

gommage / Rust

Deterministic policy engine for AI coding agents: maps tool calls to capabilities, evaluates YAML rules, and signs every decision in a verifiable audit log, with hard-stops that policy can't bypass.

github.com/Arakiss →
nahuali / Rust

Self-inspecting, auditable memory for AI agents: surfaces the evidence, provenance, and health behind each recall so callers can see which memory to trust, with an optional Ed25519-signed tamper-evident ledger. Local-first, Rust.

github.com/Arakiss →
traceframe / Rust

Local-first trace recorder for AI agent runs: append-only, verifiable evidence of what the agent called, what it was allowed, and what failed, with hook ingestion for Codex/OMX harnesses.

github.com/Arakiss →
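For readers unfamiliar with append-only, verifiable logs, the following Python sketch shows one common way to get tamper evidence: a SHA-256 hash chain, where each entry commits to the previous one. This is a generic illustration with hypothetical function names; it is not traceframe's format or API, and it omits the signatures that the libraries above describe.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumption: fixed placeholder hash before the first entry


def _digest(prev: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev + body).encode()).hexdigest()


def append(log: List[Dict], event: dict) -> Dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev, "hash": _digest(prev, event)}
    log.append(entry)
    return entry


def verify(log: List[Dict]) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

Changing any recorded event after the fact makes `verify` return `False`, because every later hash depends on the original content.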
vestig / TypeScript

Runtime-agnostic structured logging with automatic PII sanitization (GDPR/HIPAA/PCI-DSS) and native W3C tracing. Zero dependencies; runs on Node, Bun, Deno, Edge, and the browser.

github.com/Arakiss →
greco / Rust

Research harness exploring whether a coding-agent harness can measurably improve itself through typed, layered modifications validated against operator-defined evals within strict budgets.

github.com/Arakiss →

04 / what i work on

From index to interface.

I take a production AI path end to end, from retrieval index to operator workflow. The scope includes runtime state, evals, guardrails, latency, permissions, and failure handling in the product UI.

Retrieval

Chunking, metadata, permissions, source quality, citations, and caching, with explicit refusal when retrieved sources are weak, contradictory, missing, or out of scope.
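A minimal Python sketch of explicit refusal on weak evidence, under assumptions that are mine rather than the page's: similarity scores in [0, 1], a score threshold of 0.75, and a requirement of at least two strong sources. The `Hit` type and `select_evidence` function are hypothetical names for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hit:
    source: str
    score: float  # assumption: similarity in [0, 1], higher is better


def select_evidence(
    hits: List[Hit],
    min_score: float = 0.75,  # assumed threshold; tune against an eval set
    min_hits: int = 2,        # assumed minimum number of strong sources
) -> Optional[List[Hit]]:
    """Return strong hits, best first, or None to signal 'no answer'."""
    strong = sorted(
        (h for h in hits if h.score >= min_score),
        key=lambda h: h.score,
        reverse=True,
    )
    if len(strong) < min_hits:
        return None  # abstain instead of answering from weak evidence
    return strong
```

Returning `None` forces the caller to handle the no-answer path explicitly, which is the point: the refusal is a designed outcome, not an exception.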

Agent runtimes

Tool boundaries, stopping conditions, traces, handoffs, evaluators, queues, and cost control, defined before an agent reaches production.

Evaluation

Eval sets, abstain logic, and regression traces. The discipline of proving a change helped instead of assuming it did.

Compliance-aware operations

Permissions, audit trails, PII handling, and human-in-the-loop review for AI running inside regulated finance, where “mostly right” isn't good enough.

Harness engineering

Repository context, executable plans, browser checks, CI gates, and review logs around coding agents, so their output stays verifiable inside a real codebase.

Product surfaces

Next.js, TypeScript, streaming UX, and operator screens for model failures, showing retrieved evidence, failure state, audit trail, escalation path, and the next valid action.


05 / how i work

How I work.

I work async and stay accountable for retrieval quality, runtime behavior, latency, cost, permissions, and operator workflows. Madrid-based, remote-first across the EU. Open to Staff, Principal, Architect, and Forward Deployed roles, and to contract or fractional engagements, where retrieval, agents, evals, and guardrails are part of the shipped product.

Production stack

  • Python · FastAPI · TypeScript · Next.js
  • PostgreSQL · pgvector · Redis · Supabase
  • OpenAI · Anthropic · Vercel AI SDK
  • Evals · traces · guardrails · observability

Good fit for

  • Teams shipping AI inside regulated or operationally heavy environments
  • Roles that need retrieval, runtime, evals, and product judgment in one architect
  • Organizations past the demo phase and into reliability, cost, and permissions

06 / questions

Common questions.

Quick facts on stack, roles, and fit. The CV has the full timeline.


07

Hire me to ship your retrieval or agent system into production.

I work with teams shipping LLM features that need owned retrieval quality, cost ceilings, latency targets, permission checks, and failure review. Available full-time, contract, or fractional. Madrid-based, remote across the EU.

© 2026 Petru Arakiss · Madrid · AI engineer · full-time, contract, fractional · remote across the EU