The Machinist

Applied Machine Learning Engineer • End-to-end ML pipelines • Predictive systems • Decision automation • Risk modeling

About

I design and build machine learning systems and agentic AI pipelines that convert raw data into decisions, tools, and automated workflows. My foundation is end-to-end ML — data engineering, feature design, model development, and production automation — with a focus on predictive systems, risk modeling, and operational intelligence.

I'm building through a structured 6-phase agentic AI curriculum, each phase shipping a real working system. Phase 1 built a CLI research agent from scratch — raw tool-calling loops, web search, and structured report generation. Phase 2 added persistent knowledge retrieval: hybrid RAG with BM25 and dense search, cross-encoder reranking, claim verification, and a persistent memory layer.

Phase 3 — a self-correcting code-fix agent that executes Python scripts, diagnoses failures, patches them, and retries autonomously using a LangGraph state machine with human-in-the-loop approval, LangSmith tracing, and a 95% fix rate across 20 test cases. Phase 4 moves into real-world workflow pipelines, Phase 5 covers production deployment with FastAPI and LangFuse, and Phase 6 is a controlled multi-agent experiment comparing single-agent vs multi-agent performance on the same tasks.

I'm interested in systems that anticipate failures, learn from context, and operate reliably in production — predictive maintenance, autonomous research pipelines, or agents that debug and fix themselves.

Skills

Machine Learning & AI

  • Predictive Modeling & Risk Systems
  • Time Series Analysis

Agentic AI Systems

  • LLM Tool Calling & Agent Loops
  • RAG — Hybrid Retrieval & Reranking
  • LangGraph State Machines
  • Claim Verification & Grounding

Data & Engineering

  • Python, SQL, PostgreSQL
  • ETL Pipeline Design
  • Feature Engineering
  • Statistical Analysis

Infrastructure & Delivery

  • GitHub Actions & CI Automation
  • Streamlit & FastAPI
  • Netlify Deployment
  • Structured Run Logging & Cost Tracking

Learning Log

Knowledge Agent with Hybrid RAG — Project Deep Dive

Phase 2 upgrade of the research agent into a persistent knowledge system with hybrid retrieval (dense + BM25), cross-encoder reranking, and claim verification. This architecture moves from one-shot web lookup to reusable memory, improving routing reliability and grounded answer quality.

CLI Research Agent — Project Deep Dive

A language model that drives a multi-step research workflow autonomously. No frameworks, just direct API calls, a messages array, and a loop. Built to understand how agents actually work at the mechanical level.

Supervised Learning Models — Concept Exploration

A breakdown of the model families you reach for most in production —> linear models, tree-based ensembles, and gradient boosting. Deep dives on each.

LightGBM — A Practitioner's Guide

How LightGBM actually works — leaf-wise growth, histogram binning, and why it often beats XGBoost on tabular data. Includes practical tuning notes and common pitfalls.

Data Preprocessing — Concept Exploration

A systematic series on preparing data for machine learning —> feature scaling, encoding, missing data, and outlier treatment.

Data Preprocessing — Feature Scaling Deep Dive

Why scaling matters, when it doesn't, and how to pick the right method for your model. A concept exploration covering StandardScaler, MinMaxScaler, RobustScaler, and the assumptions each one makes about your data.

California Housing Price Prediction — ML Deep Dive

A hands-on ML learning series covering data loading, EDA, visualization, feature engineering, and stratified sampling — concept by concept.

Spaceship Titanic SQL Case Study

SQL-powered analysis uncovering how CryoSleep dictated passenger outcomes, debunking "planet" and "deck" myths, and revealing one true spatial anomaly.

Contact

Open to full-time opportunities and collaborative projects.