Projects
Code-Fix Agent: Self-Correcting AI System
A goal-directed AI agent that takes broken Python scripts, executes them, diagnoses the failure, patches the code, and retries — autonomously. Built with a LangGraph state machine, human-in-the-loop approval, LangSmith tracing, and a 95% fix rate across 20 diverse error types. Phase 3 of a 6-phase agentic AI curriculum.
Bosch Production Line: Predictive Quality Control
End-to-end ML pipeline predicting manufacturing failures on 1.18M rows with 171:1 class imbalance. Engineered path features revealing a 72× failure-rate lift: certain station paths fail at 41.7% vs a 0.58% global mean. Chunk-aware cross-validation and a phased feature roadmap progressing from MCC 0.19 → 0.33, targeting ≥ 0.52.
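The path-feature idea reduces to computing a per-path failure rate and its lift over the global mean. A minimal sketch with invented toy numbers (the Bosch data itself is far larger and the real pipeline is chunked):

```python
from collections import Counter

def path_failure_rates(rows):
    """rows: iterable of (station_path, failed) pairs.
    Returns {path: (failure_rate, lift_vs_global_mean)}."""
    totals, fails = Counter(), Counter()
    for path, failed in rows:
        totals[path] += 1
        fails[path] += failed
    global_rate = sum(fails.values()) / sum(totals.values())
    return {p: (fails[p] / totals[p], (fails[p] / totals[p]) / global_rate)
            for p in totals}

# Tiny illustration (invented numbers, not the Bosch data):
rows = [("S0->S1", 1)] * 4 + [("S0->S1", 0)] * 6 + [("S0->S2", 0)] * 90
rates = path_failure_rates(rows)   # S0->S1 fails at 40% vs a 4% global mean
```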
Silent Recalls: Live Vehicle Safety Monitoring
Production-grade ETL pipeline monitoring NHTSA complaints with live risk tracking. Automated detection of vehicles with dangerous complaint-to-recall ratios. GMC Sierra 1500: 445 complaints, zero recalls. Weekly automated runs with hash-based alerting.
Bearing Failure Prediction: 2.88h Accuracy
Production-grade ML system predicting bearing remaining useful life (RUL) to within 2.88 hours in critical zones. A 10× improvement through weighted loss optimization, with an estimated $300K in annual savings and 98.5% of failures prevented.
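The "weighted loss" idea is to penalize prediction error much more heavily when the bearing is close to failure. A minimal sketch of a zone-weighted absolute error — the threshold and weight values here are invented for illustration, not the project's tuned values:

```python
def weighted_rul_loss(y_true, y_pred, critical=10.0, w_critical=10.0):
    """Mean absolute error, up-weighted when true RUL falls in the critical zone.
    `critical` (hours) and `w_critical` are illustrative placeholders."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        weight = w_critical if t <= critical else 1.0
        total += weight * abs(t - p)
    return total / len(y_true)

# Two predictions with identical 3-hour errors: the near-failure one
# dominates the loss, steering the model to be accurate where it matters.
loss = weighted_rul_loss([5.0, 100.0], [8.0, 103.0])
```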
About
I design and build machine learning systems and agentic AI pipelines that convert raw data into decisions, tools, and automated workflows. My foundation is end-to-end ML — data engineering, feature design, model development, and production automation — with a focus on predictive systems, risk modeling, and operational intelligence.
I'm building through a structured 6-phase agentic AI curriculum, each phase shipping a real working system. Phase 1 built a CLI research agent from scratch — raw tool-calling loops, web search, and structured report generation. Phase 2 added persistent knowledge retrieval: hybrid RAG with BM25 and dense search, cross-encoder reranking, claim verification, and a persistent memory layer.
Phase 3 shipped a self-correcting code-fix agent that executes Python scripts, diagnoses failures, patches them, and retries autonomously, using a LangGraph state machine with human-in-the-loop approval, LangSmith tracing, and a 95% fix rate across 20 test cases. Phase 4 moves into real-world workflow pipelines, Phase 5 covers production deployment with FastAPI and LangFuse, and Phase 6 is a controlled multi-agent experiment comparing single-agent vs multi-agent performance on the same tasks.
I'm interested in systems that anticipate failures, learn from context, and operate reliably in production — predictive maintenance, autonomous research pipelines, or agents that debug and fix themselves.
Skills
Machine Learning & AI
- Predictive Modeling & Risk Systems
- Time Series Analysis
Agentic AI Systems
- LLM Tool Calling & Agent Loops
- RAG — Hybrid Retrieval & Reranking
- LangGraph State Machines
- Claim Verification & Grounding
Data & Engineering
- Python, SQL, PostgreSQL
- ETL Pipeline Design
- Feature Engineering
- Statistical Analysis
Infrastructure & Delivery
- GitHub Actions & CI Automation
- Streamlit & FastAPI
- Netlify Deployment
- Structured Run Logging & Cost Tracking
Learning Log
Knowledge Agent with Hybrid RAG — Project Deep Dive
Phase 2 upgrade of the research agent into a persistent knowledge system with hybrid retrieval (dense + BM25), cross-encoder reranking, and claim verification. This architecture moves from one-shot web lookup to reusable memory, improving routing reliability and grounded answer quality.
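Hybrid retrieval has to merge the BM25 and dense rankings into one list. One common way is reciprocal rank fusion (RRF); the write-up doesn't specify which fusion method the project uses, so this is an assumed sketch with invented document names:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(doc) = sum over lists of 1 / (k + rank).
    k=60 is the commonly cited default damping constant."""
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25  = ["doc_a", "doc_b", "doc_c"]   # lexical ranking
dense = ["doc_b", "doc_c", "doc_a"]   # embedding ranking
fused = rrf_fuse([bm25, dense])       # doc_b wins: strong in both lists
```

A cross-encoder reranker would then rescore the top of the fused list against the query.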
CLI Research Agent — Project Deep Dive
A language model that drives a multi-step research workflow autonomously. No frameworks, just direct API calls, a messages array, and a loop. Built to understand how agents actually work at the mechanical level.
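The mechanics described above — a messages array, a loop, and tool dispatch — fit in a few lines. This sketch mocks the model so it runs offline; in the real agent, `model(messages)` is a raw API call, and all names here are invented:

```python
def agent_loop(model, tools, user_query, max_steps=5):
    """Minimal tool-calling loop: each turn, the model either requests
    a tool (whose result is appended to the messages) or answers."""
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_steps):
        reply = model(messages)            # LLM API call in the real agent
        if reply.get("tool") is None:
            return reply["content"]        # final answer: exit the loop
        result = tools[reply["tool"]](reply["args"])
        messages.append({"role": "tool", "content": result})
    return None

# Mock model: first asks for a search, then answers from the tool result.
def mock_model(messages):
    if messages[-1]["role"] == "user":
        return {"tool": "search", "args": "python release"}
    return {"tool": None, "content": f"Found: {messages[-1]['content']}"}

answer = agent_loop(mock_model, {"search": lambda q: f"results for {q}"},
                    "latest python release?")
```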
Supervised Learning Models — Concept Exploration
A breakdown of the model families you reach for most in production: linear models, tree-based ensembles, and gradient boosting. Deep dives on each.
LightGBM — A Practitioner's Guide
How LightGBM actually works — leaf-wise growth, histogram binning, and why it often beats XGBoost on tabular data. Includes practical tuning notes and common pitfalls.
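Histogram binning replaces raw continuous values with small integer bin ids, so split finding scans a few hundred bins instead of every unique value. A simplified equal-width sketch — LightGBM's actual binning is quantile-aware and handles sparsity, so treat this as the idea only:

```python
def histogram_bins(values, n_bins=4):
    """Map continuous values to integer bin ids over equal-width bins.
    LightGBM's real binning is quantile-aware; equal width keeps this short."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

bins = histogram_bins([0.0, 1.0, 2.0, 3.0, 4.0], n_bins=4)
```

With features reduced to bin ids, candidate splits become "bin <= b" boundaries, which is what makes histogram-based boosting fast on wide tabular data.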
Data Preprocessing — Concept Exploration
A systematic series on preparing data for machine learning: feature scaling, encoding, missing data, and outlier treatment.
Data Preprocessing — Feature Scaling Deep Dive
Why scaling matters, when it doesn't, and how to pick the right method for your model. A concept exploration covering StandardScaler, MinMaxScaler, RobustScaler, and the assumptions each one makes about your data.
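The contrast between the scalers comes down to which statistics they standardize against. A stdlib sketch of the standard (mean/std) and robust (median/IQR) transforms — simplified stand-ins for the scikit-learn classes, not their implementations:

```python
import statistics

def standard_scale(xs):
    """(x - mean) / std: assumes roughly Gaussian data, sensitive to outliers."""
    mu, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]

def robust_scale(xs):
    """(x - median) / IQR: far less sensitive to outliers."""
    q1, q2, q3 = statistics.quantiles(xs, n=4)
    return [(x - q2) / (q3 - q1) for x in xs]

raw = [1.0, 2.0, 3.0, 4.0, 1000.0]   # one extreme outlier
z = standard_scale(raw)               # outlier drags mean and std upward
r = robust_scale(raw)                 # median and IQR barely notice it
```

With the outlier present, the standard-scaled inliers get crushed toward the mean, while the robust transform keeps the median point exactly at zero — the practical reason to reach for RobustScaler on contaminated data.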
California Housing Price Prediction — ML Deep Dive
A hands-on ML learning series covering data loading, EDA, visualization, feature engineering, and stratified sampling — concept by concept.
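Stratified sampling, the last concept in that series, just means splitting within each group so the test set preserves class proportions. A minimal stdlib sketch (function and column names invented; scikit-learn's `train_test_split(..., stratify=...)` does this in practice):

```python
import random
from collections import defaultdict

def stratified_split(rows, key, test_frac=0.2, seed=42):
    """Split rows so each stratum contributes ~test_frac of its members
    to the test set, preserving group proportions."""
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[key(row)].append(row)
    rng = random.Random(seed)
    train, test = [], []
    for members in by_stratum.values():
        rng.shuffle(members)
        cut = int(len(members) * test_frac)
        test.extend(members[:cut])
        train.extend(members[cut:])
    return train, test

# 80/20 class mix is preserved: the test set gets 16 "low" and 4 "high".
rows = [("low", i) for i in range(80)] + [("high", i) for i in range(20)]
train, test = stratified_split(rows, key=lambda r: r[0])
```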
Spaceship Titanic SQL Case Study
SQL-powered analysis uncovering how CryoSleep dictated passenger outcomes, debunking "planet" and "deck" myths, and revealing one true spatial anomaly.
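The core move in that analysis is an outcome rate grouped by CryoSleep status. A self-contained sqlite3 sketch of that query shape, with an invented ten-rows-per-group mini-table rather than the real Kaggle data:

```python
import sqlite3

# Invented mini-table; the real case study queries the Spaceship Titanic dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passengers (cryo INTEGER, transported INTEGER)")
conn.executemany(
    "INSERT INTO passengers VALUES (?, ?)",
    [(1, 1)] * 8 + [(1, 0)] * 2 + [(0, 1)] * 3 + [(0, 0)] * 7,
)

# Outcome rate by CryoSleep status: the shape of the case study's core query.
rows = conn.execute("""
    SELECT cryo, AVG(transported) AS transported_rate, COUNT(*) AS n
    FROM passengers
    GROUP BY cryo
    ORDER BY cryo
""").fetchall()
```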
Contact
Open to full-time opportunities and collaborative projects.