Ensemble classifier trained on 775K+ OFLC employer applications to predict US work visa certification outcomes. Tuned Random Forest achieved F1=0.82 with 87% recall — wage-to-prevailing-rate ratio surfaced as the dominant approval signal.
Python · scikit-learn · Random Forest · XGBoost · GBM
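For readers who want to relate the two reported metrics: F1 is the harmonic mean of precision and recall, so the stated F1 and recall together imply a precision. The sketch below is only that arithmetic, not the project's evaluation code; the function names are illustrative, and the result is approximate because the reported figures are rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 and recall are inconsistent (need 2 * recall > F1)")
    return f1 * recall / denom


# Reported values from the blurb above (rounded, so the result is approximate).
precision = implied_precision(f1=0.82, recall=0.87)
print(f"implied precision ~ {precision:.3f}")
```

With the rounded inputs, the implied precision comes out a little under 0.78, which is consistent with a model tuned to favor recall.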
Predicted quarterly revenue for a multi-city supermarket chain using tree-based ensemble regressors. Tuned XGBoost delivered R²=0.90 and MAPE=6.45%, enabling reliable inventory planning and data-driven procurement decisions at scale.
Python · XGBoost · scikit-learn · Pandas · Seaborn
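As a reminder of what the two regression metrics measure, here is a minimal pure-Python sketch of R² and MAPE. The revenue numbers are made up for illustration and are not from the project's dataset; the project itself presumably relied on library implementations.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no true value is 0."""
    errors = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100 * sum(errors) / len(errors)


# Hypothetical quarterly revenues (illustrative only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"R2   = {r2_score(actual, predicted):.4f}")
print(f"MAPE = {mape(actual, predicted):.2f}%")
```

R² describes how much of the variance the model explains, while MAPE expresses the typical error relative to each actual value, which is why the two are often reported together for revenue forecasts.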
Classified which bank customers are likely to purchase a personal loan, optimizing for recall so that as few potential buyers as possible are missed. Pre-pruned decision tree hit AUC=0.97 with 100% recall; income and monthly credit spend were the strongest predictors.
Python · scikit-learn · Decision Tree · Pandas
RAG pipeline over the Merck Medical Manuals to assist clinicians with diagnostics, drug lookups, treatment protocols, and critical care queries. LangChain + ChromaDB retrieval grounds a local Llama LLM in vetted medical sources, reducing hallucination risk in high-stakes scenarios.
Python · LangChain · ChromaDB · Llama · HuggingFace
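The retrieve-then-ground idea can be sketched without any of the libraries named above. The example below is a toy stand-in, not the project's pipeline: it swaps ChromaDB's embedding search for a bag-of-words cosine similarity, the chunk texts are placeholders rather than content from the manuals, and every function name is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (toy vector search)."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    """Place retrieved text in the prompt so the model answers from it."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Placeholder chunks (hypothetical; not taken from any medical source).
chunks = [
    "Section A: dosing guidance for drug X.",
    "Section B: triage steps for condition Z.",
    "Section C: contraindications for drug Y.",
]
print(build_prompt("dosing for drug X", chunks, k=1))
```

Restricting the model to retrieved passages, and telling it to admit when they fall short, is the part of the design that limits hallucination; the retrieval backend itself is interchangeable.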
Transfer learning pipeline (VGG-16 + FFNN) to detect helmet compliance from construction site camera images. Trained on 631 images across varied lighting and angles with data augmentation; achieved 100% test accuracy, with no misclassifications in the confusion matrix.
Python · TensorFlow · VGG-16 · Keras · OpenCV
Exploratory analysis of a food aggregator's order dataset to surface demand patterns by cuisine type, delivery timing, and customer ratings. Identified a 5.87-minute weekday delivery lag and proposed targeted promotions to close the gap and drive retention.
Python · Pandas · Matplotlib · Seaborn