Harsh Gupta

Data & AI Engineer @ Indiana University Bloomington

Building scalable data systems and applied AI products.

3+ years of experience across Data & AI engineering, data analysis and ML.
Ex-Deloitte · Ex-Samsung Research Intern

Open to full-time roles in Data Science & Engineering, Analytics, Machine Learning, or AI starting May 2026.

About Me

I'm Harsh Gupta, with over 3 years of experience in AI and Data roles across diverse industries including education, research, retail, and insurance. I’ve contributed to projects at Deloitte and Samsung Research, working on everything from data engineering and analysis to computer vision and intelligent automation.

Currently, I'm pursuing a Master’s in Data Science at Indiana University Bloomington while working part-time as a Data & AI Engineer. I completed my undergraduate studies at Manipal Institute of Technology with a major in Electronics & Communication Engineering and a minor in Data Science.

What drives me is building systems that don’t just work, but learn, adapt, and scale. I love automating repetitive tasks, designing LLM-powered assistants, and creating tools that support better decision-making. Whether it’s optimizing retail pricing, enhancing educational platforms, or accelerating developer workflows, I enjoy applying AI to create meaningful impact.

Let’s connect. I’d love to hear what you’re working on!

What I Do: The Full Stack Flow

From ingestion to intelligence: building systems end to end.

Layer 0

Ingestion & ETL

Designing reliable pipelines to ingest, clean, and unify data from multiple structured and unstructured sources.

Layer 1

Processing & Analytics

Transforming raw data into meaningful features, metrics, and insights that drive downstream modeling.

Layer 2

Modeling & Intelligence

Training machine learning models to predict, classify, and forecast real-world outcomes.

Layer 3

GenAI & Agentic Systems

Building LLM-powered applications, RAG pipelines, and agents that reason, act, and automate workflows.

Skills

Programming Foundations

Python, SQL, R

LLM & AI Systems

RAG, Hybrid Search (BM25 + Vector), Embeddings, Cross-Encoder Reranking, Context-Aware Chunking, LangChain, LangGraph, Semantic Kernel, Azure AI Search, MCP, Ollama, Hugging Face

Data Engineering & Platforms

Apache Spark, PySpark, ETL / ELT Pipelines, Apache Airflow, Azure Data Factory, AWS Glue, Data Modeling (Fact / Dim), Data Validation & Logging, Feature Engineering

Machine Learning & MLOps

Model Deployment, Docker, CI / CD, Regression, Clustering, Time Series Forecasting, Recommender Systems, Computer Vision, Deep Learning, TensorFlow, PyTorch

Cloud, Storage & Analytics

AWS (S3, EC2, Redshift), Azure (Azure ML), Postgres, Amazon Aurora, OracleDB, Hadoop, Tableau, Power BI

Backend & APIs

FastAPI, Flask, REST APIs, Git / GitHub

Education

Indiana University Bloomington, USA

MS in Data Science (Aug 2024 – May 2026)

GPA: 4/4

Manipal Institute of Technology, India

B.Tech in Electronics & Communication Engineering

Minor in Data Science (Jul 2017 – May 2021)

Experience

Indiana University Bloomington

Part-Time ML Engineer (Oct 2024 – Present)

Indiana University Campus Auxiliaries

AI Engineer Intern (Jun 2025 – Aug 2025)

Deloitte

Data Engineer & Scientist (Sep 2021 – Jul 2024)

Samsung Research

Computer Vision Research Intern (Jan 2021 – Jun 2021)

Projects & Publications

Publications

Star identification in night sky images using mobile phone camera

View Paper

Cross-Geography Generalization of Machine Learning Methods for Classification of Flooded Regions in Aerial Images

View Paper

AI / ML

TA-Lite

A modular, instructor-aligned teaching assistant powered by LLMs.

View on GitHub View Demo

MockChain

Multi-agent AI-powered mock interview platform with personalized feedback.

View on GitHub View Demo

DataLens

LLM-powered RAG search and Q&A for structured and unstructured data. (Deloitte Internal Project)

View Certificate

Online Sign Recognition

Time-series handwriting recognition and fraud detection.

View Report

Music Emotion Recognition

Classifies music by emotion using ML.

View Report

Flight Price Prediction

Collected data and trained a model to predict flight prices.

View on GitHub

Data Science, Engineering & Visualization

Retail Sales Price Optimization

Used price elasticity modeling to recommend optimal pricing in retail.

View on GitHub

Retail Store Pricing Dashboard

Dashboard for visualizing product and city trends in retail pricing.

View Dashboard View on GitHub

CodETL

ETL engine with rule-based code generation. Cut dev time by 40%. (Deloitte Internal Project)

Analyzing & Visualizing Google Analytics Data for HRA UI

Built an interactive dashboard using Google Analytics data to visualize the user journey. It mapped how users interacted with the interface, highlighted the most used features, and tracked adoption of new functionalities. This helped identify which features were driving engagement and which needed improvement.

View Dashboard

Covid Dashboard

Dashboard that scraped and visualized India's COVID-19 stats.

View on GitHub

Computer Vision

Drowsy Driver Assistant

Created a computer vision–based system to detect driver drowsiness using in-car cameras. The system issued real-time alerts, and in severe cases, initiated autonomous vehicle control to safely guide the car to the shoulder and halt, while sending an emergency alert.

View Poster View Demo

Astrophotography

Collaborated with Samsung R&D to develop a system that enhances night sky images captured on mobile devices. Improved signal-to-noise ratio to deliver clearer visuals and more accurate star detection.

View Paper

Estimation of Flooded Regions

Cross-region flood segmentation using UAV aerial imagery.

View Paper

Virtual Mouse using Python-OpenCV

Gesture-based virtual mouse using OpenCV hand tracking.

View on GitHub

Image to Text using CNN

Converts handwritten text to digital text using OpenCV and CNN prediction.

View on GitHub

Miscellaneous

Streamlit YouTube Playlist

Created a series of videos explaining how to use Streamlit. Gathered over 210k views and 13.5k hours of watch time.

View on YouTube

Sneaker Update Bot

Created a real-time Discord bot that alerted users about sneaker drops and restocks. It supported a client’s resale business, generating $300K in sales across 1,000+ pairs.

View on GitHub

Covid Vaccine Appointment Bot

Alerts users to vaccine appointment slots found by scraping availability by zip code.

Honors & Certifications

Certifications

Databricks Fundamentals

IBM Professional Certification

Deloitte AI Academy

APEX Certificate

Tableau

Honors

Samsung Excellence Award

Deloitte Outstanding Award

Deloitte Applause Award - QSR Client

Deloitte Applause Award - Insurance Client

Deloitte Applause Award - Hackathon

APEX Training - Best Presentation

Feel free to reach out via email or connect on LinkedIn.