Hey, I am Deep Shelke

Data Analyst / Data Engineer

3+ years of experience

20+ projects

About Me

I am a Data Engineer with 3+ years of experience building cloud-based data pipelines across healthcare, education, and analytics environments. I specialize in delivering production-ready data workflows, designing compliant data models, and implementing both batch and streaming pipelines using serverless architectures and modern reporting systems.

Throughout my career, I have delivered 10+ production data workflows, working extensively with Snowflake, Apache Airflow, Apache Kafka, AWS services (S3, Lambda, Glue, CloudWatch), and Python-based ETL/ELT processes. My expertise ranges from real-time data ingestion to AI-ready data pipelines built on vector storage and retrieval workflows.

I hold a Master of Science in Software Engineering from Stevens Institute of Technology, along with the AWS Certified Cloud Practitioner and dbt Fundamentals certifications. My passion lies in architecting scalable data solutions that transform raw data into actionable insights, enabling data-driven decision-making across organizations.

Education & Experience

Bachelor of Engineering in Information Technology

Datta Meghe College of Engineering affiliated with University of Mumbai

Aug 2019 - Jun 2023

GPA: 8.25/10

Data Analyst

Meta Systems

Oct 2021 - Aug 2023

Dataset analysis, optimized SQL queries, performance reports, ETL pipeline administration

Master of Science in Software Engineering

Stevens Institute of Technology

Sep 2023 - May 2025

GPA: 4.0/4.0

Graduate Student Assistant

Stevens Institute of Technology

May 2024 - May 2025

Academic/operational reports, CRM/content management, data validation

Data Engineer

Curanostics

Jun 2025 - Present

ETL data models, real-time ingestion pipelines, AI-ready data pipelines

Technical Skills

Programming & Scripting

Python

SQL

Pandas

NumPy

SciPy

Data Engineering & ETL

Apache Airflow

Docker

CI/CD

dbt

AWS Glue

Data Warehousing

Snowflake

Azure Synapse Analytics

Azure Data Lake

Big Data & Streaming

Apache Kafka

Apache Spark

Databases

PostgreSQL

Firestore

MongoDB

Pinecone

Visualization & BI

Power BI

Tableau

Matplotlib

Seaborn

Plotly

Alteryx

Analytics & Modeling

Scikit-learn

Regression & Hypothesis Testing

Feature Engineering & Validation

Version Control & Collaboration

Git/GitHub

JIRA/Confluence

Cloud & Platforms

Azure

AWS

AWS Lambda

AWS S3

AWS Bedrock

Portfolio

Fairly - Domestic Worker Rights Chat Application

Winning project at the Rutgers Hackathon: a modern full-stack chat application built with Next.js and FastAPI that answers questions about domestic worker rights using Retrieval-Augmented Generation (RAG). The system processes government documents and provides accurate, context-aware answers based on legal regulations. It features a three-layer RAG architecture with Google Gemini 2.0 Flash Exp, Pinecone vector search, and jurisdiction-aware responses.

Data Engineering • Gen AI

Next.js, FastAPI, Google Gemini 2.0 Flash Exp, Pinecone, MongoDB, Python, TypeScript, RAG

IMDb Data Engineering & Analytics Project

Complete end-to-end data pipeline and BI dashboard built on IMDb data. Automated the ETL pipeline with AWS Lambda (Docker), used Alteryx for advanced filtering and joins, and built interactive visualizations in Tableau Cloud. Processed IMDb datasets (.tsv) to identify the top 1,000 movies (rating > 8.0, minimum 1,000 votes), extracted the most frequent directors, and created a YouTube-themed dashboard with genre analysis, runtime distribution, and rating trends.

Data Analytics • Data Engineering

Python, Pandas, PyArrow, AWS Lambda, Docker, Amazon S3, AWS Glue Data Catalog, Alteryx Designer, Tableau Cloud
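
A minimal sketch of the rating and vote filter described for this project, not the project's actual code. It assumes the column names of IMDb's public title.ratings.tsv file (tconst, averageRating, numVotes); the function name top_rated, the sample rows, and the sort order used to rank titles are illustrative assumptions.

```python
import pandas as pd


def top_rated(ratings: pd.DataFrame,
              min_rating: float = 8.0,
              min_votes: int = 1000,
              limit: int = 1000) -> pd.DataFrame:
    """Keep titles rated above min_rating with at least min_votes votes."""
    mask = (ratings["averageRating"] > min_rating) & (ratings["numVotes"] >= min_votes)
    return (
        ratings[mask]
        # Assumed ranking: best rating first, vote count as the tie-breaker.
        .sort_values(["averageRating", "numVotes"], ascending=False)
        .head(limit)
        .reset_index(drop=True)
    )


# Hypothetical sample rows standing in for the real .tsv file.
sample = pd.DataFrame({
    "tconst": ["tt0000001", "tt0000002", "tt0000003", "tt0000004"],
    "averageRating": [9.1, 8.0, 8.5, 8.7],
    "numVotes": [250000, 90000, 999, 1000],
})
result = top_rated(sample)
```

With the real dataset, the sample frame would be replaced by something like pd.read_csv(path, sep="\t"), where path points to the downloaded ratings file.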

YouTube Data Engineering Pipeline

Built a complete end-to-end cloud-native data pipeline using AWS services to process, clean, and visualize YouTube video engagement data. Implemented serverless architecture with Lambda for transformation, Glue for ETL and cataloging, and QuickSight for interactive dashboards. Delivers insights on top-viewed channels, engagement factors, and category-wise performance.

Data Engineering

AWS S3, AWS Lambda, AWS Glue, AWS Athena, Amazon QuickSight, Docker, Python, PyArrow, Pandas

Hack-A-Holiday - Serverless Data & AI Platform

An intelligent travel planning platform that creates personalized itineraries using AWS Bedrock. Features ChatGPT-like conversation management, real-time flight and hotel integration via RapidAPI, and context-aware travel assistance. Built with a Next.js frontend, an Express.js backend, AWS services including DynamoDB and Bedrock Nova models, and Firebase OAuth.

Gen AI

Next.js, Express.js, AWS Bedrock, AWS DynamoDB, Firebase OAuth, RapidAPI, TypeScript, React

Airbnb End-to-End Data Engineering Pipeline

Implemented a complete end-to-end data engineering pipeline using Medallion Architecture (Bronze → Silver → Gold) with Snowflake, dbt, and AWS. Features incremental loading, SCD Type 2 snapshots, custom macros, and analytics-ready datasets for Airbnb listing, booking, and host data.

Data Engineering

Snowflake, dbt, AWS S3, Python, SQL
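
To illustrate the SCD Type 2 idea mentioned above, here is a small Python sketch. In the project this versioning is handled by dbt snapshots in SQL, so the code below is a stand-in written for illustration only; the function name apply_scd2, the valid_from / valid_to column names, and the host_id / is_superhost sample fields are all hypothetical.

```python
from datetime import date


def apply_scd2(history, incoming, key, tracked, as_of):
    """Return a new history list with SCD Type 2 versioning applied.

    history  -- list of dicts with valid_from / valid_to (None = current)
    incoming -- list of dicts holding the latest source values
    key      -- name of the business key column
    tracked  -- columns whose changes should create a new version
    as_of    -- date stamped on closed and newly opened versions
    """
    result = [dict(row) for row in history]  # copy so the input is not mutated
    open_rows = {row[key]: row for row in result if row["valid_to"] is None}

    for record in incoming:
        current = open_rows.get(record[key])
        if current is not None and all(current[c] == record[c] for c in tracked):
            continue  # no tracked column changed; keep the current version
        if current is not None:
            current["valid_to"] = as_of  # close the superseded version
        new_row = {key: record[key], "valid_from": as_of, "valid_to": None}
        new_row.update({c: record[c] for c in tracked})
        result.append(new_row)
        open_rows[record[key]] = new_row
    return result


# Hypothetical sample data: one existing host version and two incoming rows.
history = [
    {"host_id": 1, "is_superhost": False,
     "valid_from": date(2024, 1, 1), "valid_to": None},
]
incoming = [
    {"host_id": 1, "is_superhost": True},   # changed: new version expected
    {"host_id": 2, "is_superhost": False},  # new key: first version expected
]
updated = apply_scd2(history, incoming, "host_id", ["is_superhost"], date(2024, 6, 1))
```

The old row is closed rather than overwritten, so queries can still reconstruct what a host looked like on any past date.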

Melanoma Skin Cancer Detection Using Deep Learning

End-to-end solution for early melanoma detection using deep learning. Built and deployed multiple Convolutional Neural Network (CNN) models with frontend web apps for clinical usability. Processed dermatoscopic images from ISIC Archive with preprocessing (hair removal, augmentation), evaluated custom CNN and ResNet50 architectures, and deployed via Flask-based responsive web interfaces. Includes comprehensive evaluation metrics (accuracy, precision, recall, confusion matrix) and comparative analysis.

Machine Learning/Deep Learning

Python, TensorFlow, Keras, OpenCV, Flask, HTML, CSS, Bootstrap, CNN, ResNet50

Publications & Certifications

Amazon Web Services
Certification

AWS Certified Cloud Practitioner

2024

Validated cloud expertise and understanding of AWS services

dbt Labs
Certification

dbt Fundamentals

2024

Mastered data transformation and analytics engineering with dbt

Bloomberg
Certification

Bloomberg Market Concepts

2023

Comprehensive understanding of financial markets and economic concepts

Springer
Publication

Comparison of Deep Learning Algorithms for Early Detection of Melanoma

2024

Research on deep learning algorithms for early melanoma detection using convolutional neural networks

ICIECCA 2023
Publication

A Comparative Study of Melanoma Images Using CNN And ResNet 50

2023

Presented at the 2nd International Conference on Inventive Electronics, Computing and Communication Applications (ICIECCA 2023). Comparative analysis of CNN and ResNet50 architectures for melanoma image classification.

IJRASET
Publication

Stock Price Prediction Using ARIMA Forecasting and LSTM Based Forecasting, Competitive Analysis

2022

Published in International Journal for Research in Applied Science & Engineering Technology (Volume 10, Issue XI, November 2022). Comparative analysis of ARIMA and LSTM models for stock price prediction.

IJSREM
Publication

Crime Evidence Over Blockchain

2023

Published in International Journal of Scientific Research in Engineering & Management (Volume 07, Issue 04, April 2023). Research on blockchain technology for secure crime evidence management.

Contact Me

Recent Graduate Seeking Opportunities in the US - Let's Connect!