
About Me

I am a Data Engineer with 3+ years of experience building cloud-based data pipelines across healthcare, education, and analytics environments. I specialize in delivering production-ready data workflows, designing compliant data models, and implementing both batch and streaming pipelines using serverless architectures and modern reporting systems.
I have delivered 10+ production data workflows, working extensively with Snowflake, Apache Airflow, Apache Kafka, AWS services (S3, Lambda, Glue, CloudWatch), and Python-based ETL/ELT processes. My experience ranges from real-time data ingestion pipelines to AI-ready data pipelines built on vector storage and retrieval workflows.
I hold a Master of Science in Software Engineering from Stevens Institute of Technology, along with the AWS Certified Cloud Practitioner and dbt Fundamentals certifications. My passion lies in architecting scalable data solutions that transform raw data into actionable insights, enabling data-driven decision-making across organizations.
Education & Experience
Bachelor of Engineering in Information Technology
Datta Meghe College of Engineering affiliated with University of Mumbai
Aug 2019 - Jun 2023
GPA: 8.25/10
Data Analyst
Meta Systems
Oct 2021 - Aug 2023
Dataset analysis, optimized SQL queries, performance reports, ETL pipeline administration
Master of Science in Software Engineering
Stevens Institute of Technology
Sep 2023 - May 2025
GPA: 4.0/4.0
Graduate Student Assistant
Stevens Institute of Technology
May 2024 - May 2025
Academic/operational reports, CRM/content management, data validation
Data Engineer
Curanostics
Jun 2025 - Present
ETL data models, real-time ingestion pipelines, AI-ready data pipelines
Technical Skills
Programming & Scripting
Python
SQL
Pandas
NumPy
SciPy
Data Engineering & ETL
Apache Airflow
Docker
CI/CD
dbt
AWS Glue
Data Warehousing
Snowflake
Azure Synapse Analytics
Azure Data Lake
Big Data & Streaming
Apache Kafka
Apache Spark
Databases
PostgreSQL
Firestore
MongoDB
Pinecone
Visualization & BI
Power BI
Tableau
Matplotlib
Seaborn
Plotly
Alteryx
Analytics & Modeling
Scikit-learn
Regression & Hypothesis Testing
Feature Engineering & Validation
Version Control & Collaboration
Git/GitHub
JIRA/Confluence
Cloud & Platforms
Azure
AWS
AWS Lambda
AWS S3
AWS Bedrock
Portfolio

Fairly - Domestic Worker Rights Chat Application
Rutgers Hackathon winning project: a modern full-stack chat application built with Next.js and FastAPI that uses RAG (Retrieval-Augmented Generation) to answer questions about domestic worker rights. The system processes government documents and provides accurate, context-aware answers grounded in legal regulations. It features a 3-layer RAG architecture with Google Gemini 2.0 Flash Exp, Pinecone vector search, and jurisdiction-aware responses.
Data Engineering • Gen AI
Next.js, FastAPI, Google Gemini 2.0 Flash Exp, Pinecone, MongoDB, Python, TypeScript, RAG

IMDb Data Engineering & Analytics Project
End-to-end data pipeline and BI dashboard built on IMDb data. The automated ETL pipeline uses AWS Lambda (Docker), Alteryx for advanced filtering and joins, and Tableau Cloud for interactive visualizations. It processes IMDb datasets (.tsv) to identify the top 1000 movies (rating > 8.0, minimum 1000 votes), extracts the most frequent directors, and presents a YouTube-themed dashboard with genre analysis, runtime distribution, and rating trends.
Data Analytics • Data Engineering
Python, Pandas, PyArrow, AWS Lambda, Docker, Amazon S3, AWS Glue Data Catalog, Alteryx Designer, Tableau Cloud

YouTube Data Engineering Pipeline
Built a complete end-to-end cloud-native data pipeline using AWS services to process, clean, and visualize YouTube video engagement data. Implemented serverless architecture with Lambda for transformation, Glue for ETL and cataloging, and QuickSight for interactive dashboards. Delivers insights on top-viewed channels, engagement factors, and category-wise performance.
Data Engineering
AWS S3, AWS Lambda, AWS Glue, AWS Athena, Amazon QuickSight, Docker, Python, PyArrow, Pandas

Hack-A-Holiday - Serverless Data & AI Platform
An intelligent travel planning platform that creates personalized itineraries using AWS Bedrock AI. Features ChatGPT-like conversation management, real-time flight/hotel integration via RapidAPI, and comprehensive travel assistance with context-aware responses. Built with a Next.js frontend, an Express.js backend, and AWS services including DynamoDB and Bedrock Nova models, plus Firebase OAuth.
Gen AI
Next.js, Express.js, AWS Bedrock, AWS DynamoDB, Firebase OAuth, RapidAPI, TypeScript, React

Airbnb End-to-End Data Engineering Pipeline
Implemented a complete end-to-end data engineering pipeline using Medallion Architecture (Bronze → Silver → Gold) with Snowflake, dbt, and AWS. Features incremental loading, SCD Type 2 snapshots, custom macros, and analytics-ready datasets for Airbnb listings, bookings, and hosts data.
Data Engineering
Snowflake, dbt, AWS S3, Python, SQL

Melanoma Skin Cancer Detection Using Deep Learning
End-to-end solution for early melanoma detection using deep learning. Built and deployed multiple Convolutional Neural Network (CNN) models with frontend web apps for clinical usability. Processed dermatoscopic images from the ISIC Archive with preprocessing (hair removal, augmentation), evaluated custom CNN and ResNet50 architectures, and deployed them via Flask-based responsive web interfaces. Includes evaluation metrics (accuracy, precision, recall, confusion matrix) and a comparative analysis of the models.
Machine Learning/Deep Learning
Python, TensorFlow, Keras, OpenCV, Flask, HTML, CSS, Bootstrap, CNN, ResNet50
Publications & Certifications
AWS Certified Cloud Practitioner
Amazon Web Services
2024
Validated cloud expertise and understanding of AWS services
dbt Fundamentals
dbt Labs
2024
Mastered data transformation and analytics engineering with dbt
Bloomberg Market Concepts
Bloomberg
2023
Comprehensive understanding of financial markets and economic concepts
Comparison of Deep Learning Algorithms for Early Detection of Melanoma
Springer
2024
Research on deep learning algorithms for early melanoma detection using convolutional neural networks
A Comparative Study of Melanoma Images Using CNN And ResNet 50
ICIECCA 2023
2023
Presented at the 2nd International Conference on Inventive Electronics, Computing and Communication Applications (ICIECCA 2023). Comparative analysis of CNN and ResNet50 architectures for melanoma image classification.

Stock Price Prediction Using ARIMA Forecasting and LSTM Based Forecasting, Competitive Analysis
IJRASET
2022
Published in International Journal for Research in Applied Science & Engineering Technology (Volume 10, Issue XI, November 2022). Comparative analysis of ARIMA and LSTM models for stock price prediction.

Crime Evidence Over Blockchain
IJSREM
2023
Published in International Journal of Scientific Research in Engineering & Management (Volume 07, Issue 04, April 2023). Research on blockchain technology for secure crime evidence management.
Contact me
Recent Graduate Seeking Opportunities in the US - Let's Connect!