Mayank Kantharia

Data Scientist & ML Engineer

Building intelligent systems with machine learning, NLP, and cloud engineering.

About Me

A quick look at my journey and what I love building.

I’m a Data Scientist & ML Engineer based in Melbourne, focused on building intelligent, end-to-end data and machine learning solutions. I recently completed my Master of Data Science at RMIT University, where I specialised in machine learning, NLP, cloud systems, and applied analytics.

My background in Computer Engineering gives me a strong foundation in programming, backend development, and database systems. This helps me approach problems with both analytical depth and engineering discipline.

I’ve worked on projects involving RAG systems, NLP pipelines, cloud-deployed applications, deep learning models, and large-scale data analysis. I enjoy designing solutions that move beyond experimentation into real-world impact, whether it’s deploying ML services, analysing complex datasets, or building scalable cloud architectures.

I’m driven by curiosity, continuous learning, and the challenge of turning raw data into meaningful intelligence. My goal is to build systems that are accurate, reliable, scalable, and genuinely useful.

My Technical Arsenal

Technologies I use to design, deploy, and scale machine learning systems.

Machine Learning & AI

Python, TensorFlow, PyTorch, Scikit-learn, Pandas, NumPy, Matplotlib, Seaborn

NLP & Information Retrieval

Transformers, FastText, OpenAI API, pgvector

Data Engineering & Cloud

AWS, EC2, S3, Lambda, API Gateway, Docker, Supabase

Databases

PostgreSQL, MySQL, MongoDB

Backend & APIs

FastAPI, Flask, Django, REST APIs

Tools & Platforms

Git, GitHub, Jupyter, VS Code

Featured Internships

Industry experiences where I owned high-impact deliverables across data, ML, and analytics teams.


Data Scientist Intern

Jul-Nov 2025 | TryPath | AI Development

Designed and deployed a RAG-based AI career guidance system using OpenAI GPT, Supabase pgvector, and FastAPI. Built retrieval pipelines, ranking logic, and calendar‑aware scheduling workflows to deliver personalised role recommendations and timeline‑driven roadmaps.
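
As a rough illustration of the ranking idea only (this is not the TryPath code), the sketch below scores candidate items against a query embedding with cosine similarity and keeps the top matches. The function names, the toy three-dimensional vectors, and the role labels are all hypothetical; in a pgvector setup the similarity search would normally run inside the database rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, docs, top_k=3):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
role_embeddings = {
    "data_scientist": [0.9, 0.1, 0.0],
    "backend_engineer": [0.1, 0.9, 0.2],
    "ml_engineer": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
top_roles = rank_documents(query, role_embeddings, top_k=2)
```

The retrieved items would then be passed to the language model as context; any re-ranking or filtering on top of the similarity score is a separate design choice.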

RAG, FastAPI, OpenAI, Supabase, pgvector, Adzuna API, Next.js

Backend Developer Intern

Jun-Nov 2022 | Pratishtha | Festival Tech

Developed a responsive website and mobile app for a college fest using Flutter and Firebase. Implemented backend logic, real‑time updates, and cross‑device compatibility to enhance event engagement and user experience.

Flutter, Firebase, HTML, CSS, JavaScript, PHP, Bootstrap

Flutter Developer Intern

Nov 2021 - Apr 2022 | HireBus | Product Launch

Led UI/UX development and API integration for a mobile application built with Flutter and Firebase. Delivered smooth navigation, clean UI components, and scalable backend connectivity to support early‑stage product rollout.

Flutter, Firebase, UI/UX, GitHub

Featured Projects

Resume-aligned project portfolio across cloud engineering, NLP, network analytics, and deep learning research.

Interactive Economic Insights Dashboard

Data Visualization & Analytics | RMIT

Delivered a Shiny-based analytics experience that turns raw economic data into navigable, stakeholder-friendly insights.

  • Built an interactive dashboard analyzing global economic trends with drill-down regional and sector exploration
  • Designed clear, accessible visualizations for non-technical stakeholders
  • Structured and transformed raw datasets for efficient, user-driven reporting and analysis
  • Designed geospatial maps and time-series visualizations with drill-down country profiles
  • Cleaned, standardized, and merged multi-source datasets to ensure consistent indicators
R, Shiny, ggplot2, dplyr

Cloud-Based Music Subscription Web Application

Cloud Engineering | RMIT

Deployed a full-stack music subscription platform on AWS, combining compute, storage, serverless APIs, and NoSQL data design.

  • Deployed an EC2 Ubuntu server with Apache2 and public DNS access
  • Designed DynamoDB schemas for login, music, and subscription data
  • Built backend logic for registration, subscription management, and music queries
  • Integrated API Gateway and Lambda for REST-based workflows
  • Automated artist image retrieval and secure storage in S3
AWS EC2, S3, DynamoDB, API Gateway, Lambda, Ubuntu, Apache2

Russia-Ukraine War: Social Media & Network Analysis

NLP & Network Analytics | RMIT

Analyzed cross-platform discourse on Reddit and YouTube using NLP, topic modeling, and network science to study narrative spread.

  • Collected and preprocessed 90,000+ Reddit and YouTube comments
  • Built an NLP pipeline with VADER sentiment analysis and LDA topic modeling
  • Detected platform-level divergence in community structure (YouTube: modularity Q=0.895, 600 communities; Reddit: Q=0.571, 237 communities)
  • Modeled information diffusion with the Independent Cascade algorithm
  • Identified centralized YouTube hubs amplifying leader-focused narratives
Python, PRAW, YouTube API, VADER, LDA, NetworkX, Louvain
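
For readers unfamiliar with the Independent Cascade model mentioned above, here is a minimal pure-Python sketch of it, not the project code. The graph, seed set, and single shared activation probability are hypothetical simplifications; in the model, each newly activated node gets exactly one chance to activate each of its inactive neighbours.

```python
import random


def independent_cascade(graph, seeds, prob, rng):
    """Simulate one Independent Cascade run and return the set of activated nodes.

    graph: dict mapping each node to a list of its out-neighbours.
    prob:  activation probability applied to every edge (a simplifying assumption).
    rng:   a random.Random instance, passed in so runs are reproducible.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # One activation attempt per (newly active node, inactive neighbour) pair.
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, prob, runs=200, seed=0):
    """Average number of activated nodes over repeated simulations."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(graph, seeds, prob, rng)) for _ in range(runs))
    return total / runs


# Hypothetical toy reply graph: an edge u -> v means v can be reached from u.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}
full_reach = independent_cascade(toy_graph, ["a"], 1.0, random.Random(1))
no_spread = independent_cascade(toy_graph, ["a"], 0.0, random.Random(1))
```

Averaging many runs, as `estimate_spread` does, gives an expected reach for a seed set, which is one way to compare how far narratives travel from different starting accounts.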

NLP Clothing Review Classification & Web Application

NLP + Applied ML | RMIT

Built and deployed NLP classification pipelines and a Flask web app for real-time review-based recommendations.

  • Engineered five feature representations (BoW, TF-IDF, embeddings, title-only, combined)
  • Title+Review features achieved 88.97% mean accuracy across 5-fold CV
  • Benchmarked Logistic Regression, FastText, and deep learning models
  • Combined features outperformed single-input baselines by up to 2.3%
  • Deployed the best model in a Flask app with prediction, search, and review submission
TensorFlow, FastText, TF-IDF, Logistic Regression, Flask, Python

Few-Shot Hyperspectral Image Classification

Deep Learning Research | IIT Bombay (Final Year Project)

Advanced few-shot hyperspectral image classification using attention and temperature scaling to improve generalization.

  • Beat published benchmarks with 99.55% overall accuracy vs 98.16% baseline
  • Validated attention as the key driver via ablation studies
  • Temperature scaling alone reached 97.21%, below the 98.16% baseline
  • Performed spectral-spatial preprocessing and feature extraction
  • Evaluated across multiple HSI benchmarks for generalization
PyTorch, TensorFlow, Few-Shot Learning, Attention, HSI, Deep Learning
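
As background on the temperature scaling mentioned above, the sketch below shows its most common form: dividing logits by a temperature T before the softmax. How temperature was applied in this project is not described here, so treat this as a generic illustration with hypothetical function and variable names.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a positive temperature.

    temperature > 1 flattens the distribution; temperature < 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class scores for one query sample.
example_logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(example_logits, temperature=0.5)
soft = softmax_with_temperature(example_logits, temperature=2.0)
```

The predicted class (the argmax) does not change with T; only the confidence assigned to it does, which is why temperature mainly affects calibration and the loss signal during training.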

Telco Customer Churn Prediction Analysis

ML + Business Analytics | Personal Project

Built an end-to-end churn prediction pipeline on IBM Telco data with modeling insights tied to retention actions.

  • Worked with 7,043 rows and 21 features to surface key churn drivers
  • Handled class imbalance with stratified sampling (5,000 records) and stratified CV
  • Set up preprocessing with scaling, one-hot encoding, and ANOVA F-test feature selection
  • Compared five ML models and one deep learning model; Logistic Regression reached 0.836 ROC-AUC
  • Produced a reproducible analysis with separate EDA and modeling notebooks plus model comparison tables
Python, pandas, NumPy, scikit-learn, TensorFlow, Matplotlib, Seaborn, Jupyter

Get In Touch

Let's Connect

Data Science graduate with a focus on ML, NLP, and data engineering. Let’s connect and build something meaningful.
