Koshik Debanath

Machine Learning & AI Research | Computer Vision | NLP

Rajshahi University of Engineering & Technology

koshik.debanath@gmail.com

Research Interests

Machine Learning, Deep Learning, Computer Vision, Natural Language Processing, Medical Image Analysis, Low-Resource Language Processing, Generative AI

About

Research background and academic profile

Education

B.Sc. in Computer Science and Engineering

Rajshahi University of Engineering & Technology

CGPA: 3.27/4.00 (2018-2023)

Research Focus

• Machine Learning & Deep Learning

• Computer Vision & Medical Imaging

• Natural Language Processing

• Low-Resource Language Processing

Background

I am a software engineer and researcher with expertise in machine learning, computer vision, and natural language processing. My work focuses on developing practical solutions for real-world problems, with experience in medical imaging analysis, low-resource language processing, and deep learning applications. I have contributed to multiple peer-reviewed publications and open-source projects in the field of artificial intelligence.

Technical Expertise

Programming Languages

Python (Expert), C/C++, Java, JavaScript, SQL, MATLAB

Machine Learning Frameworks

PyTorch, TensorFlow, Keras, Scikit-learn, LangChain, Transformers, OpenCV

Research Areas

Generative AI (LLMs, RAG, Fine-tuning), NLP, Computer Vision, Deep Learning, Bayesian Methods, Scientific ML, Explainable AI

Tools & Platforms

Git, Docker, FastAPI, Flask, Django, MLOps, Pinecone, MongoDB, MySQL

Publications

Peer-reviewed research in machine learning, computer vision, and natural language processing

Journal Articles


Under Review July 2025
Bayesian Physics-Informed Neural Networks for Parameter Inference and Uncertainty Quantification in Reaction-Diffusion Models of Wound Healing

Authors: K. Debanath, S. Aich, and A.Y. Srizon

Abstract: Predictive mathematical models of biological processes like wound healing are essential for quantitative understanding, but their clinical utility is often limited by a critical roadblock: uncertainty in their biophysical parameters. These parameters are difficult to measure directly and must be inferred from sparse, noisy data. This paper presents a Bayesian Physics-Informed Neural Network (BPINN) framework to address this challenge by performing robust parameter inference and principled uncertainty quantification. We frame the identification of unknown parameters in a coupled reaction-diffusion system for wound healing as a Bayesian inverse problem. By integrating sparse observational data with the governing physical laws within a variational inference framework, the BPINN learns the full posterior distributions of unknown model parameters. Our results show that the framework accurately infers key reaction parameters from a dataset comprising less than 0.01% of the full spatio-temporal domain. More importantly, the BPINN correctly diagnoses that the cell motility parameter is practically non-identifiable from the sparse data, a conclusion supported by the large posterior uncertainty it assigns. The model’s predictive uncertainty is well-calibrated, being highest in regions far from observations. This work establishes the dual value of BPINNs as a powerful computational tool: both for developing reliable, personalized biomechanical models through data-driven calibration, and for diagnosing parameter identifiability issues, a critical step towards building trustworthy models in computational medicine and systems biology.

Bayesian Physics-Informed Neural Networks, Reaction-Diffusion Systems, Scientific Machine Learning
Under Review 2026
Optimizing Semantic Retrieval for Bengali: A Comparative Analysis of Monolingual and Multilingual Embeddings with Matryoshka Representation Learning

Authors: K. Debanath and A.Y. Srizon

Submitted to: Transactions on Asian and Low-Resource Language Information Processing (TALLIP)

Abstract—Effective semantic retrieval remains the primary bottleneck for Bengali Retrieval-Augmented Generation (RAG) systems. While general-purpose multilingual models exist, they often lack the semantic alignment required for high-precision tasks. This paper presents a comparative study of three embedding architectures—monolingual (shihab17), distilled multilingual (distiluse), and paraphrase-focused (mpnet)—to identify the strongest retrieval foundation for Bengali. The models are fine-tuned on the BanglaRQA dataset with a composite objective combining Multiple Negatives Ranking Loss and Matryoshka Representation Learning (MRL). Results show that architectural choice is decisive: the fine-tuned mpnet model achieves an NDCG@10 of 0.8114 and statistically outperforms the monolingual baseline (p < 0.001). Beyond retrieval quality, efficiency analysis for low-resource deployment shows that MRL-trained MPNet preserves 96% of retrieval effectiveness at 128 dimensions, reducing storage cost by 83% without meaningful accuracy loss.

CCS Concepts: Computing methodologies → Natural language processing; Neural networks. Information systems → Retrieval models and ranking.

Bengali NLP, Semantic Retrieval, Monolingual vs Multilingual Embeddings, Matryoshka Representation Learning, Sentence Embeddings, RAG
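
The storage figure in the abstract can be checked with a small sketch of how a Matryoshka-trained embedding is usually shortened at query time: keep the leading dimensions and re-normalize. This is an illustration only, not the paper's code. It assumes a 768-dimensional full embedding (the abstract does not state the width), and the helper names and toy vectors are hypothetical.

```python
import math

FULL_DIM = 768       # assumed full embedding width (not stated in the abstract)
TRUNCATED_DIM = 128  # dimension reported in the abstract


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def storage_reduction(full_dim, truncated_dim):
    """Fraction of per-vector storage saved by truncation."""
    return 1.0 - truncated_dim / full_dim


# Toy vectors standing in for real sentence embeddings.
query = [math.sin(i * 0.1) for i in range(FULL_DIM)]
doc = [math.sin(i * 0.1 + 0.05) for i in range(FULL_DIM)]

q_small = truncate_embedding(query, TRUNCATED_DIM)
d_small = truncate_embedding(doc, TRUNCATED_DIM)

print(f"similarity at {TRUNCATED_DIM} dims: {cosine(q_small, d_small):.4f}")
print(f"storage saved: {storage_reduction(FULL_DIM, TRUNCATED_DIM):.0%}")
```

Under the 768-dimension assumption, 1 - 128/768 is about 0.83, which is consistent with the 83% storage reduction quoted above.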

Conference Publications


[Figures: Knee Injury Architecture; Model Comparison]
IEEE ICCIT 2023
Published: December 2023
An Attention-Based Deep Learning Approach to Knee Injury Classification from MRI Images

K. Debanath, A.F.M.M. Rahman, and M.A. Hossain

2023 26th International Conference on Computer and Information Technology (ICCIT), Cox's Bazar, Bangladesh, pp. 1-6

Abstract: Knee injuries, prevalent in athletic and aging populations, pose significant challenges to healthcare professionals due to their complex nature and the critical function of the knee joint. Early and accurate diagnosis is paramount to ensure effective treatment and minimize long-term complications. Traditional diagnostic methods, including physical examinations and imaging techniques like MRI, require expert interpretation and can sometimes be inconclusive. This study introduces an approach to knee injury classification using deep learning, leveraging convolutional neural networks (CNNs) with an attention mechanism. The work combines the feature extraction capabilities of CNNs with the feature refinement of an attention mechanism for binary and multi-class classification of knee MRI images, with the aim of accurately identifying specific knee injury types. In experiments on two comprehensive knee MRI datasets, our custom CNN model achieved 88% testing accuracy on Dataset-1 (binary classification) and 77% accuracy on Dataset-2 (multi-class classification). The attention-based CNN model achieved 100% accuracy on Dataset-1 (binary classification) and 91% accuracy on Dataset-2 (multi-class classification). This approach holds promise not only for enhancing diagnostic accuracy but also for reducing the time to diagnosis.

Computer Vision, Medical Imaging, Deep Learning, Attention Mechanism, CNN
[Figures: Bangla-LLama Architecture; Transformer Architecture]
ECCE 2025
Advancing Low-Resource NLP: Contextual Question Answering for Bengali Language Using Llama

Authors: K. Debanath, S. Aich, and A.Y. Srizon

Abstract—Natural language processing (NLP) has witnessed significant advancements in recent years, particularly in improving question-answering (QA) systems for well-resourced languages such as English. However, the development of such systems for low-resource languages, including Bengali, remains insufficiently explored. This study proposes an approach to developing a Bengali QA system utilizing the Llama-3.2-3B-Instruct model, leveraging transfer learning techniques on a synthetic dataset derived from the SQuAD 2.0 benchmark...

NLP, Bengali Language, Question Answering, Large Language Models, Llama Model, Fine-tuning
Conference Presentation
Presented at ECCE 2025 Conference. Video & Slides Available.

[Figures: Bengali Word Cloud; Confusion Matrix]
ECCE 2025
Distinguishing Between Formal and Colloquial: A Multilingual BERT Approach to Bengali Language Classification

Authors: S. Aich, K. Debanath, and A.Y. Srizon

Abstract—The Bengali language, rich in history and cultural significance, poses unique challenges in Natural Language Processing (NLP) due to its dual-register structure: Sadhu (formal) and Cholit (colloquial). These registers differ significantly in syntax, vocabulary, and usage, complicating tasks such as text classification, translation, and sentiment analysis...

Multilingual BERT, Text Classification, Bengali NLP, Fine-tuning, Sadhu and Cholit
Conference Presentation
ECCE 2025 Presentation Slides

Multilingual BERT approach for Bengali language classification

[Figure: Bot Activity Analysis]
NCIM-2025
Analyzing Bot Activity and Political Discourse in the 2024 U.S. Presidential Election: A Machine Learning Approach to Misinformation and Manipulation

Authors: K. Debanath, S. Aich, and A.Y. Srizon

Abstract—Social media has become a battleground for political discourse, with automated accounts (bots) playing a growing role in shaping public opinion and engagement. In the context of the 2024 U.S. Presidential Election, understanding bot activity is crucial for identifying potential misinformation and manipulation tactics...

Social Media Analysis, 2024 U.S. Presidential Election, Machine Learning, Political Discourse
Conference Presentation
NCIM 2025 Presentation Slides

Bot activity analysis in 2024 U.S. Presidential Election

NCIM-2025
Distinguishing Human-Written and AI-Generated Text: A Comprehensive Study Using Explainable Artificial Intelligence in Text Classification

Authors: S. Aich, K. Debanath, and A.Y. Srizon

Abstract: Enhancing interpretability without compromising accuracy is a critical challenge in text classification. This research explores the integration of Explainable Artificial Intelligence (XAI) techniques with advanced machine learning models, utilizing the Local Interpretable Model-Agnostic Explanations (LIME) framework to provide transparency. A fine-tuned BERT model achieved state-of-the-art performance, surpassing Random Forest and Sentence Embedding-based models with a perfect 100% accuracy (ROC-AUC score of 1.00). While Random Forest classifiers offered a solid baseline, they struggled with semantic nuances, underscoring the need for embedding-based approaches. The study highlights the inherent trade-off between interpretability and accuracy, demonstrating that while transformer-based models like BERT excel at capturing complex linguistic patterns, their "black-box" nature necessitates tools like LIME for explainability. By bridging this gap, the research contributes to the development of more transparent, reliable, and high-performing AI systems.

Explainable AI, LIME, Random Forest Classifier, BERT, Semantic Text Embeddings
Conference Presentation
NCIM 2025 Presentation Slides

Explainable AI for Human vs AI text classification

Research Visualization
LIME Output Illustrating the Explanation of a Human-Generated Text.
LIME Output Illustrating the Explanation of an AI-Generated Text.
Additional Conference Paper
[Figures: Physics-Informed Neural Network Architecture; Model Performance Results]
Springer LNNS (BIM 2025) Published: February 2026
Physics-Informed Neural Networks for Real-Time Anomaly Detection in Power System Dynamics

Authors: K. Debanath, S. Aich, and A.Y. Srizon

Proceedings of the 3rd International Conference on Big Data, IoT and Machine Learning (BIM 2025), Lecture Notes in Networks and Systems, vol. 1798, pp. 447–457.

Abstract—The stability of modern power grids is increasingly challenged by dynamic disturbances, while conventional data-driven models often require extensive labeled fault data and offer limited interpretability. This work proposes a Physics-Informed Neural Network (PINN) framework for real-time anomaly detection by embedding the swing equation directly into the model loss. To address oscillatory training instability, the method integrates Fourier Feature Mapping with loss weight annealing. Trained only on sparse normal-operation data, the model reconstructs system states accurately and uses the physics residual as an interpretable anomaly signal. Experiments on a Single-Machine Infinite Bus system show instantaneous and precise fault detection, with clearer and more interpretable signals than an LSTM baseline.

Physics-Informed Neural Networks, Anomaly Detection, Power Systems, Real-Time Monitoring, Deep Learning
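
The idea of using the physics residual as an anomaly signal can be illustrated without a neural network. The sketch below is not the paper's PINN: it simulates a swing equation of the form M·δ'' + D·δ' = P_m − P_max·sin δ and evaluates the residual with finite differences, which stand in for automatic differentiation of a trained model. All parameter values, the fault model (a temporary drop in P_max), and the threshold are hypothetical.

```python
import math

# Hypothetical SMIB parameters (per-unit); the paper's values are not given here.
M, D = 0.1, 0.05          # inertia and damping coefficients
P_M, P_MAX = 0.8, 1.0     # mechanical power and nominal peak electrical power
P_MAX_FAULT = 0.2         # assumed reduced transfer capability during the fault
DT, N_STEPS = 1e-3, 3000
FAULT_STEPS = range(1000, 1100)


def simulate():
    """Integrate M*d2(delta)/dt2 + D*d(delta)/dt = P_M - p_max*sin(delta)."""
    delta = math.asin(P_M / P_MAX) + 0.1   # start slightly off equilibrium
    omega = 0.0
    trajectory = [delta]
    for step in range(N_STEPS):
        p_max = P_MAX_FAULT if step in FAULT_STEPS else P_MAX
        accel = (P_M - p_max * math.sin(delta) - D * omega) / M
        omega += DT * accel                # semi-implicit Euler update
        delta += DT * omega
        trajectory.append(delta)
    return trajectory


def physics_residual(traj, i):
    """Swing-equation residual at sample i, evaluated with the nominal model."""
    d2 = (traj[i + 1] - 2.0 * traj[i] + traj[i - 1]) / DT**2
    d1 = (traj[i + 1] - traj[i - 1]) / (2.0 * DT)
    return M * d2 + D * d1 - P_M + P_MAX * math.sin(traj[i])


trajectory = simulate()
THRESHOLD = 0.1  # illustrative cut-off, not a tuned value
flagged = [i for i in range(1, N_STEPS)
           if abs(physics_residual(trajectory, i)) > THRESHOLD]
print(f"first flagged step: {flagged[0]}, last flagged step: {flagged[-1]}")
```

While the system follows the nominal equation the residual stays near zero, and it jumps as soon as the dynamics stop matching the embedded physics. This is why no labeled fault data is needed to define the signal.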
Conference Presentation
Presented at BIM 2025 Conference. Video & Slides Available.

Research Areas

Computer Vision

Medical image analysis, object detection, image classification, and attention mechanisms for visual understanding.

Natural Language Processing

Low-resource language processing, multilingual models, and language classification for Bengali and other languages.

Machine Learning

Deep learning, neural networks, and AI applications in healthcare, finance, and sports analytics.

Research Analytics

Multi-dimensional analysis of research domains and impact

Research Domains Analysis

Comprehensive overview showing the distribution of research across different domains, including publications, citations, and impact scores.

Research Domains Radar Chart

Professional Experience

Software Engineer I
Universal Machine Inc. (Sunnyvale, CA, USA - Remote)

April 2025 - Present

  • YouTube Live Stream Bot: Developed a Chrome extension that automates YouTube Live chat using JavaScript, Chrome APIs, and asynchronous requests. Integrated the YouTube and OpenAI APIs for real-time chat fetching and posting and for AI response generation. Engineered conversational-history management and prompt engineering to preserve context and recall. Implemented secure Google OAuth and robust error handling for external APIs.
  • cBORG DAO Governance Platform: Built a full-stack decentralized governance platform using React/Next.js, FastAPI, PostgreSQL, and Ethereum smart contracts for community proposal voting and treasury management. Integrated OpenAI GPT-4o to automatically parse natural-language chat messages into structured trading proposals with confidence scoring and real-time voting. Implemented SIWE wallet linking with nonce-based authentication, JWT tokens, and privacy-preserving user identity management.
Data Scientist
Manaknightdigital Inc. (Toronto, ON, Canada - Remote)

March 2023 - April 2025

  • Chatbot Development: Collected and processed product information using Excel, pandas, and openpyxl. Integrated GPT-4 to respond to user queries and manage token size limitations. Utilized libraries like nltk, sklearn, and Flask for deploying the chatbot.
  • AI-driven Fraud Detection: Performed EDA and feature extraction on transaction datasets. Developed and optimized ML models including XGBoost, SVC, and Logistic Regression. Achieved 90% accuracy in detecting fraudulent transactions and deployed the system using Flask.
  • Data-driven ChatBot for Financial Queries: Implemented RAG and Pinecone, enhancing data retrieval speed by 40%, enabling faster decision-making for lenders. Improved data retrieval accuracy by 25% using Cohere reranking, resulting in more precise financial advice. Applied Beautiful Soup and PyPDF2 for data scraping and processing.
  • Sports Data Analysis ChatBot: Scraped and analyzed football data to predict match outcomes. Integrated RAG and Pinecone for efficient data querying and vector database management. Employed Beautiful Soup and PyPDF2 for data collection, analyzing 2 million football data points to achieve 90% prediction accuracy in support of strategic betting decisions.
  • Custom Image Generation System: Developed an image generation platform using Stable Diffusion. Fine-tuned custom models to generate images based on user-defined presets. Used PyTorch and transformers for model training and deployment, and Docker for containerization.
  • AI-driven Data Matching System: Segmented organizational data using models such as Llama-2-7B, fine-tuned to extract sections and subsections. Applied cosine similarity to match the data to specific tenders and integrated GPT-4 to generate a rationale for each match. Increased successful tender submissions by 70%.
  • AI-Powered Collectible Authentication & Appraisal Platform: Trained deep learning models (PyTorch/TensorFlow, e.g., InceptionV3, ResNet50, CLIP) for image classification (authenticity) and similarity search. Engineered an efficient CLIP+FAISS image similarity system for large-scale appraisal lookups. Developed Flask/FastAPI APIs to serve model predictions (classification, similarity, appraisal). Designed a multi-modal tag identification system using Serverless (RunPod API), TF-IDF, and CLIP/FAISS similarity. Implemented asynchronous data pipelines for large-scale image and metadata ingestion from APIs. Developed a Streamlit web application for user image uploads and displaying similarity/appraisal results via API calls.

Research Projects

AI Investment Committee for Binance

Multi-agent AI system with specialized agents for cryptocurrency investment recommendations.

Stock Price Forecasting

LSTM models for stock price prediction in Bangladeshi and global markets.

AI vs Human Text Detector

Interactive web application to classify human-written vs AI-generated text.

DataSciencePilot (RAG System)

Chat-based interface for querying custom PDFs using Pinecone and LLaMA-2.

CVAnalyzerPro

AI tool for automatically scoring candidate CVs against job requirements.

UberRidePrediction

XGBoost model packaged as Python module for Uber fare prediction.

Pinecone Integration Suite

Authored and published two Python libraries to simplify data handling for RAG systems.

CaptionCraft

Web application to generate image captions using the Google Gemini Pro Vision API.

Market Price Prediction

Implemented and compared multiple time-series models to predict product prices.

Movie Recommendation

Implemented a KNN model using cosine similarity to recommend movies based on user input.
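
A minimal sketch of nearest-neighbour recommendation with cosine similarity is shown here. It is not the project's code: the titles, the four-genre feature vectors, and the `recommend` helper are all hypothetical stand-ins for whatever features the real model uses.

```python
import math

# Hypothetical feature vectors: (action, comedy, romance, sci-fi) scores.
MOVIES = {
    "Space Raiders": [0.9, 0.1, 0.0, 0.9],
    "Galaxy Patrol": [0.8, 0.2, 0.1, 0.8],
    "Love in Paris": [0.0, 0.3, 0.9, 0.0],
    "Laugh Factory": [0.1, 0.9, 0.2, 0.0],
    "Date Night": [0.0, 0.7, 0.8, 0.0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(title, k=2):
    """Return the k titles whose feature vectors are closest to `title`."""
    target = MOVIES[title]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in MOVIES.items()
        if other != title
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


print(recommend("Space Raiders"))  # ['Galaxy Patrol', 'Laugh Factory']
```

Cosine similarity ignores vector length, so a film with uniformly stronger genre scores is not favoured over one with the same genre mix.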

Potato Disease Classification

Built a CNN model achieving near-100% accuracy in classifying potato diseases from images.

Diabetes Prediction

Constructed an Artificial Neural Network with PyTorch to predict patient diabetes status.

Educational Tools & Interactive Demos

Interactive visualizations and educational tools for understanding complex concepts

SynthDetect Ultra: AI Image Forensics

Advanced forensic tool to distinguish authentic photographs from AI-generated images (GANs, Diffusion). Uses explainable, physics-inspired and statistical features to detect synthetic "fingerprints" invisible to the naked eye.

  • Physics & Optics: Chromatic Aberration, CFA correlation
  • Frequency Domain: Spectral Slope, High-Frequency Ratios
  • Statistical: Benford’s Law, Noise Residual Kurtosis
  • Pixel-Level: Error Level Analysis, Gradient Field anisotropy
  • Explainable radar charts & heatmaps
Forensics, Explainable AI, Image Analysis, Python, Streamlit
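
As an illustration of the Benford's Law feature listed above, the sketch below compares the first-digit distribution of a set of positive numbers with the Benford expectation P(d) = log10(1 + 1/d). Which image quantities the tool actually feeds into this test is not stated here, so the inputs are synthetic and the function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1.0 + 1.0 / d) for d in range(1, 10)}


def first_digit(value):
    """Leading non-zero decimal digit of a positive number."""
    return int(f"{value:.15e}"[0])


def benford_deviation(values):
    """Chi-square-style distance between observed and expected digit frequencies."""
    digits = [first_digit(v) for v in values if v > 0]
    counts = Counter(digits)
    total = len(digits)
    expected = benford_expected()
    return sum(
        (counts.get(d, 0) / total - p) ** 2 / p for d, p in expected.items()
    )


# Powers of two follow Benford's law closely; a uniform ramp does not.
benford_like = [2.0 ** k for k in range(1, 200)]
uniform_like = [float(v) for v in range(100, 1000)]

print(f"powers of two: {benford_deviation(benford_like):.4f}")
print(f"uniform ramp:  {benford_deviation(uniform_like):.4f}")
```

A larger deviation score means the digit statistics depart further from the Benford curve, which is the kind of signal a forensic feature would report.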
Interactive Convolution Visualizer

An interactive web-based tool to visualize and understand convolution operations in deep learning and image processing.

  • Interactive convolution visualization
  • Real-time parameter adjustment
  • Animated processing steps
  • Educational tooltips
Deep Learning, Visualization, JavaScript
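
The operation such a visualizer animates can be written in a few lines. The sketch below is a plain-Python illustration, not the tool's JavaScript source. It assumes that stride and zero padding are the adjustable parameters, and it implements cross-correlation, which is what deep learning libraries call convolution.

```python
def conv2d(image, kernel, stride=1, padding=0):
    """2D cross-correlation of `image` with `kernel`, with optional zero padding."""
    if padding:
        width = len(image[0]) + 2 * padding
        zero_row = [0] * width
        image = (
            [zero_row[:] for _ in range(padding)]
            + [[0] * padding + list(row) + [0] * padding for row in image]
            + [zero_row[:] for _ in range(padding)]
        )
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Slide the kernel to position (i, j) and sum the element-wise products.
            acc = 0
            for di in range(k_h):
                for dj in range(k_w):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
edge_kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

print(conv2d(image, edge_kernel))  # [[-6, 12], [-4, 7]]
print(conv2d(image, edge_kernel, padding=1))
```

With `padding=1` the output keeps the 4×4 size of the input, while the unpadded call shrinks it to 2×2.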
Attention Visualization Tool

Interactive tool to visualize attention mechanisms between sentences using various attention types and vector similarity measures.

  • Sentence-level attention heatmaps
  • Multiple attention types (Cosine, Dot Product)
  • Interactive matrix exploration
  • Automated insights & analysis
Attention Mechanism, NLP, Transformers, JavaScript
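
The two attention types named above can be sketched as follows. This is a plain-Python illustration rather than the tool's own code. The token vectors are made up, and the 1/sqrt(d) scaling of the dot-product scores follows the usual Transformer convention, which the tool may or may not apply.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_matrix(queries, keys, kind="dot"):
    """Row i holds the attention weights of query token i over all key tokens."""
    score = cosine if kind == "cosine" else dot
    scale = 1.0 if kind == "cosine" else math.sqrt(len(keys[0]))
    return [softmax([score(q, k) / scale for k in keys]) for q in queries]


# Hypothetical 3-dimensional token vectors for two short sentences.
sentence_a = {"cats": [1.0, 0.2, 0.0], "sleep": [0.0, 1.0, 0.3]}
sentence_b = {
    "kittens": [0.9, 0.3, 0.1],
    "nap": [0.1, 0.9, 0.4],
    "daily": [0.2, 0.1, 1.0],
}

weights = attention_matrix(
    list(sentence_a.values()), list(sentence_b.values()), kind="cosine"
)
for token, row in zip(sentence_a, weights):
    print(token, [round(w, 3) for w in row])
```

Each row of the matrix is one row of the heatmap: the weights sum to 1, and the largest entry marks the token in the second sentence that the query token attends to most.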
AI Companion Plugin

An AI assistant plugin for Obsidian that allows you to ask questions, get responses, and include page content as context.

  • Quick access with `/ai` command
  • Context-aware responses
  • Insert into notes
  • OpenAI integration
AI Assistant, Obsidian, OpenAI, TypeScript

Competitions & Achievements

Hackathon Champion at Machine Hack: Global Ranking 539 out of 8,861
Data Science Student Championship: Secured 7th position among 1,029 participants
LLM Hackathon: Ranked 5th out of 227 participants
Rental Bikes Volume Prediction: Ranked 3rd
House Prices Prediction: Ranked 24th out of 2,885 with 87% accuracy
Subscriber Prediction Talent Search: Ranked 26th out of 5,045 participants
Analytics Olympiad 2022: Ranked 82nd out of 1,029 participants
Data Science Student Championship - South Zone: Ranked 73rd out of 554 participants

Open Source Contributions

OpenLLMetry

Open-source observability framework for LLM applications

  • Resolved bug in Python data classes serialization (PR #2800)
  • Fixed TypeError in OpenAI embeddings metrics handler (PR #1836)
  • Added automated tests and improved stability
Pinecone Canopy

RAG framework and context engine for retrieval-augmented generation systems

Khoj AI

Open-source AI search and personal assistant platform

  • Added copy-to-clipboard support for references in the web app reference panel
  • Implemented markdown bullet-list formatting for copied references (notes, online, and code)
  • Contribution merged into master as PR #1144 with follow-up UX polish

Certifications & Professional Development

Understanding and Applying Text Embeddings

DeepLearning.AI - November 2024

A comprehensive short course on the end-to-end development of applications using text embeddings.

Key Topics:
  • Fundamentals of creating, understanding, and visualizing embedding spaces
  • Leveraging embeddings for practical applications like semantic search and retrieval
  • Building a complete Q&A system (Retrieval-Augmented Generation) using Google's Vertex AI

Get In Touch

Interested in research collaboration, academic inquiries, or professional opportunities?

koshik.debanath@gmail.com
+8801855675763

Based in Rajshahi, Bangladesh | Available for remote collaboration