Interactive Notebooks
Explore my collection of machine learning notebooks across Kaggle and Google Colab. From Bengali NLP models to advanced AI benchmarking, dive into hands-on implementations with real-world insights and performance analysis.
What You'll Discover
Real-world implementations with detailed analysis and performance insights, grouped into five topic areas.
Bengali NLP & LLMs
TigerLLM benchmarking, Bengali LLaMA analysis, tokenization strategies, and language model evaluation
Model Evaluation & Benchmarking
Comprehensive model testing, performance analysis, reality checks, and capability assessment
Deep Learning Fundamentals
Attention mechanisms, positional encoding, sequence-to-sequence models, and neural architecture
Practical Machine Learning
Customer churn prediction, stock forecasting, time series analysis, and real-world applications
Data Processing & OCR
Bengali text extraction, PDF processing, tokenization comparison, and data preprocessing
TigerLLM Testing and Benchmarking
Comprehensive capability assessment and benchmarking of the md-nishat-008/TigerLLM-1B-it Bengali language model. Includes a detailed evaluation of model performance, limitations, and usage recommendations.
Pre-training LLMs with HuggingFace
Complete guide to pre-training large language models with the HuggingFace transformers library. Covers data preparation, model architecture, training strategies, and optimization techniques.
Bengali LLaMA Reality Check: hassanaliemon/bn_r-8b
Complete performance analysis of the hassanaliemon/bn_rag_llama3-8b model, showing what it is actually good at: it excels in creative tasks (4.9/5) but struggles with factual Q&A (0-25% accuracy).
Attention Mechanism - Positional Encoding
Deep dive into attention mechanisms and positional encoding in neural networks, with detailed visual explanations of softmax operations and transformer architecture components.
Corpus Bangla Dataset - BPE vs SentencePiece
Comparative analysis of Byte-Pair Encoding (BPE) and SentencePiece tokenizers trained on the OSCAR Bengali dataset (4,601 examples), aimed at identifying the better tokenization strategy for Bengali NLP fine-tuning.
Developing a Sequence-to-Sequence Model
Comprehensive guide to developing sequence-to-sequence models, evaluated with BLEU scores. Complete implementation from data preprocessing to model evaluation.
Customer Churn Prediction
Telco customer churn prediction using tree-based models: Decision Tree, Random Forest, and XGBoost. Complete pipeline from data analysis to model deployment with performance comparison.
Flusk OCR Testing for Extracting Bengali Data
Testing Flusk OCR capabilities for extracting Bengali text data from PDF documents. Comprehensive evaluation of OCR accuracy and performance for Bengali language processing.
Stock Forecasting using LSTM
Amazon stock price forecasting using Long Short-Term Memory (LSTM) neural networks. Time series analysis, data preprocessing, model training, and prediction visualization.
Ready to Dive In?
Explore interactive notebooks with real-world data, detailed analysis, and practical insights. All notebooks include comprehensive documentation and reproducible results.