Thoughts and notes

Long-form writing on ML engineering, model evaluation, responsible deployment, technical tutorials, and industry observations.

AI Ethics • Opinion

Mar 17, 2026 • 6 min read • The Pragmatic MLer

The Three Horizons of Epistemic Change

AI-generated text is epistemically different from human text, and detection methods face inherent limitations. Three distinct phases emerge as synthetic content accumulates in our information systems over time.

AI Ethics • Opinion

Mar 3, 2026 • 5 min read • The Pragmatic MLer

The Noise That Looks Like Signal

If AI-generated content is epistemically different from human content, can't we just detect and filter it? A look at why detection tools face fundamental challenges as language models keep improving.

Machine Learning • Opinion

Feb 20, 2026 • 6 min read • The Pragmatic MLer

Your Mistakes Are More Valuable Than You Think

A counterintuitive proposition: one of the most valuable properties of training data is human error — not random error, but the structured, systematic, informative errors Gerd Gigerenzer called 'good errors.'

Machine Learning • Opinion

Feb 17, 2026 • 5 min read • The Pragmatic MLer

The Photocopier Was the Wrong Metaphor

When people explain model collapse, they reach for the photocopy-of-a-photocopy analogy. It captures iterative degradation — but it frames the problem in a way that limits how we think about solutions.

AI Ethics • Opinion

Feb 12, 2026 • 6 min read • The Pragmatic MLer

We're Not Just Degrading AI. We're Reshaping Human Knowledge.

There's been significant attention on model collapse — AI models trained on AI output that gradually degrade. But there's a more consequential question underneath it.

NLP • Tutorial

2023 • 12 min read • Towards Data Science

Domain Adaptation: Fine-Tune Pre-Trained NLP Models

A comprehensive guide to fine-tuning pre-trained NLP models for improved performance in specialized domains — covering theoretical frameworks, baseline evaluation, fine-tuning strategies, and result analysis.

NLP • Tutorial

2023 • 10 min read • Towards Data Science

Practical Introduction to Transformer Models: BERT

A hands-on tutorial on using BERT for sentiment analysis — walking through the transformer architecture and demonstrating practical implementation for text classification tasks.

Machine Learning • Engineering

2022 • 9 min read • Towards Data Science

6 Steps Towards a Successful Machine Learning Project

A structured framework for approaching machine learning projects end-to-end — from problem definition and data collection through model development, evaluation, and deployment.

Machine Learning • Tutorial

2020 • 8 min read • Towards Data Science

Recommendation System in Python: LightFM

A practical walkthrough of building a book recommendation system using LightFM — covering data preparation, hybrid matrix factorization, model training, and generating personalized recommendations.

NLP • Tutorial

2019 • 10 min read • Towards Data Science

Topic Modeling in Python: Latent Dirichlet Allocation (LDA)

An end-to-end guide to topic modeling using LDA — covering the intuition behind generative probabilistic models and a practical implementation in Python with Gensim.

NLP • Tutorial

2019 • 9 min read • Towards Data Science

Evaluate Topic Models: Latent Dirichlet Allocation (LDA)

A framework for quantitatively evaluating topic models through topic coherence metrics — with code templates in Python for systematic model selection and validation.

NLP • Tutorial

2019 • 7 min read • Towards Data Science

Building Blocks: Text Pre-Processing

Foundational text pre-processing concepts for statistical NLP — tokenization, stemming, lemmatization, and stop-word removal — with practical Python implementations.

NLP • Language Models

2019 • 8 min read • Towards Data Science

Language Models: N-Gram

An introduction to statistical language modeling: how n-gram models assign probabilities to word sequences, and why they serve as building blocks for modern NLP systems.

For writing updates, speaking notes, or ML engineering briefs, send a quick note.
