Data February 2026 · 4 min read

Your data isn't messy — it's undocumented

"Messy data" is usually a documentation problem wearing a costume. Fix the labels and most of the mess turns out to be meaning.

When people say their data is messy, they often mean they no longer remember what it means. Which column is authoritative? What does a blank cell signify? Is “closed” the same as “done”?

Meaning before cleanup

Before deduping a single row, I write down what each field is supposed to mean and who decides. Half the “errors” disappear once everyone agrees on definitions. The rest become genuinely fixable.
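One way to make that write-down concrete is a small data dictionary kept next to the data. The sketch below is only an illustration: the field names, owners, and allowed values are made up for the example, and `FieldSpec` and `check_row` are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One entry in the data dictionary: what a field means and who decides."""
    name: str
    meaning: str
    owner: str                        # who settles disputes about this field
    blank_means: str                  # what an empty cell signifies
    allowed: frozenset = frozenset()  # empty set = any value accepted


# Hypothetical definitions for a ticket export; replace with your own fields.
DATA_DICTIONARY = {
    spec.name: spec
    for spec in [
        FieldSpec(
            name="status",
            meaning="Where the ticket sits in the workflow",
            owner="support lead",
            blank_means="not yet triaged",
            allowed=frozenset({"open", "closed"}),
        ),
        FieldSpec(
            name="closed_at",
            meaning="Date the ticket was closed",
            owner="support lead",
            blank_means="ticket is still open",
        ),
    ]
}


def check_row(row: dict) -> list[str]:
    """Return the ways a row disagrees with the written definitions."""
    problems = []
    for name, value in row.items():
        spec = DATA_DICTIONARY.get(name)
        if spec is None:
            problems.append(f"{name}: undocumented field")
        elif value == "":
            continue  # a blank is not an error; its meaning is documented
        elif spec.allowed and value not in spec.allowed:
            problems.append(
                f"{name}: {value!r} is not a documented value (ask {spec.owner})"
            )
    return problems


if __name__ == "__main__":
    for problem in check_row({"status": "done", "closed_at": "", "priority": "high"}):
        print(problem)
```

Note that the check reports "done" as undocumented rather than quietly mapping it to "closed". Whether the two are the same is a question for the field's owner, not for a cleanup script.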

Clean data is just data everyone agrees about.
