Free · No signup · Runs in your browser

Fuzzy Dedupe

Merge rows that are the same value wearing different clothes — "Acme, Inc." and "acme inc" collapse to one.

01 · How it works

Three steps, then done.

Exact dedupe misses rows that differ by a stray capital, a double space, or a trailing period. Fuzzy dedupe normalizes the key column first, then drops later rows whose normalized key was already seen — so the first occurrence of each group wins.

1

Pick the key column

Choose the column that identifies a row — an email, a company name, a SKU. Two rows are near-duplicates when this column matches after normalization, regardless of the other columns.

2

Choose match strength

Strict ignores letter case and collapses runs of whitespace. Loose does all that and also strips punctuation, so "O'Brien" and "OBrien" or "Acme, Inc." and "Acme Inc" merge.

3

Keep the first, drop the rest

The first row in each near-duplicate group is kept in its original form; later matches are removed. You get a count of rows dropped and rows kept.
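
As a rough illustration of the three steps above, here is a minimal JavaScript sketch. The names `normalizeKey` and `fuzzyDedupe`, the row-as-object shape, the trimming of leading and trailing whitespace, and the rule that "punctuation" means anything that is not a letter, digit, or whitespace are all assumptions made for this example, not the tool's actual source.

```javascript
// Hypothetical helper: reduce a key to the form used for comparison.
function normalizeKey(value, strength = "strict") {
  let key = String(value ?? "").toLowerCase();
  if (strength === "loose") {
    // Assumption: punctuation = anything that is not a letter, digit, or whitespace.
    key = key.replace(/[^\p{L}\p{N}\s]/gu, "");
  }
  // Collapse runs of whitespace; trimming the ends is an assumption.
  return key.replace(/\s+/g, " ").trim();
}

// Hypothetical helper: keep the first row per normalized key, drop later matches.
function fuzzyDedupe(rows, keyColumn, strength = "strict") {
  const seen = new Set();
  const kept = [];
  for (const row of rows) {
    const key = normalizeKey(row[keyColumn], strength);
    if (seen.has(key)) continue; // later match: dropped
    seen.add(key);
    kept.push(row); // first occurrence: kept in its original form
  }
  return { kept, keptCount: kept.length, droppedCount: rows.length - kept.length };
}

const rows = [
  { company: "Acme, Inc.", city: "Boston" },
  { company: "acme inc", city: "Chicago" },
  { company: "ACME  Inc.", city: "Denver" },
  { company: "Globex", city: "Springfield" },
];

const loose = fuzzyDedupe(rows, "company", "loose");
const strict = fuzzyDedupe(rows, "company", "strict");

console.log(loose.keptCount, loose.droppedCount);   // 2 2
console.log(strict.keptCount, strict.droppedCount); // 4 0
```

On strict strength the three Acme spellings stay separate because their punctuation differs. On loose strength they share one normalized key, so only the first row survives.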

02 · Why ours

Why fuzzy beats exact

Real-world keys are messy. The same customer, vendor, or product shows up spelled three slightly different ways across exports. Exact-match dedupe leaves all three; fuzzy dedupe folds them into one.

  • 01

    Catches human entry drift

    Hand-typed data picks up stray capitals, double spaces, and trailing periods. Normalizing the key before comparing means those cosmetic differences stop counting as distinct rows.

  • 02

    Two strengths, your call

    Strict is conservative: it ignores only letter case and collapses runs of whitespace. Loose is aggressive: it also strips punctuation. Pick the level that matches how dirty your key really is.

  • 03

    First occurrence wins

    Rows are processed top to bottom, so the earliest version of each key survives untouched. Sort your file first if you want a particular row to be the keeper.

  • 04

    Private by construction

    Everything runs in your browser with plain JavaScript. No upload, no account, no server round-trip — close the tab and the data is gone.
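
To make the "sort your file first" advice concrete, the sketch below sorts a copy of the rows so the most recently updated record comes first, then applies a strict first-occurrence dedupe. The `updated` column, the `dedupeStrict` helper, and the trimming of leading and trailing whitespace are hypothetical, chosen only for illustration.

```javascript
// Hypothetical strict dedupe: lowercase, collapse whitespace, keep the first occurrence.
function dedupeStrict(rows, keyColumn) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = String(row[keyColumn]).toLowerCase().replace(/\s+/g, " ").trim();
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const contacts = [
  { email: "Ann@Example.com", updated: "2023-01-05" },
  { email: "ann@example.com", updated: "2024-03-10" },
  { email: "bob@example.com", updated: "2022-07-01" },
];

// Without sorting, the earliest row in file order is the keeper.
const fileOrder = dedupeStrict(contacts, "email");

// Sort a copy newest-first (ISO dates compare correctly as strings),
// so the most recent version of each key is seen first and survives.
const newestFirst = [...contacts].sort((a, b) => b.updated.localeCompare(a.updated));
const newestWins = dedupeStrict(newestFirst, "email");

console.log(fileOrder.map((r) => r.updated));  // [ '2023-01-05', '2022-07-01' ]
console.log(newestWins.map((r) => r.updated)); // [ '2024-03-10', '2022-07-01' ]
```

Sorting a copy leaves the original row order untouched, which matters if the file order carries meaning elsewhere.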

"Acme, Inc.", "acme inc", and "ACME Inc." are three spellings of one company. With loose matching, fuzzy dedupe keeps the first and drops the other two.
Why near-duplicate keys slip past exact matching

03 · FAQ

Fuzzy dedupe questions.

What counts as a near-duplicate?
Two rows whose key column matches after normalization. Strict strength lowercases the key and collapses whitespace; loose strength additionally removes punctuation. Only the key column is compared; the other columns can differ freely.

Which row is kept?
The first one. Rows are scanned top to bottom and the earliest occurrence of each normalized key is kept in its original, unmodified form. Every later match is dropped. If you need a specific row to win, sort your file before running the tool.

Does the tool change the rows it keeps?
No. Normalization is only used internally to decide which rows match. The rows that survive are written out exactly as they came in, with original case, spacing, and punctuation intact.

What is the difference between strict and loose?
Strict ignores case and collapses runs of spaces, so "Acme Inc" and "acme inc" match. Loose does that and also strips punctuation, so "Acme, Inc." matches "Acme Inc" too. Use loose when your key has inconsistent commas, periods, or apostrophes.

Is my data uploaded anywhere?
No. The entire transform runs client-side in your browser. Your CSV is parsed, deduped, and rebuilt locally. It never touches a server, and there's no account or tracking.
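
For readers curious what "parsed, deduped, and rebuilt locally" could look like, here is a small end-to-end sketch that uses no network or server calls. The functions `parseCsv`, `toCsv`, `normalize`, and `dedupeCsv` are hypothetical names, and the parser is a simplified one that assumes a non-empty file with a header row. It is not the tool's actual code.

```javascript
// Simplified CSV parser: handles quoted fields, doubled quotes, and \n or \r\n endings.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') { field += '"'; i++; } // escaped quote
      else if (ch === '"') inQuotes = false;
      else field += ch;
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field); field = "";
    } else if (ch === "\n") {
      row.push(field); rows.push(row); row = []; field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  if (field !== "" || row.length > 0) { row.push(field); rows.push(row); }
  return rows;
}

// Rebuild CSV text, quoting only the fields that need it.
function toCsv(rows) {
  const quote = (f) => (/[",\r\n]/.test(f) ? '"' + f.replace(/"/g, '""') + '"' : f);
  return rows.map((r) => r.map(quote).join(",")).join("\n");
}

// Assumption: loose strips everything that is not a letter, digit, or whitespace.
function normalize(value, strength) {
  let key = value.toLowerCase();
  if (strength === "loose") key = key.replace(/[^\p{L}\p{N}\s]/gu, "");
  return key.replace(/\s+/g, " ").trim();
}

function dedupeCsv(text, keyColumn, strength = "strict") {
  const [header, ...data] = parseCsv(text);
  const keyIndex = header.indexOf(keyColumn);
  if (keyIndex === -1) throw new Error(`Unknown column: ${keyColumn}`);
  const seen = new Set();
  const kept = data.filter((row) => {
    const key = normalize(row[keyIndex] ?? "", strength);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
  return { csv: toCsv([header, ...kept]), kept: kept.length, dropped: data.length - kept.length };
}

const input = 'name,city\n"Acme, Inc.",Boston\nacme inc,Chicago\nGlobex,Springfield\n';
const result = dedupeCsv(input, "name", "loose");
console.log(result.csv);
// name,city
// "Acme, Inc.",Boston
// Globex,Springfield
console.log(result.kept, result.dropped); // 2 1
```

Field values pass through unchanged in this sketch, but quoting is regenerated on output, so a byte-for-byte identical file is not guaranteed here.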