{"id":54145,"date":"2026-02-01T12:23:09","date_gmt":"2026-02-01T18:23:09","guid":{"rendered":"https:\/\/heartbeat.ai\/healthcare\/how-to-dedupe-a-provider-list-npi\/"},"modified":"2026-02-27T13:28:40","modified_gmt":"2026-02-27T19:28:40","slug":"how-to-dedupe-a-provider-list-npi","status":"publish","type":"post","link":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/","title":{"rendered":"Dedupe provider list by NPI: recruiter-proof SOP + CSV rules"},"content":{"rendered":"<p><img decoding=\"async\" loading=\"false\" class=\"aligncenter\" src=\"http:\/\/hc.heartbeat.ai\/wp-content\/webp-express\/webp-images\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png.webp\" alt=\"54144\" \/><\/p>\n<h1>Dedupe provider list by NPI (recruiter-proof SOP + CSV rules)<\/h1>\n<p><strong>Ben Argeband, Founder &amp; CEO of Heartbeat.ai<\/strong> \u2014 Very operational; reduce rework and candidate annoyance.<\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_65 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\r\n<div class=\"ez-toc-title-container\">\r\n<p class=\"ez-toc-title\" >What&rsquo;s on this page:<\/p>\r\n<span class=\"ez-toc-title-toggle\"><\/span><\/div>\r\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Who_this_is_for\" title=\"Who this is for\">Who this is for<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Quick_Answer\" title=\"Quick Answer\">Quick Answer<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" 
href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Framework_The_%E2%80%9COne_Person_One_Record%E2%80%9D_Rule_stop_double-tapping_candidates\" title=\"Framework: The \u201cOne Person, One Record\u201d Rule: stop double-tapping candidates\">Framework: The \u201cOne Person, One Record\u201d Rule: stop double-tapping candidates<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Step-by-step_method\" title=\"Step-by-step method\">Step-by-step method<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Step_0_Make_a_safe_working_copy_and_add_audit_columns\" title=\"Step 0: Make a safe working copy and add audit columns\">Step 0: Make a safe working copy and add audit columns<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Step_1_Normalize_NPI_primary_key\" title=\"Step 1: Normalize NPI (primary key)\">Step 1: Normalize NPI (primary key)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Step_15_Separate_individual_vs_organization_records_before_dedupe\" title=\"Step 1.5: Separate individual vs organization records before dedupe\">Step 1.5: Separate individual vs organization records before dedupe<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-8\" 
href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Step_2_Group_by_NPI_and_choose_a_survivor_record\" title=\"Step 2: Group by NPI and choose a survivor record\">Step 2: Group by NPI and choose a survivor record<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Step_3_Fallback_keys_when_NPI_is_missing_or_invalid\" title=\"Step 3: Fallback keys when NPI is missing or invalid\">Step 3: Fallback keys when NPI is missing or invalid<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Step_4_Build_and_keep_the_duplicate_log_non-negotiable\" title=\"Step 4: Build and keep the duplicate log (non-negotiable)\">Step 4: Build and keep the duplicate log (non-negotiable)<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-11\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Diagnostic_Table\" title=\"Diagnostic Table:\">Diagnostic Table:<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-12\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Weighted_Checklist\" title=\"Weighted Checklist:\">Weighted Checklist:<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-13\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Outreach_Templates\" title=\"Outreach Templates:\">Outreach Templates:<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-14\" 
href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Template_1_Apology_reset_same_person_duplicate_record\" title=\"Template 1: Apology + reset (same person, duplicate record)\">Template 1: Apology + reset (same person, duplicate record)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-15\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Template_2_Internal_note_to_recruiter_handoff_after_dedupe\" title=\"Template 2: Internal note to recruiter (handoff after dedupe)\">Template 2: Internal note to recruiter (handoff after dedupe)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-16\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Template_3_Source_escalation_bad_upstream_feed\" title=\"Template 3: Source escalation (bad upstream feed)\">Template 3: Source escalation (bad upstream feed)<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-17\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Common_pitfalls\" title=\"Common pitfalls\">Common pitfalls<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-18\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#How_to_improve_results\" title=\"How to improve results\">How to improve results<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-19\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#1_Implement_the_CSV_RULES_worksheet_repeatable_and_auditable\" title=\"1) Implement the CSV_RULES worksheet (repeatable and auditable)\">1) Implement the CSV_RULES worksheet 
(repeatable and auditable)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-20\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#2_Key_choice_matrix_precision_vs_coverage\" title=\"2) Key choice matrix (precision vs coverage)\">2) Key choice matrix (precision vs coverage)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-21\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#3_Add_measurement_so_you_can_prove_dedupe_is_working\" title=\"3) Add measurement so you can prove dedupe is working\">3) Add measurement so you can prove dedupe is working<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-22\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#4_Export-ready_output_what_you_hand_to_recruiters_or_upload\" title=\"4) Export-ready output (what you hand to recruiters or upload)\">4) Export-ready output (what you hand to recruiters or upload)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-23\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#5_Where_Heartbeatai_fits_after_dedupe\" title=\"5) Where Heartbeat.ai fits after dedupe\">5) Where Heartbeat.ai fits after dedupe<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-24\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Legal_and_ethical_use\" title=\"Legal and ethical use\">Legal and ethical use<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-25\" 
href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Evidence_and_trust_notes\" title=\"Evidence and trust notes\">Evidence and trust notes<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-26\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#FAQs\" title=\"FAQs\">FAQs<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-27\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Is_NPI_always_the_best_way_to_dedupe_provider_records\" title=\"Is NPI always the best way to dedupe provider records?\">Is NPI always the best way to dedupe provider records?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-28\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#What_should_I_do_when_two_rows_have_the_same_NPI_but_different_emails_or_phones\" title=\"What should I do when two rows have the same NPI but different emails or phones?\">What should I do when two rows have the same NPI but different emails or phones?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-29\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#What_do_I_do_with_rows_that_fail_NPI_validation\" title=\"What do I do with rows that fail NPI validation?\">What do I do with rows that fail NPI validation?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-30\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#How_do_I_prevent_dedupe_from_breaking_recruiter_ownership_and_attribution\" title=\"How do I prevent dedupe from breaking recruiter ownership and attribution?\">How do I 
prevent dedupe from breaking recruiter ownership and attribution?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-31\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Whats_the_minimum_I_need_to_keep_for_an_audit_trail\" title=\"What\u2019s the minimum I need to keep for an audit trail?\">What\u2019s the minimum I need to keep for an audit trail?<\/a><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-32\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#Next_steps\" title=\"Next steps\">Next steps<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-33\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#About_the_Author\" title=\"About the Author\">About the Author<\/a><\/li><\/ul><\/nav><\/div>\r\n<h2><span class=\"ez-toc-section\" id=\"Who_this_is_for\"><\/span>Who this is for<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Ops and recruiters cleaning lists before outreach\/enrichment. If you\u2019re about to upload a file, assign recruiter ownership, or start outreach, this SOP prevents double-taps, duplicate submissions, and reporting noise.<\/p>\n<p>Scope: you have a spreadsheet\/CSV with some mix of <strong>NPI<\/strong>, name, specialty, practice location, and sometimes a <strong>state license<\/strong>. 
We\u2019ll treat this as <strong>identity resolution<\/strong>: deciding when two rows represent the same provider, then keeping one survivor record with an audit trail.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Quick_Answer\"><\/span>Quick Answer<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<dl>\n<dt>Core Answer<\/dt>\n<dd>To dedupe a provider list by NPI, strip each NPI to digits only and accept only values that are exactly 10 digits, group by NPI, select one survivor per group, and log duplicates with source and recency.<\/dd>\n<dt>Key Insight<\/dt>\n<dd>NPI is the safest primary key; when it\u2019s missing or invalid, fall back to state license, then name+city with strict normalization and a duplicate log.<\/dd>\n<dt>Best For<\/dt>\n<dd>Ops and recruiters cleaning lists before outreach\/enrichment.<\/dd>\n<\/dl>\n<blockquote>\n<p><strong>Compliance &amp; Safety<\/strong><\/p>\n<p>This method is for legitimate recruiting outreach only. Always respect candidate privacy, opt-out requests, and local data laws. Heartbeat does not provide medical advice or legal counsel.<\/p>\n<\/blockquote>\n<h2><span class=\"ez-toc-section\" id=\"Framework_The_%E2%80%9COne_Person_One_Record%E2%80%9D_Rule_stop_double-tapping_candidates\"><\/span>Framework: The \u201cOne Person, One Record\u201d Rule: stop double-tapping candidates<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Duplicates don\u2019t just waste time. 
They create candidate annoyance (multiple calls\/emails), internal confusion (two recruiters \u201cown\u201d the same physician), and bad economics (more dials per submittal, lower connectability, and messy attribution).<\/p>\n<p><strong>Deliverables when you follow this SOP:<\/strong><\/p>\n<ul>\n<li><strong>SURVIVORS<\/strong>: one row per provider identity, ready for outreach or enrichment.<\/li>\n<li><strong>DUPLICATES_LOG<\/strong>: a mapped audit trail (duplicate_of, tier, reason, source, timestamp).<\/li>\n<li><strong>Before\/after snapshot<\/strong>: duplicate rate and collision rate so ops can prove the cleanup worked.<\/li>\n<\/ul>\n<ul>\n<li><strong>One identity<\/strong> (the person) can have many attributes (emails, phones, locations, licenses).<\/li>\n<li>Your working file should have <strong>one survivor row<\/strong> per identity, plus a <strong>duplicate log<\/strong> that preserves where the other rows came from.<\/li>\n<li>Every dedupe decision should be explainable later (ops, compliance, recruiter handoffs).<\/li>\n<\/ul>\n<p>In this SOP, <strong>recency<\/strong> is the tie-breaker: when two rows disagree, keep the most recently updated or most recently verified attribute, and record what changed.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Step-by-step_method\"><\/span>Step-by-step method<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<h3><span class=\"ez-toc-section\" id=\"Step_0_Make_a_safe_working_copy_and_add_audit_columns\"><\/span>Step 0: Make a safe working copy and add audit columns<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Duplicate the file and add these columns (even if blank):<\/p>\n<ul>\n<li><strong>row_id<\/strong> (stable unique ID; if you don\u2019t have one, create it)<\/li>\n<li><strong>dedupe_key<\/strong> (the key you used to group)<\/li>\n<li><strong>dedupe_tier<\/strong> (NPI \/ state license \/ name+city)<\/li>\n<li><strong>survivor_flag<\/strong> (TRUE\/FALSE)<\/li>\n<li><strong>duplicate_of<\/strong> 
(row_id of survivor)<\/li>\n<li><strong>duplicate_reason<\/strong> (why you grouped them)<\/li>\n<li><strong>source<\/strong> (where the row came from)<\/li>\n<li><strong>last_updated<\/strong> (date field if you have it)<\/li>\n<\/ul>\n<p>This is how you <strong>log duplicates<\/strong> without losing traceability.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Step_1_Normalize_NPI_primary_key\"><\/span>Step 1: Normalize NPI (primary key)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><strong>NPI definition:<\/strong> The National Provider Identifier (NPI) is a unique 10-digit identifier for health care providers in the U.S.<\/p>\n<p>Normalization rules (apply before grouping):<\/p>\n<ul>\n<li>Strip spaces, hyphens, and non-numeric characters.<\/li>\n<li>After cleaning, accept only values that are exactly 10 digits.<\/li>\n<li>Anything else is treated as missing NPI and routed to fallback tiers.<\/li>\n<\/ul>\n<p><strong>Copy\/paste rules<\/strong> (Google Sheets examples):<\/p>\n<ul>\n<li><strong>NPI_CLEAN (digits only)<\/strong>: <em>=REGEXREPLACE(TO_TEXT(A2),&quot;[^0-9]&quot;,&quot;&quot;)<\/em><\/li>\n<li><strong>NPI_VALID (10 digits)<\/strong>: <em>=IF(LEN(B2)=10,TRUE,FALSE)<\/em><\/li>\n<li><strong>NPI_FOR_KEY (blank if invalid)<\/strong>: <em>=IF(C2,B2,&quot;&quot;)<\/em><\/li>\n<\/ul>\n<p>Set <strong>dedupe_tier<\/strong> to \u201cNPI\u201d when NPI_VALID is TRUE.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Step_15_Separate_individual_vs_organization_records_before_dedupe\"><\/span>Step 1.5: Separate individual vs organization records before dedupe<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>One common ops failure is mixing individual providers with organization records in the same file. 
Don\u2019t try to \u201cfix\u201d this during dedupe\u2014split it first.<\/p>\n<ul>\n<li>If your source labels record type, split into two tabs: <strong>INDIVIDUAL<\/strong> and <strong>ORGANIZATION<\/strong>, then dedupe INDIVIDUAL only with this SOP.<\/li>\n<li>If you don\u2019t have a label, use a practical rule: if the row\u2019s \u201cname\u201d field is clearly a facility\/clinic name (not a person name), route it to ORGANIZATION for separate handling and exclude it from this person-level dedupe run.<\/li>\n<\/ul>\n<p>This keeps your \u201cone person, one record\u201d logic clean and prevents nonsense groups.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Step_2_Group_by_NPI_and_choose_a_survivor_record\"><\/span>Step 2: Group by NPI and choose a survivor record<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><strong>Dedupe definition:<\/strong> Dedupe is the process of identifying multiple rows that represent the same real-world provider and retaining one survivor record while preserving an audit trail of the duplicates.<\/p>\n<p>Group rows where NPI_FOR_KEY matches. 
For each NPI group:<\/p>\n<ul>\n<li>Pick one <strong>survivor<\/strong> row (best contactability and freshest data).<\/li>\n<li>Mark all other rows as duplicates and set <strong>duplicate_of<\/strong> to the survivor\u2019s row_id.<\/li>\n<li>Set <strong>duplicate_reason<\/strong> to \u201cSame NPI after normalization\u201d.<\/li>\n<\/ul>\n<p>Survivor selection rules (recruiting order):<\/p>\n<ol>\n<li>Row with the most recently verified direct contact fields (if you track verification date or last_updated).<\/li>\n<li>Row with the most complete contact set (email + phone + location).<\/li>\n<li>Row with the newest <strong>last_updated<\/strong> date.<\/li>\n<li>If still tied, keep the row from your most trusted <strong>source<\/strong> and log why.<\/li>\n<\/ol>\n<p><strong>Multiple locations under one NPI (common):<\/strong><\/p>\n<ul>\n<li>Keep <strong>one identity<\/strong> (one survivor row_id) and treat locations as attributes.<\/li>\n<li>If your system needs multiple location rows, keep them linked to the same survivor row_id (do not create multiple identities).<\/li>\n<li>If two rows disagree on specialty, keep one identity and log the conflict; prefer the most recent source for the primary specialty field.<\/li>\n<\/ul>\n<h3><span class=\"ez-toc-section\" id=\"Step_3_Fallback_keys_when_NPI_is_missing_or_invalid\"><\/span>Step 3: Fallback keys when NPI is missing or invalid<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><strong>Match key definition:<\/strong> A match key is the standardized identifier (or composite of fields) used to decide whether two records refer to the same provider.<\/p>\n<p>Not every row will have a valid NPI. Your fallback tiers should be strict and documented:<\/p>\n<ul>\n<li><strong>Tier 2: state license<\/strong>. Normalize by removing spaces\/punctuation and uppercasing. Group by (license state + license number). Set dedupe_tier to \u201cstate license\u201d.<\/li>\n<li><strong>Tier 3: name + city fallback<\/strong>. 
Use only when NPI and state license are missing. Normalize name and city\/state. Group by (last name + first initial + city + state). Set dedupe_tier to \u201cname+city\u201d.<\/li>\n<\/ul>\n<p>The trade-off: the deeper you go into fallback tiers, the higher the risk of false positives (two different people who look similar). That\u2019s why you keep the duplicate log and keep the rules strict.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Step_4_Build_and_keep_the_duplicate_log_non-negotiable\"><\/span>Step 4: Build and keep the duplicate log (non-negotiable)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Do not delete duplicates. Move them to a separate tab called <strong>DUPLICATES_LOG<\/strong> and keep these fields:<\/p>\n<ul>\n<li>survivor row_id<\/li>\n<li>duplicate row_id<\/li>\n<li>dedupe_tier used (NPI \/ state license \/ name+city)<\/li>\n<li>duplicate_reason<\/li>\n<li>source for both rows<\/li>\n<li>timestamp of the dedupe run<\/li>\n<\/ul>\n<p>This prevents ops failures later when you need to reconcile submissions, outreach history, or suppression lists.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Diagnostic_Table\"><\/span>Diagnostic Table:<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><strong>Use this to diagnose what kind of duplicates you have and 
what to do next.<\/strong><\/p>\n<div class=\"table-scroll\" style=\"overflow:auto;-webkit-overflow-scrolling:touch;width:100%\">\n<table class=\"separated-content\">\n<thead>\n<tr>\n<th>Symptom in your file<\/th>\n<th>Likely cause<\/th>\n<th>Best match key<\/th>\n<th>What to do (CSV rules)<\/th>\n<th>What to log<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Same NPI appears on multiple rows<\/td>\n<td>Multiple sources, repeated exports, or prior enrichment runs<\/td>\n<td>NPI<\/td>\n<td>Normalize to 10 digits; group; pick survivor by recency + completeness<\/td>\n<td>duplicate_of survivor row_id; reason \u201cSame NPI\u201d<\/td>\n<\/tr>\n<tr>\n<td>NPI missing on many rows<\/td>\n<td>Older lists, partial exports, inconsistent upstream fields<\/td>\n<td>state license<\/td>\n<td>Normalize license; group by state+license; keep best contact row<\/td>\n<td>Tier used + source priority<\/td>\n<\/tr>\n<tr>\n<td>Two rows share a name but list different cities<\/td>\n<td>Different people or provider moved<\/td>\n<td>NPI (if present), otherwise keep separate<\/td>\n<td>Do not dedupe unless NPI\/license matches; treat as separate identities<\/td>\n<td>Flag for review<\/td>\n<\/tr>\n<tr>\n<td>Same name+city but different NPIs<\/td>\n<td>Data entry error or two providers with similar names<\/td>\n<td>NPI<\/td>\n<td>Keep separate; validate NPI source; do not force consolidation<\/td>\n<td>Conflict note<\/td>\n<\/tr>\n<tr>\n<td>Facility\/clinic names mixed into person rows<\/td>\n<td>Organization records mixed into individual workflow<\/td>\n<td>Pre-filter<\/td>\n<td>Split into INDIVIDUAL vs ORGANIZATION tabs before dedupe<\/td>\n<td>Filter rule used<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h2><span class=\"ez-toc-section\" id=\"Weighted_Checklist\"><\/span>Weighted Checklist:<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><strong>Score each dedupe run before you upload\/enrich\/outreach.<\/strong> Total 100 points; if you\u2019re under 80, fix the file before 
outreach or enrichment.<\/p>\n<ul>\n<li><strong>30 pts<\/strong> \u2014 NPI normalized to 10 digits and validated; invalid NPIs routed to fallback tiers<\/li>\n<li><strong>15 pts<\/strong> \u2014 Individual vs organization records separated before dedupe<\/li>\n<li><strong>15 pts<\/strong> \u2014 Fallback keys implemented: state license normalization + name+city normalization<\/li>\n<li><strong>15 pts<\/strong> \u2014 Survivor rules documented (recency + completeness + trusted source)<\/li>\n<li><strong>15 pts<\/strong> \u2014 DUPLICATES_LOG created with duplicate_of mapping, reasons, and timestamp<\/li>\n<li><strong>10 pts<\/strong> \u2014 Suppression-ready: opt-outs and \u201cdo not contact\u201d flags carried to survivor record<\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"Outreach_Templates\"><\/span>Outreach Templates:<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Use these when you discover you already contacted the provider under a different row. The goal is to stop double-taps and keep suppression clean.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Template_1_Apology_reset_same_person_duplicate_record\"><\/span>Template 1: Apology + reset (same person, duplicate record)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><strong>Subject:<\/strong> Quick correction \u2014 one thread going forward<\/p>\n<p>Hi Dr. {{LastName}} \u2014 I realized we had you in our system twice and you may have gotten more than one message from us. Sorry about that.<\/p>\n<p>To keep it clean, I\u2019m consolidating to one record and one point of contact on our side. 
If you\u2019d prefer no outreach from us, reply \u201copt out\u201d and I\u2019ll suppress your record.<\/p>\n<p>If you\u2019re open to a quick call, what\u2019s the best number and time window?<\/p>\n<p>\u2014 {{YourName}}, {{Role}}<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Template_2_Internal_note_to_recruiter_handoff_after_dedupe\"><\/span>Template 2: Internal note to recruiter (handoff after dedupe)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><strong>Subject:<\/strong> Dedupe complete for {{ListName}} \u2014 use survivor IDs only<\/p>\n<p>Team, the provider list dedupe is complete. Use only rows where survivor_flag=TRUE. Duplicates are in DUPLICATES_LOG with duplicate_of mapping and reasons.<\/p>\n<p>Reminder: do not send outreach from duplicate rows; it will double-tap the same physician and break attribution.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Template_3_Source_escalation_bad_upstream_feed\"><\/span>Template 3: Source escalation (bad upstream feed)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>We\u2019re seeing repeated duplicates caused by inconsistent NPI formatting and missing license fields. Please standardize NPI to 10 digits and include state license where available. We can share our CSV rules if helpful.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Common_pitfalls\"><\/span>Common pitfalls<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<ul>\n<li><strong>Using name-only dedupe.<\/strong> Name alone is not a key. 
If you must use name+city, keep it as a last resort and keep groups small and reviewable.<\/li>\n<li><strong>Deleting duplicates instead of logging them.<\/strong> When someone asks why a candidate got two emails, you need the audit trail.<\/li>\n<li><strong>Letting spreadsheets auto-format NPIs.<\/strong> Store cleaned NPI as text so the spreadsheet does not reformat it as a number (for example, with thousands separators or scientific notation).<\/li>\n<li><strong>Mixing organization records into person workflows.<\/strong> Split first; dedupe second.<\/li>\n<li><strong>Splitting multiple locations into multiple identities.<\/strong> One provider can practice at multiple sites. Keep one identity record and store locations as attributes (or keep multiple location rows linked to the same survivor row_id).<\/li>\n<li><strong>Specialty conflicts across duplicates.<\/strong> Keep one identity; log conflicting specialty values and prefer the most recent source for the primary specialty field.<\/li>\n<li><strong>Not carrying suppression forward.<\/strong> If any duplicate row has an opt-out\/do-not-contact flag, the survivor must inherit it.<\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"How_to_improve_results\"><\/span>How to improve results<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<h3><span class=\"ez-toc-section\" id=\"1_Implement_the_CSV_RULES_worksheet_repeatable_and_auditable\"><\/span>1) Implement the CSV_RULES worksheet (repeatable and auditable)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>This is the one piece you can standardize across every inbound list. 
The goal: every file gets the same normalization, the same keys, and the same log format.<\/p>\n<p><strong>CSV column spec (recommended headers)<\/strong>:<\/p>\n<ul>\n<li><strong>npi_raw<\/strong> (text)<\/li>\n<li><strong>npi_clean<\/strong> (text; digits only)<\/li>\n<li><strong>npi_valid<\/strong> (boolean)<\/li>\n<li><strong>license_state<\/strong> (text)<\/li>\n<li><strong>license_number_raw<\/strong> (text)<\/li>\n<li><strong>license_clean<\/strong> (text)<\/li>\n<li><strong>first_name<\/strong>, <strong>last_name<\/strong> (text)<\/li>\n<li><strong>city<\/strong>, <strong>state<\/strong> (text)<\/li>\n<li><strong>source<\/strong> (text)<\/li>\n<li><strong>last_updated<\/strong> (date)<\/li>\n<li><strong>dedupe_tier<\/strong> (text)<\/li>\n<li><strong>dedupe_key<\/strong> (text)<\/li>\n<li><strong>row_id<\/strong> (text)<\/li>\n<li><strong>survivor_flag<\/strong> (boolean)<\/li>\n<li><strong>duplicate_of<\/strong> (text)<\/li>\n<\/ul>\n<p><strong>Copy\/paste rules<\/strong> (Google Sheets examples):<\/p>\n<ul>\n<li><strong>LICENSE_CLEAN<\/strong>: <em>=UPPER(REGEXREPLACE(TO_TEXT(E2),&quot;[^A-Za-z0-9]&quot;,&quot;&quot;))<\/em><\/li>\n<li><strong>NAME_KEY<\/strong>: <em>=UPPER(H2&amp;&quot;|&quot;&amp;LEFT(G2,1))<\/em><\/li>\n<li><strong>CITY_KEY<\/strong>: <em>=UPPER(I2&amp;&quot;|&quot;&amp;J2)<\/em><\/li>\n<li><strong>DEDUPE_KEY<\/strong>: <em>=IF(C2,&quot;NPI|&quot;&amp;B2,IF(F2&lt;&gt;&quot;&quot;,&quot;LIC|&quot;&amp;D2&amp;&quot;|&quot;&amp;F2,&quot;NAMECITY|&quot;&amp;K2&amp;&quot;|&quot;&amp;L2))<\/em><\/li>\n<\/ul>\n<p>Cell references are illustrative: in the DEDUPE_KEY formula, K2 and L2 stand for the NAME_KEY and CITY_KEY helper columns. 
Adjust the references to match your own column layout.<\/p>\n<p><strong>Example dedupe_key outputs<\/strong> (what you should see in the file):<\/p>\n<ul>\n<li><strong>NPI tier<\/strong>: <em>NPI|1234567890<\/em><\/li>\n<li><strong>state license tier<\/strong>: <em>LIC|CA|A12345<\/em><\/li>\n<li><strong>name+city tier<\/strong>: <em>NAMECITY|SMITH|J|AUSTIN|TX<\/em><\/li>\n<\/ul>\n<p>Then sort by dedupe_key and your \u201cbest record\u201d signals (newest last_updated first, then completeness) so the 
survivor is always the first row in each group.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"2_Key_choice_matrix_precision_vs_coverage\"><\/span>2) Key choice matrix (precision vs coverage)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<div class=\"table-scroll\" style=\"overflow:auto;-webkit-overflow-scrolling:touch;width:100%\">\n<table class=\"separated-content\">\n<thead>\n<tr>\n<th>Tier<\/th>\n<th>When to use<\/th>\n<th>Main risk<\/th>\n<th>What to log<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>NPI<\/td>\n<td>Default when valid 10-digit NPI is present<\/td>\n<td>Bad upstream NPI entry or org\/person mixing<\/td>\n<td>Normalization applied + survivor rule used<\/td>\n<\/tr>\n<tr>\n<td>state license<\/td>\n<td>When NPI is missing\/invalid but license is present<\/td>\n<td>License formatting differences across sources<\/td>\n<td>State + cleaned license + source priority<\/td>\n<\/tr>\n<tr>\n<td>name+city<\/td>\n<td>Last resort only<\/td>\n<td>False positives (similar names)<\/td>\n<td>Exact normalization rules + review flag<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h3><span class=\"ez-toc-section\" id=\"3_Add_measurement_so_you_can_prove_dedupe_is_working\"><\/span>3) Add measurement so you can prove dedupe is working<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Measure this by running a before\/after report on duplicates removed and outreach collisions prevented, then tracking downstream contactability metrics.<\/p>\n<ul>\n<li><strong>Duplicate Rate<\/strong> = duplicate rows \/ total rows (report per 1,000 rows). 
Track by source.<\/li>\n<li><strong>Collision Rate<\/strong> = outreach attempts to duplicate identities \/ total outreach attempts (report per 100 attempts).<\/li>\n<li><strong>Connect Rate<\/strong> = connected calls \/ total dials (per 100 dials).<\/li>\n<li><strong>Answer Rate<\/strong> = human answers \/ connected calls (per 100 connected calls).<\/li>\n<li><strong>Deliverability Rate<\/strong> = delivered emails \/ sent emails (per 100 sent emails).<\/li>\n<li><strong>Bounce Rate<\/strong> = bounced emails \/ sent emails (per 100 sent emails).<\/li>\n<li><strong>Reply Rate<\/strong> = replies \/ delivered emails (per 100 delivered emails).<\/li>\n<\/ul>\n<h3><span class=\"ez-toc-section\" id=\"4_Export-ready_output_what_you_hand_to_recruiters_or_upload\"><\/span>4) Export-ready output (what you hand to recruiters or upload)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li><strong>Outreach\/enrichment file<\/strong>: include only rows where survivor_flag=TRUE.<\/li>\n<li><strong>Required columns to keep<\/strong>: row_id, npi_clean (or NPI_FOR_KEY), dedupe_tier, source, last_updated, and your contact fields.<\/li>\n<li><strong>Audit file<\/strong>: keep DUPLICATES_LOG as a separate tab or separate CSV with survivor mapping and timestamp.<\/li>\n<\/ul>\n<h3><span class=\"ez-toc-section\" id=\"5_Where_Heartbeatai_fits_after_dedupe\"><\/span>5) Where Heartbeat.ai fits after dedupe<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Once you have one record per provider, enrichment and outreach workflows behave: fewer wasted lookups, fewer duplicate touches, cleaner attribution. 
Heartbeat.ai can support this by keeping identity resolution consistent and providing <strong>ranked mobile numbers by answer probability<\/strong>, so recruiters spend dials where they\u2019re most likely to connect.<\/p>\n<ul>\n<li><a href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/upload-a-physician-list-for-enrichment\/\">Upload a physician list for enrichment (workflow + file prep)<\/a><\/li>\n<li><a href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/bulk-physician-lookup\/\">Bulk physician lookup (when you need fast coverage)<\/a><\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"Legal_and_ethical_use\"><\/span>Legal and ethical use<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>This is operational guidance for recruiting workflows, not legal advice. Use dedupe and identity resolution to reduce duplicate outreach and respect candidate preferences.<\/p>\n<ul>\n<li>Honor opt-outs immediately and propagate suppression flags to the survivor record.<\/li>\n<li>Limit access to raw lists; keep the duplicate log for audit, but don\u2019t over-share it.<\/li>\n<li>Only contact providers for legitimate recruiting opportunities and keep messages relevant.<\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"Evidence_and_trust_notes\"><\/span>Evidence and trust notes<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>NPI is a federal identifier with official documentation. 
These sources define NPI and its role as a standard identifier:<\/p>\n<ul>\n<li><a href=\"https:\/\/nppes.cms.hhs.gov\/\">NPPES (CMS) \u2014 NPI source<\/a><\/li>\n<li><a href=\"https:\/\/www.cms.gov\/medicare\/regulations-guidance\/administrative-simplification\/national-provider-identifier-npi\">CMS \u2014 National Provider Identifier (NPI) overview<\/a><\/li>\n<\/ul>\n<p>For how Heartbeat.ai evaluates data quality and operational trust, review: <a href=\"http:\/\/heartbeat.ai\/resources\/trust-methodology\/\">Heartbeat.ai trust methodology and definitions<\/a>.<\/p>\n<p>If your workflow includes matching NPI to licensing, use this sibling playbook next: <a href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/npi-license-matching\/\">NPI to state license matching (ops workflow)<\/a>.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"FAQs\"><\/span>FAQs<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<h3><span class=\"ez-toc-section\" id=\"Is_NPI_always_the_best_way_to_dedupe_provider_records\"><\/span>Is NPI always the best way to dedupe provider records?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>For individual providers, NPI is usually the cleanest primary key because it\u2019s designed as an identifier. When NPI is missing or invalid, use state license next, then name+city as a last resort with strict normalization and logging.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"What_should_I_do_when_two_rows_have_the_same_NPI_but_different_emails_or_phones\"><\/span>What should I do when two rows have the same NPI but different emails or phones?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Keep one survivor record per NPI and treat emails\/phones as attributes. Prefer the most recent, most verified contact points, and keep the other values in notes or an attribute table if your system supports it. 
Always keep the duplicate log.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"What_do_I_do_with_rows_that_fail_NPI_validation\"><\/span>What do I do with rows that fail NPI validation?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Route them to the fallback tiers: try state license first, then name+city only if you have no better key. Flag name+city groups for review and keep the dedupe_tier in your log so you can audit false positives later.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"How_do_I_prevent_dedupe_from_breaking_recruiter_ownership_and_attribution\"><\/span>How do I prevent dedupe from breaking recruiter ownership and attribution?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Run dedupe before assignment. Use survivor row_ids as the only assignable records, and map duplicates to survivors in DUPLICATES_LOG so historical activity can be reconciled without creating new \u201cowners\u201d for the same person.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Whats_the_minimum_I_need_to_keep_for_an_audit_trail\"><\/span>What\u2019s the minimum I need to keep for an audit trail?<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Survivor row_id, duplicate row_id, dedupe tier used, reason, source, and timestamp. 
That\u2019s enough to explain what happened and to debug upstream list quality.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Next_steps\"><\/span>Next steps<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<ul>\n<li>If you\u2019re about to enrich: <a href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/upload-a-physician-list-for-enrichment\/\">prep and upload your physician list for enrichment<\/a> after you dedupe.<\/li>\n<li>If you need fast coverage first: <a href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/bulk-physician-lookup\/\">run a bulk physician lookup<\/a>, then dedupe the combined output before outreach.<\/li>\n<li>If you want this running as a repeatable workflow in Heartbeat.ai: <a href=\"https:\/\/heartbeat.ai\/signup\">create an account and set up your list workflow<\/a>.<\/li>\n<\/ul>\n<h2><span class=\"ez-toc-section\" id=\"About_the_Author\"><\/span><b>About the Author<\/b><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><a href=\"http:\/\/heartbeat.ai\/resources\/author\/ben-argeband\"><span style=\"font-weight: 400;\">Ben Argeband<\/span><\/a><span style=\"font-weight: 400;\"> is the Founder and CEO of Swordfish.ai and Heartbeat.ai. With deep expertise in data and SaaS, he has built two successful platforms trusted by over 50,000 sales and recruitment professionals. Ben&#8217;s mission is to help teams find direct contact information for hard-to-reach professionals and decision-makers, providing the shortest route to their next win. 
Connect with Ben on <\/span><a href=\"https:\/\/www.linkedin.com\/in\/ben-m-argeband-2427a8a3\/\"><span style=\"font-weight: 400;\">LinkedIn<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><script type=\"application\/ld+json\">{\"@context\":\"https:\/\/schema.org\",\"@type\":\"Article\",\"articleSection\":\"Resources\",\"author\":{\"@type\":\"Person\",\"name\":\"Ben Argeband\"},\"dateModified\":\"2026-01-05\",\"datePublished\":\"2026-01-05\",\"description\":\"A recruiter-proof SOP to dedupe provider list by NPI: normalize to 10 digits, use fallback keys (state license, name+city), pick a survivor by recency, and keep a duplicate audit log.\",\"headline\":\"Dedupe provider list by NPI: recruiter-proof SOP + CSV rules\",\"keywords\":[\"dedupe provider list by NPI\",\"dedupe provider list\",\"provider deduplication\",\"physician duplicate records\",\"NPI\",\"state license\",\"dedupe\",\"identity resolution\",\"recency\"],\"mainEntityOfPage\":{\"@id\":\"https:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/\",\"@type\":\"WebPage\"},\"publisher\":{\"@type\":\"Organization\",\"name\":\"Heartbeat.ai\"}}<\/script><br \/>\n<script type=\"application\/ld+json\">{\"@context\":\"https:\/\/schema.org\",\"@type\":\"FAQPage\",\"mainEntity\":[{\"@type\":\"Question\",\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"For individual providers, NPI is usually the cleanest primary key because it\u2019s designed as an identifier. When NPI is missing or invalid, use state license next, then name+city as a last resort with strict normalization and logging.\"},\"name\":\"Is NPI always the best way to dedupe provider records?\"},{\"@type\":\"Question\",\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"Keep one survivor record per NPI and treat emails\/phones as attributes. Prefer the most recent, most verified contact points, and keep the other values in notes or an attribute table if your system supports it. 
Always keep the duplicate log.\"},\"name\":\"What should I do when two rows have the same NPI but different emails or phones?\"},{\"@type\":\"Question\",\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"Route them to the fallback tiers: try state license first, then name+city only if you have no better key. Flag name+city groups for review and keep the dedupe_tier in your log so you can audit false positives later.\"},\"name\":\"What do I do with rows that fail NPI validation?\"},{\"@type\":\"Question\",\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"Run dedupe before assignment. Use survivor row_ids as the only assignable records, and map duplicates to survivors in DUPLICATES_LOG so historical activity can be reconciled without creating new \u201cowners\u201d for the same person.\"},\"name\":\"How do I prevent dedupe from breaking recruiter ownership and attribution?\"},{\"@type\":\"Question\",\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"Survivor row_id, duplicate row_id, dedupe tier used, reason, source, and timestamp. 
That\u2019s enough to explain what happened and to debug upstream list quality.\"},\"name\":\"What\u2019s the minimum I need to keep for an audit trail?\"}]}<\/script><\/p>","protected":false},"excerpt":{"rendered":"<p>A recruiter-proof SOP to dedupe provider list by NPI: normalize to 10 digits, use fallback keys (state license, name+city), pick a survivor by recency, and keep a duplicate audit log.<\/p>","protected":false},"author":5,"featured_media":54144,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_yoast_wpseo_focuskw":"dedupe provider list by NPI","_yoast_wpseo_title":"Dedupe provider list by NPI (SOP + CSV rules) | Heartbeat.ai","_yoast_wpseo_metadesc":"Normalize NPI to 10 digits, dedupe with fallback keys (state license, name+city), choose a survivor by recency, and keep a duplicate audit log before outreach or enrichment.","_custom_permalink":"provider-contact-data\/how-to-dedupe-a-provider-list-npi","footnotes":""},"categories":[1],"tags":[],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v22.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\r\n<title>Dedupe provider list by NPI (SOP + CSV rules) | Heartbeat.ai<\/title>\r\n<meta name=\"description\" content=\"Normalize NPI to 10 digits, dedupe with fallback keys (state license, name+city), choose a survivor by recency, and keep a duplicate audit log before outreach or enrichment.\" \/>\r\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\r\n<link rel=\"canonical\" href=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/\" \/>\r\n<meta property=\"og:locale\" content=\"en_US\" \/>\r\n<meta property=\"og:type\" content=\"article\" \/>\r\n<meta property=\"og:title\" content=\"Dedupe provider list by NPI (SOP + CSV rules) | Heartbeat.ai\" \/>\r\n<meta property=\"og:description\" 
content=\"Normalize NPI to 10 digits, dedupe with fallback keys (state license, name+city), choose a survivor by recency, and keep a duplicate audit log before outreach or enrichment.\" \/>\r\n<meta property=\"og:url\" content=\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/\" \/>\r\n<meta property=\"og:site_name\" content=\"Heartbeat.ai\" \/>\r\n<meta property=\"article:published_time\" content=\"2026-02-01T18:23:09+00:00\" \/>\r\n<meta property=\"article:modified_time\" content=\"2026-02-27T19:28:40+00:00\" \/>\r\n<meta property=\"og:image\" content=\"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png\" \/>\r\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\r\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\r\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\r\n<meta name=\"author\" content=\"Ben Argeband\" \/>\r\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\r\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Ben Argeband\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"14 minutes\" \/>\r\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#article\",\"isPartOf\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/\"},\"author\":{\"name\":\"Ben Argeband\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/#\/schema\/person\/7b323ddce9b211907423482e2f9db173\"},\"headline\":\"Dedupe provider list by NPI: recruiter-proof SOP + CSV rules\",\"datePublished\":\"2026-02-01T18:23:09+00:00\",\"dateModified\":\"2026-02-27T19:28:40+00:00\",\"mainEntityOfPage\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/\"},\"wordCount\":2695,\"commentCount\":0,\"publisher\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/#organization\"},\"image\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png\",\"articleSection\":[\"News\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/\",\"url\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/\",\"name\":\"Dedupe provider list by NPI (SOP + CSV rules) | 
Heartbeat.ai\",\"isPartOf\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/#website\"},\"primaryImageOfPage\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#primaryimage\"},\"image\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png\",\"datePublished\":\"2026-02-01T18:23:09+00:00\",\"dateModified\":\"2026-02-27T19:28:40+00:00\",\"description\":\"Normalize NPI to 10 digits, dedupe with fallback keys (state license, name+city), choose a survivor by recency, and keep a duplicate audit log before outreach or enrichment.\",\"breadcrumb\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#primaryimage\",\"url\":\"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png\",\"contentUrl\":\"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png\",\"width\":1024,\"height\":1024},{\"@type\":\"BreadcrumbList\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/heartbeat.ai\/healthcare\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Dedupe provider list by NPI: recruiter-proof SOP + CSV 
rules\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/#website\",\"url\":\"http:\/\/heartbeat.ai\/resources\/\",\"name\":\"Heartbeat.ai\",\"description\":\"\",\"publisher\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/heartbeat.ai\/resources\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/#organization\",\"name\":\"Heartbeat.ai\",\"url\":\"http:\/\/heartbeat.ai\/resources\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2021\/04\/Heartbeat.ai-logo.png\",\"contentUrl\":\"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2021\/04\/Heartbeat.ai-logo.png\",\"width\":704,\"height\":126,\"caption\":\"Heartbeat.ai\"},\"image\":{\"@id\":\"http:\/\/heartbeat.ai\/resources\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/#\/schema\/person\/7b323ddce9b211907423482e2f9db173\",\"name\":\"Ben Argeband\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/heartbeat.ai\/resources\/#\/schema\/person\/image\/\",\"url\":\"http:\/\/0.gravatar.com\/avatar\/6356f96884d5a313d758128b3d9aaef7?s=96&d=mm&r=g\",\"contentUrl\":\"http:\/\/0.gravatar.com\/avatar\/6356f96884d5a313d758128b3d9aaef7?s=96&d=mm&r=g\",\"caption\":\"Ben Argeband\"},\"url\":\"http:\/\/heartbeat.ai\/resources\/author\/ben-argeband\/\"}]}<\/script>\r\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Dedupe provider list by NPI (SOP + CSV rules) | Heartbeat.ai","description":"Normalize NPI to 10 digits, dedupe with fallback keys (state license, name+city), choose a survivor by recency, and keep a duplicate audit log before outreach or enrichment.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/","og_locale":"en_US","og_type":"article","og_title":"Dedupe provider list by NPI (SOP + CSV rules) | Heartbeat.ai","og_description":"Normalize NPI to 10 digits, dedupe with fallback keys (state license, name+city), choose a survivor by recency, and keep a duplicate audit log before outreach or enrichment.","og_url":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/","og_site_name":"Heartbeat.ai","article_published_time":"2026-02-01T18:23:09+00:00","article_modified_time":"2026-02-27T19:28:40+00:00","og_image":[{"width":1024,"height":1024,"url":"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png","type":"image\/png"}],"author":"Ben Argeband","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Ben Argeband","Est. 
reading time":"14 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#article","isPartOf":{"@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/"},"author":{"name":"Ben Argeband","@id":"http:\/\/heartbeat.ai\/resources\/#\/schema\/person\/7b323ddce9b211907423482e2f9db173"},"headline":"Dedupe provider list by NPI: recruiter-proof SOP + CSV rules","datePublished":"2026-02-01T18:23:09+00:00","dateModified":"2026-02-27T19:28:40+00:00","mainEntityOfPage":{"@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/"},"wordCount":2695,"commentCount":0,"publisher":{"@id":"http:\/\/heartbeat.ai\/resources\/#organization"},"image":{"@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#primaryimage"},"thumbnailUrl":"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png","articleSection":["News"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#respond"]}]},{"@type":"WebPage","@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/","url":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/","name":"Dedupe provider list by NPI (SOP + CSV rules) | 
Heartbeat.ai","isPartOf":{"@id":"http:\/\/heartbeat.ai\/resources\/#website"},"primaryImageOfPage":{"@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#primaryimage"},"image":{"@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#primaryimage"},"thumbnailUrl":"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png","datePublished":"2026-02-01T18:23:09+00:00","dateModified":"2026-02-27T19:28:40+00:00","description":"Normalize NPI to 10 digits, dedupe with fallback keys (state license, name+city), choose a survivor by recency, and keep a duplicate audit log before outreach or enrichment.","breadcrumb":{"@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#primaryimage","url":"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png","contentUrl":"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2026\/02\/how-to-dedupe-a-provider-list-npi-6ca39603.png","width":1024,"height":1024},{"@type":"BreadcrumbList","@id":"http:\/\/heartbeat.ai\/resources\/provider-contact-data\/how-to-dedupe-a-provider-list-npi\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/heartbeat.ai\/healthcare\/"},{"@type":"ListItem","position":2,"name":"Dedupe provider list by NPI: recruiter-proof SOP + CSV 
rules"}]},{"@type":"WebSite","@id":"http:\/\/heartbeat.ai\/resources\/#website","url":"http:\/\/heartbeat.ai\/resources\/","name":"Heartbeat.ai","description":"","publisher":{"@id":"http:\/\/heartbeat.ai\/resources\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/heartbeat.ai\/resources\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"http:\/\/heartbeat.ai\/resources\/#organization","name":"Heartbeat.ai","url":"http:\/\/heartbeat.ai\/resources\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/heartbeat.ai\/resources\/#\/schema\/logo\/image\/","url":"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2021\/04\/Heartbeat.ai-logo.png","contentUrl":"https:\/\/hc.heartbeat.ai\/wp-content\/uploads\/2021\/04\/Heartbeat.ai-logo.png","width":704,"height":126,"caption":"Heartbeat.ai"},"image":{"@id":"http:\/\/heartbeat.ai\/resources\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"http:\/\/heartbeat.ai\/resources\/#\/schema\/person\/7b323ddce9b211907423482e2f9db173","name":"Ben Argeband","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/heartbeat.ai\/resources\/#\/schema\/person\/image\/","url":"http:\/\/0.gravatar.com\/avatar\/6356f96884d5a313d758128b3d9aaef7?s=96&d=mm&r=g","contentUrl":"http:\/\/0.gravatar.com\/avatar\/6356f96884d5a313d758128b3d9aaef7?s=96&d=mm&r=g","caption":"Ben 
Argeband"},"url":"http:\/\/heartbeat.ai\/resources\/author\/ben-argeband\/"}]}},"_links":{"self":[{"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/posts\/54145"}],"collection":[{"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/comments?post=54145"}],"version-history":[{"count":1,"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/posts\/54145\/revisions"}],"predecessor-version":[{"id":54449,"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/posts\/54145\/revisions\/54449"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/media\/54144"}],"wp:attachment":[{"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/media?parent=54145"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/categories?post=54145"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/heartbeat.ai\/resources\/wp-json\/wp\/v2\/tags?post=54145"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}