Managing Duplicate Records in Salesforce

Which solution is best for your organization?

There’s a reason they call it “dirty data.” Inaccurate, incomplete, or inconsistent data can derail your fundraising. Duplicate records, in particular, can make a fundraiser’s job even harder. Consider these frustrating situations: 

  • The year-end fundraising appeal is ready to be mailed when you discover a problem. The Salesforce donor list that was updated and cleaned just a month ago now has a considerable number of duplicate records. 
  • You run a Salesforce donor report to build your donor solicitation plan for the coming year. Due to duplicate records, you need to upload the report into a worksheet. You spend hours massaging the data before you can get to the heart of the analysis. 
  • In a meeting to review a major donor’s giving history, she informs you that a number of her gifts are missing. You later discover that these were credited to a duplicate record.

Have you experienced these frustrations, or situations like them? Chances are, you have. When you moved over to Salesforce, you thought the Salesforce data management capabilities would solve your problems. What’s happening?

The reality is that while Salesforce offers standard duplicate management features, these tools can fall short. This is particularly true when dealing with complex, large, or rapidly growing datasets. So, what’s the best path forward to clean, duplicate-free data? 

In this article, we explore the strengths and limitations of Salesforce’s built-in duplicate management and how you can leverage additional strategies and tools to improve data accuracy, streamline workflows, and save valuable time.

Understanding Salesforce’s Duplicate Management: A Brief Overview

Salesforce provides two components for managing duplicates that work in conjunction: Matching Rules and Duplicate Rules.

  • Matching Rules define how records are compared to identify potential duplicates. Each rule lists the fields to compare between records, and each field can be set either to match exactly or to use fuzzy logic. Think of the exact match as true or false, black or white. Fuzzy logic allows for shades of truth. An exact match search can find duplicates when a name is exactly the same, like Jennifer Jones. With fuzzy logic, Salesforce can flag similar names, such as Jennifer Jones, Jenny Jones, and Jen Jones-Smith, as potential matches.
  • Duplicate Rules govern what happens after the duplicate is identified. They allow for specific actions, such as blocking the creation of duplicates, alerting the user of the potential duplicate, or reporting the issue for manual review and merging at a later time.
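
To make the difference between exact and fuzzy matching concrete, here is a minimal sketch in Python. It is not Salesforce's matching algorithm, which is configured in Setup rather than written as code. It uses the standard library's general-purpose string similarity, and the function names and the 0.6 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(name.lower().split())


def exact_match(a: str, b: str) -> bool:
    """True only when the normalized names are identical."""
    return normalize(a) == normalize(b)


def fuzzy_score(a: str, b: str) -> float:
    """Similarity between 0.0 (nothing in common) and 1.0 (identical)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_potential_duplicate(a: str, b: str, threshold: float = 0.6) -> bool:
    """Flag a pair for review when its score meets the assumed threshold."""
    return fuzzy_score(a, b) >= threshold
```

With this sketch, "Jennifer Jones" and "Jenny Jones" fail the exact match but are flagged by the fuzzy check, while an unrelated name is not. Raising the threshold produces fewer false duplicates but misses more real ones.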

While these tools can be effective in basic scenarios, there are limitations, especially as your database grows or when handling specific use cases, such as when there is inconsistent data entry or when you are integrating Salesforce with other applications – like event management, grassroots advocacy, or email marketing applications. 


When Salesforce Standard Duplicate Management Falls Short

1. Complex Data and Fuzzy Matching

Salesforce’s fuzzy matching capabilities, while useful, are not always smart enough to uncover duplicates. For example, industries with repetitive terms like “hospital” or “research institute,” or organizations with frequent use of acronyms (e.g., “NYU” vs. “New York University”), often face issues with fuzzy logic in the matching rules.  

In addition, Salesforce struggles with cross-field matching. This applies, for instance, when emails are stored in multiple fields and are not compared to one another (e.g. personal email cannot be matched against alternate email). 
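
As a rough illustration of what cross-field matching means, the sketch below treats two records as a potential match when any email on one record equals any email on the other, regardless of which field holds it. The field names are hypothetical, and this is plain Python over exported records, not a Salesforce feature.

```python
# Hypothetical field names; substitute the email fields your org actually uses.
EMAIL_FIELDS = ("personal_email", "work_email", "alternate_email")


def email_set(record: dict) -> set:
    """Collect every non-empty email on a record, normalized for comparison."""
    emails = set()
    for field in EMAIL_FIELDS:
        value = (record.get(field) or "").strip().lower()
        if value:
            emails.add(value)
    return emails


def shares_email(a: dict, b: dict) -> bool:
    """True when any email field on one record matches any field on the other."""
    return not email_set(a).isdisjoint(email_set(b))
```

A field-by-field comparison would miss a contact whose personal email on one record appears as the alternate email on another; comparing the combined sets catches it.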

2. Large Datasets and Increased Duplicate Frequency

As you might expect, when the volume of data in Salesforce increases, so does the likelihood of encountering duplicate records. Larger datasets make managing duplicates more challenging because of the sheer number of matches to review, and the number of false duplicates (records that look alike but belong to different people) rises as well. This means every time you run your duplicate review, you’ll need to remember which matches you’ve already ruled out and disregard them again (e.g. John Smith Sr. and John Smith Jr. are not duplicate contacts).

In these scenarios, more sophisticated tools – which allow you to disregard false duplicates permanently and offer the ability to set bulk rules for quick merging, whether manual or automatic – are essential. Depending on the tool being used, it also becomes easier to delegate responsibility for data cleaning, set specific permissions for staff, and restore data when errors are made during cleanup.
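
The idea of disregarding false duplicates permanently can be sketched as a stored list of dismissed pairs that is checked on every review run. This is a simplified illustration and not how any particular tool implements it; the record IDs and function names are made up.

```python
def pair_key(id_a: str, id_b: str) -> frozenset:
    """Order-independent key, so (A, B) and (B, A) count as the same pair."""
    return frozenset((id_a, id_b))


def dismiss(dismissed: set, id_a: str, id_b: str) -> None:
    """Record that a reviewer ruled this pair out as a false duplicate."""
    dismissed.add(pair_key(id_a, id_b))


def pending_review(candidates: list, dismissed: set) -> list:
    """Drop candidate pairs that were already ruled out in an earlier review."""
    return [pair for pair in candidates if pair_key(*pair) not in dismissed]
```

Once John Smith Sr. and John Smith Jr. have been dismissed, that pair no longer appears in later runs, even if the matching logic surfaces it again in the opposite order.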

3. Data Entry Challenges: Volunteer Data, Imports, and Middleware

For organizations that rely on volunteer data entry, bulk imports, or middleware (e.g. Zapier, MuleSoft, Cazoomi) to bring data into Salesforce, the frequency of duplicates often increases. Without strong controls in place and consistent data cleanup, this can lead to a chaotic and inefficient database, progressively rendering your data less useful.

These organizations have two points of intervention to consider: the point of entry and the point of staff review. Both are important, and there is a tradeoff between them. The less validation you enforce at the point of entry, the greater the time commitment at the point of staff review, and vice versa. The balance between these is an internal decision with no right or wrong answer, provided you do not reduce one without increasing the other, so that the integrity of your data is maintained.

In each case, since frequent data cleanup is an integral part of the workflow, automating some of the rules for merging, formatting, and data cleanup can save significant time and reduce the chance of human error.
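
As one example of what an automated merge rule might look like, the sketch below keeps every value on the surviving (master) record and fills only its blank fields from the duplicate. The rule itself, the function name, and the field names are assumptions for illustration; real tools let you configure which record and which field values win.

```python
BLANK = (None, "")


def merge_records(master: dict, duplicate: dict) -> dict:
    """Return a copy of master with its blank fields filled from duplicate."""
    merged = dict(master)  # copy, so the original master is left untouched
    for field, value in duplicate.items():
        if merged.get(field) in BLANK and value not in BLANK:
            merged[field] = value
    return merged
```

A rule like this is conservative: it never overwrites existing data on the master record, so conflicting values still need a person to decide which one is correct.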

How to Address These Challenges: Advanced Solutions for Duplicate Management

Given the limitations of Salesforce’s native tools, many organizations turn to third-party apps and advanced strategies to manage duplicates more effectively. Here are some common approaches:

1. Integrating More Sophisticated Data Tools at Data Entry

If your organization uses more error-prone data entry systems, you might start by reducing duplicates before they occur. While traditional validation tools in Salesforce are best used for staff users performing manual data entry, sophisticated solutions like Omatic provide enhanced data validation and can help prevent duplicates before they reach your system, even when utilizing several integrations and/or mass imports.

2. Leveraging Third-Party Apps for Duplicate Cleanup

For organizations looking for additional features and customization, there are various third-party applications that can help automate and streamline the data cleanup process. These range from freemium apps like DupeChecker to midrange solutions like Apsona to more robust solutions like Cloudingo, Plauti Duplicate Checker, and DataGroomr, which are designed to handle larger datasets and more complex duplicate management needs.

Each of these tools offers different feature levels, with annual costs for paid products ranging from just under $1k for basic applications to between $2k and $3k for medium-high cost applications.

3. Bulk Edit and Mass Data Cleanup

Beyond duplicate cleanup, organizations with complex data needs or error-prone entry systems may also need mass editing capabilities. Some third-party tools, such as Apsona, offer bulk editing features that allow administrators to clean up large numbers of records quickly and efficiently.

4. Continuous Monitoring and Training

The effectiveness of any data cleanup strategy depends not only on the tools used but also on how well the staff is trained to implement and maintain these processes. Regular training sessions and clear data entry protocols are essential for keeping data clean over time. Identifying a dedicated staff member to oversee data cleaning efforts can also help ensure consistency and accountability.

What’s Right for Your Organization

A graph of the scale of data management needs for a nonprofit using Salesforce, from standard Salesforce tools to medium-high cost applications

Next Generation Data Management

Salesforce is rapidly evolving, with ongoing improvements in tools for managing duplicate data and enhancing overall data quality. One of the most significant changes on the horizon is the increasing integration of artificial intelligence (AI) into the platform. For many nonprofits, this will incorporate predictive donor engagement and automated client support, both of which will rely heavily on high-quality, accurate data to be effective. By investing in strong data management practices now, your organization will be well-positioned to maximize your impact in the future.

Conclusion: Keeping Salesforce Data Clean at Scale

Maintaining a clean and accurate Salesforce database is an ongoing need, but it is well worth the effort. While Salesforce offers basic tools for duplicate management, more advanced solutions may be necessary to ensure data quality at scale.

By leveraging third-party apps, integrating sophisticated tools at the data entry point, and establishing clear data governance practices, you can significantly reduce the occurrence of duplicates, improve the overall efficiency of Salesforce, and maintain accurate and insightful data analytics.

Ultimately, the key is to regularly assess your organization’s unique data needs and implement the appropriate solutions that align with your long-term goals for data integrity.

The iMission team can help your nonprofit find a solution that helps you establish – and maintain – clean data within your Salesforce instance. Schedule a free consultation with one of our Salesforce Consultants today.
