The need for deduping non-parsed candidate profiles is often identified when using a resume parsing tool like DaXtra or any of the other resume parsing apps on the AppExchange.
Although these tools are great at extracting data from the submitted CVs, you will eventually get candidate profiles where the data isn’t captured 100%:
- Some fields (like Last Name) will be populated with default values (because they are mandatory).
- Some fields (like Email) may be populated with unique, but still default, values.
- Some fields (like Address) may have data parsed incorrectly, e.g. the state is put into the Country field.
- And other fields may be empty.
The Candidates created as a result of this parsing may be duplicates of candidates you already have in your database, but because of the “bad” data it may be difficult to refine matching rules that will catch them all.
When you deduplicate your candidate records, the placeholder values on these non-parsed Candidates may cause false duplicates to be identified. You therefore need to be very careful in your review and merge process, so that you do NOT merge candidates which aren’t duplicates after all.
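A quick way to see how many of these records you have is to flag them by their placeholder values. The sketch below is a minimal Python illustration, assuming the placeholders used in the filter examples in this article (emails starting with notparsed, a first name of N/A, last names starting with Not Parsed). The dictionary keys and the function name are hypothetical, and your parser may use different defaults.

```python
def non_parsed_fields(record):
    """Return the names of the fields that hold parser placeholder values."""
    flagged = []
    email = (record.get("Email") or "").strip().lower()
    first = (record.get("FirstName") or "").strip().lower()
    last = (record.get("LastName") or "").strip().lower()
    if email.startswith("notparsed"):
        flagged.append("Email")
    if first == "n/a":
        flagged.append("FirstName")
    if last.startswith("not parsed"):
        flagged.append("LastName")
    return flagged


# Made-up sample records, for illustration only.
candidates = [
    {"FirstName": "Anna", "LastName": "Smith", "Email": "anna@example.com"},
    {"FirstName": "N/A", "LastName": "Not Parsed 0042", "Email": "anna@example.com"},
    {"FirstName": "Anna", "LastName": "Smith", "Email": "notparsed0043@example.com"},
]

for c in candidates:
    print(c["LastName"], non_parsed_fields(c))
```

Records with an empty list are fully parsed; the others are the ones that need the special treatment described in the rest of this article.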
Prepare your matching process
In performing any dedupe of Salesforce records, there are 3 dimensions to take into consideration:
- How do you identify the duplicates: Same Email, Same Name, Same Phone, Same Address, or a combination of these?
- If 2 records are identified as duplicates, which one do you want as the new Master (survivor)? What is the logic that defines the selection of the Master?
- When you merge the records, are there cases where the Dupe holds values which you want to keep? E.g. if the Score or the Status is better on the Dupe, how can you make sure that you keep this particular field value, while keeping everything else from the Master?
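The third dimension can be pictured as a merge in which the Master wins by default, except for a short list of fields where the better value is kept. The Python sketch below is only an illustration: the status ranking, the field names and the function are assumptions, and it is not how Salesforce or DataTrim Dupe Alerts perform a merge.

```python
# Hypothetical ranking: a higher number means a "better" status.
STATUS_RANK = {"New Record": 0, "Inactive": 1, "Available": 2, "Placed": 3}


def merge_with_overrides(master, dupe):
    """Keep every value from the master, except Status and Score,
    where the better of the two values survives."""
    merged = dict(master)
    if STATUS_RANK.get(dupe.get("Status"), -1) > STATUS_RANK.get(master.get("Status"), -1):
        merged["Status"] = dupe["Status"]
    if (dupe.get("Score") or 0) > (master.get("Score") or 0):
        merged["Score"] = dupe["Score"]
    return merged


# Made-up sample records, for illustration only.
master = {"Name": "Anna Smith", "Status": "Inactive", "Score": 40}
dupe = {"Name": "A. Smith", "Status": "Available", "Score": 25}

print(merge_with_overrides(master, dupe))
```

Here the Name and the Score come from the Master, while the Status is taken from the Dupe because it ranks higher.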
To identify and create the most effective and efficient process and plan, you will need to do a little analysis on your records.
Which fields would you want to incorporate in the special merge rules: Status fields, Score fields, Ownership etc.?
How many records do you have with the different key-field values, e.g. Active Candidates, unqualified Candidates etc.? How many Customer records do you have versus Prospect records?
Once you have an idea of the volumes, you can start deciding which records to prioritize. You might want to clean up your Active Candidates before the Inactive ones, which in turn could be cleaned up before the “non-parsed”.
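If you work from an export of your records, the volume analysis can be as simple as counting records per key-field value. The following Python sketch assumes a list of exported records with a hypothetical Status key; inside Salesforce, a report grouped by the same field would give you the same numbers.

```python
from collections import Counter


def count_by(records, field):
    """Count records per value of the given field; blanks are grouped together."""
    return Counter(r.get(field) or "(blank)" for r in records)


# Made-up sample export, for illustration only.
records = [
    {"Id": "1", "Status": "Active"},
    {"Id": "2", "Status": "Active"},
    {"Id": "3", "Status": "Inactive"},
    {"Id": "4", "Status": None},
]

counts = count_by(records, "Status")
for value, number in counts.most_common():
    print(value, number)
```

The segments with the highest counts, or the highest business value, are natural candidates to clean up first.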
When you are ready to start
With the numbers in mind, you can now work with 2 strategies:
- Isolate and create segments of data where you know that you can treat the duplicates the same way.
a. Active Candidates matched against Inactive Candidates, letting the Active Candidate become the Master.
b. Black-listed Candidates should always be the Master, but if matched against an Active Candidate, you might want to let the recruiters follow up on these cases.
c. Prospect records may be matched against Customer records, and where a match is found the Customer record will always be the Master.
- Consider how you can formulate logic (survivorship rules) which, by comparing the 2 records, can provide an outcome: Keep A or Keep B.
By combining these 2 strategies you can start performing your deduplication and make your way through the dupes with priority and get rid of the large volume or most business-critical duplicates first, and leave the exceptional cases to the end for manual treatment.
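A survivorship rule for the candidate segments above can be sketched as a small decision function. This is an illustration in Python under stated assumptions: the status labels (Active, Inactive, Black-listed), the Status key and the extra Review outcome for manual follow-up are all hypothetical names, not DataTrim settings.

```python
def decide(a, b):
    """Compare two matched candidate records and return
    'Keep A', 'Keep B' or 'Review' (manual follow-up)."""
    sa, sb = a["Status"], b["Status"]
    # Black-listed matched against Active: let a recruiter follow up.
    if {sa, sb} == {"Black-listed", "Active"}:
        return "Review"
    # Otherwise a black-listed record is always the master.
    if sa == "Black-listed" and sb != "Black-listed":
        return "Keep A"
    if sb == "Black-listed" and sa != "Black-listed":
        return "Keep B"
    # Active beats Inactive.
    if sa == "Active" and sb == "Inactive":
        return "Keep A"
    if sb == "Active" and sa == "Inactive":
        return "Keep B"
    # No rule applies: leave the pair for manual treatment.
    return "Review"


print(decide({"Status": "Active"}, {"Status": "Inactive"}))
print(decide({"Status": "Inactive"}, {"Status": "Black-listed"}))
print(decide({"Status": "Black-listed"}, {"Status": "Active"}))
```

Anything the rules cannot decide falls through to Review, which matches the idea of leaving the exceptional cases to the end.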
Dedupe non-parsed candidates with DataTrim Dupe Alerts
In DataTrim Dupe Alerts we use filters to segment the data for each matching process (called an Alert).
Filters must be written in SOQL syntax.
To check that you got the single quotes, parentheses and operators right, use the Check Filter button to validate the syntax.
It will also give you the number of records which will be compared, which is a good validation point to see if you have included all the records you expected.
Note: if the count shows 20.000+, go to the setup and increase or remove the setting SOQL Count Query (Max.) to get an accurate number.
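Typical filter mistakes are curly quotes pasted from a document or web page, and parentheses that don’t balance. The Python sketch below is a hypothetical pre-check you could run on a filter before pasting it in; it is an assumption-laden helper, it does not validate field names or operators, it ignores escaped quotes, and it does not replace the Check Filter button.

```python
def precheck_filter(soql_filter):
    """Return a list of obvious problems in a filter string (empty = none found)."""
    problems = []
    # SOQL string literals use straight single quotes.
    if any(ch in soql_filter for ch in "\u2018\u2019\u201c\u201d"):
        problems.append("curly quotes found; use straight single quotes")
    depth = 0
    in_string = False
    for ch in soql_filter:
        if ch == "'":
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    break  # a closing parenthesis without an opening one
    if depth != 0:
        problems.append("unbalanced parentheses")
    if in_string:
        problems.append("unclosed single quote")
    return problems


good = "(NOT Email LIKE 'notparsed%') AND (NOT (FirstName = 'N/A' OR LastName LIKE 'Not Parsed%'))"
bad = "(NOT Email LIKE \u2018notparsed%\u2019"

print(precheck_filter(good))
print(precheck_filter(bad))
```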
Alert examples for deduping non-parsed candidates
Alert 1: Exact Same Email
Setting | Note |
Match Level: Medium | Normally we don’t recommend setting the Match Level lower than Tight, but when you combine this with the ExactMatchField (see below) you are reducing the number of potential duplicates being suggested. |
Filter (Secondary): | (NOT Email LIKE 'notparsed%') AND (NOT (FirstName = 'N/A' OR LastName LIKE 'Not Parsed%')) |
Filter (Master): | (NOT Email LIKE 'notparsed%') AND (FirstName = 'N/A' OR LastName LIKE 'Not Parsed%') |
Advanced Parameter: Candidates=Yes ExactMatchField=Email | Use this Advanced Parameter to apply the classification rules for candidate records to get a more accurate classification of the potential duplicates. By default, DataTrim Dupe Alerts will assume that the contact records are business contacts. |
See Screenshot from DataTrim Dupe Alerts |
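To see what this alert compares, it can help to emulate the two filters on a handful of records. The Python sketch below is a simplified illustration: it assumes that ExactMatchField=Email means only pairs with an identical Email are proposed, it ignores upper and lower case (as SOQL LIKE does), and it leaves out the fuzzy matching that DataTrim Dupe Alerts actually performs. The function names and sample records are hypothetical.

```python
def email_parsed(r):
    """Mirrors: NOT Email LIKE 'notparsed%'"""
    return not r["Email"].lower().startswith("notparsed")


def name_parsed(r):
    """Mirrors: NOT (FirstName = 'N/A' OR LastName LIKE 'Not Parsed%')"""
    return not (r["FirstName"].lower() == "n/a"
                or r["LastName"].lower().startswith("not parsed"))


def alert1_pairs(records):
    """Pair non-parsed names (Master filter) with parsed names (Secondary filter)
    when both have a real, identical email."""
    secondary = [r for r in records if email_parsed(r) and name_parsed(r)]
    master = [r for r in records if email_parsed(r) and not name_parsed(r)]
    return [(m["Id"], s["Id"])
            for m in master for s in secondary
            if m["Email"].lower() == s["Email"].lower()]


# Made-up sample records, for illustration only.
records = [
    {"Id": "A", "FirstName": "Anna", "LastName": "Smith", "Email": "anna@example.com"},
    {"Id": "B", "FirstName": "N/A", "LastName": "Not Parsed 17", "Email": "Anna@example.com"},
    {"Id": "C", "FirstName": "N/A", "LastName": "Not Parsed 18", "Email": "notparsed18@example.com"},
    {"Id": "D", "FirstName": "Bo", "LastName": "Lee", "Email": "bo@example.com"},
]

print(alert1_pairs(records))
```

Record C is left out because its email is a placeholder, and record D has no counterpart with the same email.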
Alert 2: Same Name, No Email
Setting | Note |
Match Level: Tight | Must use Tight, as the emails won’t match. |
Filter (Secondary): | (NOT Email LIKE 'notparsed%') AND (NOT (FirstName = 'N/A' OR LastName LIKE 'Not Parsed%')) |
Filter (Master): | (Email LIKE 'notparsed%') AND (NOT (FirstName = 'N/A' OR LastName LIKE 'Not Parsed%')) |
Advanced Parameter: Candidates=Yes | Use this Advanced Parameter to apply the classification rules for candidate records to get a more accurate classification of the potential duplicates. By default, DataTrim Dupe Alerts will assume that the contact records are business contacts. |
See Screenshot from DataTrim Dupe Alerts (Including the Status filters) |
Alert 3a: Standard – Same Email
Setting | Note |
Match Level: Medium | Normally we don’t recommend setting the Match Level lower than Tight, but when you combine this with the ExactMatchField (see below) you are reducing the number of potential duplicates being suggested. |
Filter (Secondary): | (NOT Email LIKE 'notparsed%') |
Filter (Master): | (NOT Email LIKE 'notparsed%') |
Advanced Parameter: Candidates=Yes ExactMatchField=Email | Use this Advanced Parameter to apply the classification rules for candidate records to get a more accurate classification of the potential duplicates. By default, DataTrim Dupe Alerts will assume that the contact records are business contacts. |
See Screenshot from DataTrim Dupe Alerts |
Alert 3b: Standard
Setting | Note |
Match Level: SuperTight or Tight | For an initial run, use SuperTight. Once the dupes are cleaned up, change the Match Level to Tight and run it again. |
Filter (Secondary): | (NOT Email LIKE 'notparsed%') |
Filter (Master): | (NOT Email LIKE 'notparsed%') |
Advanced Parameter: Candidates=Yes | Use this Advanced Parameter to apply the classification rules for candidate records to get a more accurate classification of the potential duplicates. By default, DataTrim Dupe Alerts will assume that the contact records are business contacts. |
Note that potential duplicates found by this alert will include those found by the alert above (Standard – Same Email). So don’t run this one until the duplicates from the alert above have been resolved.
Merge Recommendation – Formula field
Formula text: (for TargetRecruit customers, replace Candidate_Status__c with the TargetRecruit field AVTRRT__Candidate_Status__c)
IF( AND(ISPICKVAL(TRIMDA__Contact1__r.Candidate_Status__c, "Placed"), NOT(ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Placed"))), "Merge Keep Survivor",
IF( AND(ISPICKVAL(TRIMDA__Contact1__r.Candidate_Status__c, "Available"), NOT(OR(ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Placed"), ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Available")))), "Merge Keep Survivor",
IF( AND(ISPICKVAL(TRIMDA__Contact1__r.Candidate_Status__c, "Inactive"), NOT(OR(ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Placed"), ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Available"), ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Inactive")))), "Merge Keep Survivor",
IF( AND(ISPICKVAL(TRIMDA__Contact1__r.Candidate_Status__c, "New Record"), NOT(ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "New Record"))), "Merge Keep Dupe",
IF( AND(ISPICKVAL(TRIMDA__Contact1__r.Candidate_Status__c, "Inactive"), OR(ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Placed"), ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Available"))), "Merge Keep Dupe",
IF( AND(ISPICKVAL(TRIMDA__Contact1__r.Candidate_Status__c, "Available"), ISPICKVAL(TRIMDA__Contact2__r.Candidate_Status__c, "Placed")), "Merge Keep Dupe",
"Keep Most Recently Updated"))))))
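Before relying on the formula, it can be useful to check its outcome for each combination of statuses. The Python sketch below restates the same rules, in the same order, as a plain function. It reads Contact1 as the proposed survivor and Contact2 as the dupe, which is what the outcome labels suggest but is an assumption, and the function name is hypothetical.

```python
def merge_recommendation(status1, status2):
    """status1: Candidate Status of Contact1, status2: Candidate Status of Contact2."""
    if status1 == "Placed" and status2 != "Placed":
        return "Merge Keep Survivor"
    if status1 == "Available" and status2 not in ("Placed", "Available"):
        return "Merge Keep Survivor"
    if status1 == "Inactive" and status2 not in ("Placed", "Available", "Inactive"):
        return "Merge Keep Survivor"
    if status1 == "New Record" and status2 != "New Record":
        return "Merge Keep Dupe"
    if status1 == "Inactive" and status2 in ("Placed", "Available"):
        return "Merge Keep Dupe"
    if status1 == "Available" and status2 == "Placed":
        return "Merge Keep Dupe"
    return "Keep Most Recently Updated"


statuses = ["Placed", "Available", "Inactive", "New Record"]
for s1 in statuses:
    for s2 in statuses:
        print(s1, "<->", s2, ":", merge_recommendation(s1, s2))
```

In effect the record with the better status is kept, in the order Placed, Available, Inactive, New Record, and two records with the same status fall back to Keep Most Recently Updated.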
Candidate Status – Formula Field
Formula text:
TEXT(TRIMDA__Contact1__r.Candidate_Status__c) & " <-> " & TEXT(TRIMDA__Contact2__r.Candidate_Status__c)
or for TargetRecruit customers:
TEXT(TRIMDA__Contact1__r.AVTRRT__Candidate_Status__c) & " <-> " & TEXT(TRIMDA__Contact2__r.AVTRRT__Candidate_Status__c)
Learn More
DataTrim Dupe Alerts – Mastering Alert Filters
Deduplication of Candidate Records – Helicopter View
Upgrade to the latest version directly from the AppExchange
Contact Us for more information
Don’t hesitate to reach out to our support team if you have any questions about creating and setting up these alerts and formulas.