Merging contacts when linked to SFDC - wrong ID survives?


#1

I'll post the short version here, but it's still rather long and rambling. Please do ask for more details and I'll try to oblige!
We want to merge a lot of duplicate contacts in Salesforce, but we have the HubSpot sync enabled. A basic test of merging contacts in Salesforce often led to problems because the "wrong" HubSpot contact would survive. In some instances, the HubSpot contact whose email address matches a Salesforce contact carries the Salesforce ID of a different Salesforce contact, e.g.:

SFDC 12345 foo@foo
HS "ABC" foo@foo SFID 67890

SFDC 67890 bar@bar
HS "DEF" bar@bar SFID 12345
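
A rough sketch of how the crossed links in the example above could be spotted before merging anything. This is only an illustration: the `HubSpotContact` type, the `find_crossed_links` function and the in-memory lookup are hypothetical names, and fetching the data from either system is left out.

```python
from typing import NamedTuple


class HubSpotContact(NamedTuple):
    vid: str            # HubSpot contact ID (hypothetical field name)
    email: str
    salesforce_id: str  # value of the Salesforce ID property on the HubSpot contact


def find_crossed_links(sf_email_by_id, hs_contacts):
    """Return HubSpot contacts whose email differs from the email
    on the Salesforce record their Salesforce ID points at."""
    crossed = []
    for hs in hs_contacts:
        sf_email = sf_email_by_id.get(hs.salesforce_id)
        # Skip contacts whose Salesforce ID is unknown; flag email mismatches.
        if sf_email is not None and sf_email.lower() != hs.email.lower():
            crossed.append(hs)
    return crossed


# Data from the example above.
sf_email_by_id = {"12345": "foo@foo", "67890": "bar@bar"}
hs_contacts = [
    HubSpotContact("ABC", "foo@foo", "67890"),
    HubSpotContact("DEF", "bar@bar", "12345"),
]
crossed = find_crossed_links(sf_email_by_id, hs_contacts)
print([c.vid for c in crossed])  # ['ABC', 'DEF']
```

Both contacts are flagged here because each one's Salesforce ID points at the record holding the other email address.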

If we just merge directly in SF, we end up with contacts getting removed, depending on the specifics. My theory is that we really need to merge on both sides, so that the "Salesforce ID" on the HubSpot contact remains the correct one for the surviving record in Salesforce.
So, I wrote a simple command-line tool that takes in a CSV of "primary/secondary" Salesforce IDs. For each pair, it looks up the related HubSpot IDs and uses the API to run the merge command. So in the example above, if I wanted to merge 67890 into 12345 so that 12345 survived, my tool would work out that in HubSpot we need to merge ABC into DEF so that DEF survives (as it has the 12345 Salesforce ID property).
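
To make the lookup step concrete, here is a minimal sketch of the planning logic only. It is not the actual tool: `plan_hubspot_merges` and the `hs_id_by_sf_id` mapping are hypothetical names, the CSV layout (primary first, secondary second) is an assumption, and the HubSpot API calls that fetch the IDs and run the merge are deliberately left out.

```python
import csv
import io


def plan_hubspot_merges(csv_text, hs_id_by_sf_id):
    """For each (primary, secondary) Salesforce ID pair in the CSV,
    return a (hubspot_survivor, hubspot_to_merge) tuple."""
    plan = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # blank or malformed line
        primary_sf, secondary_sf = row[0].strip(), row[1].strip()
        # The survivor is the HubSpot contact holding the PRIMARY Salesforce ID.
        survivor = hs_id_by_sf_id.get(primary_sf)
        loser = hs_id_by_sf_id.get(secondary_sf)
        if survivor is None or loser is None or survivor == loser:
            continue  # no safe merge can be planned for this pair
        plan.append((survivor, loser))
    return plan


# Example from above: merge 67890 into 12345, so 12345 survives.
hs_id_by_sf_id = {"12345": "DEF", "67890": "ABC"}
plan = plan_hubspot_merges("12345,67890\n", hs_id_by_sf_id)
print(plan)  # [('DEF', 'ABC')]
```

Pairs where either HubSpot contact cannot be found are skipped rather than guessed at, so they can be reviewed by hand.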

What I'm finding, though, is that after the merge (I turn off the sync, merge in HubSpot, then merge in Salesforce), the HubSpot contact generally ends up with the wrong Salesforce ID (67890 in this case). When the sync next runs after being re-enabled, the contact is deleted because that ID no longer exists in Salesforce.

The documentation wasn't too clear on merging. It says that in general the most recent property survives, but it also says that when linked to Salesforce, the "primary" is the one that keeps syncing. That suggests to me that the Salesforce ID of the primary (surviving) contact is the one that should stay. But that's not what seems to happen.

Ideas? If there's another way people have undertaken mass merges when also linked to SFDC, please shout.


#2

Hi @jmb,

It's tough to really say anything definitive without taking a closer look at specific contacts, but is it possible that when you identify & merge the HubSpot contacts, you're selecting the contact with the 'primary' SalesforceID as the 'secondary' contact in HubSpot? That might explain the strange behavior.

In general you're right, the 'primary' contact will keep syncing. But if HubSpot doesn't detect a contact with the corresponding SalesforceID in Salesforce, then the sync will break. Depending on your settings, the contact could also be deleted in HubSpot. This sounds like it might be the case here.

I would recommend reaching out to Support so that you can dive into this with a specialist. It's difficult to troubleshoot something like this back-and-forth over a forum, and it sounds like the actual API related part isn't the root of the issue. You can reach Support by clicking the 'Help' widget in your portal or by calling 1 888-482-7768 x3.


#3

Thanks David. I'd already contacted support, but the agent suggested I post here so someone could take a look! (I realise that without specifics it's hard to diagnose.)
My first thought was also that I was picking the "wrong" contacts, but after some debugging it looked okay (I ran the tool without actually executing the merge and verified the properties).
I'll do a bit more digging around and see if I can get a slightly clearer view of what's happening. I've turned off delete in the sync settings for now, so hopefully we can at least resync the contacts somehow.


#4

OK, I've done a couple more tests today - in one case the merge appeared to work as I expected, and the surviving linked contact was the right one. In the second case it failed, because the Salesforce ID from the losing contact is the one that survived. I'm at a loss to explain the differences though.
Some output containing the ID's (I've obfuscated the email addresses)

This one worked:
Data In:    Salesforce ID's     Email in Salesforce     SF link back to Hubspot
Master      003b000000YsOrWAAV  abc@example.com         8644276
Non-Master  003b000000fZGTRAA4  def@example2.com        5476166

Hubspot ID's retrieved from ContactsAPI > HubspotAPI:   SF ID before merging
Master      8644276             abc@example.com         003b000000YsOrWAAV
Non-Master  5476166             def@example2.com        003b000000fZGTRAA4
            
After merge:            SFID
Remains 8644276         003b000000YsOrWAAV
Link is intact as correct ID from Master is still in Hubspot.

This one did not work:
Data In:    Salesforce ID's     Email in Salesforce     SF link back to Hubspot
Master      003b000000YsKiyAAF  123@example             6440868
Non-Master  003b000000Ys0IIAAZ  def@example2            156068530

Hubspot ID's retrieved from ContactsAPI > HubspotAPI jump:   SF ID before merging
Master      6440868             123@example             003b000000YsKiyAAF
Non-Master  156068530           def@example2            003b000000Ys0IIAAZ

After merge:            SFID
Remains 6440868         003b000000Ys0IIAAZ

I don't see any obvious difference here. The email addresses in SF and HS match and the IDs all match, but in the second case the surviving Salesforce ID on the HubSpot contact is the wrong one.


#5

OK, I've added a bit more debug logging. In the first case the merge worked as expected, with the Salesforce ID on the surviving contact being that of the "Master" in Salesforce.
In the second case, it didn't work. The Salesforce ID on the surviving contact was of the "non-master", so once the merge in Salesforce was done, the contact sync was broken.
Some output (with email addresses obfuscated):

Worked:

Data In:    Salesforce ID's     Email in Salesforce     SF link back to Hubspot
Master      003b000000YsOrWAAV  abc@example             8644276
Non-Master  003b000000fZGTRAA4  def@example2            5476166

Hubspot ID's retrieved from ContactsAPI > HubspotAPI:   SF ID before merging
Master      8644276             abc@example             003b000000YsOrWAAV
Non-Master  5476166             def@example2            003b000000fZGTRAA4
After merge:            SFID
Remains 8644276         003b000000YsOrWAAV
Link is intact as correct ID from Master is still in Hubspot.

Did not work:

Data In:    Salesforce ID's     Email in Salesforce        SF link back to Hubspot
Master      003b000000YsKiyAAF  123@example                6440868
Non-Master  003b000000Ys0IIAAZ  def@example2               156068530

Hubspot ID's retrieved from ContactsAPI > HubspotAPI:      SF ID before merging
Master      6440868     123@example                        003b000000YsKiyAAF
Non-Master  156068530   def@example2                       003b000000Ys0IIAAZ
            
After merge:            SFID
Remains 6440868         003b000000Ys0IIAAZ
Did not work as Salesforce ID remaining is the one that disappeared in Salesforce when the merge was done there.
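
For anyone wanting to run the same check over a larger batch, the comparison above boils down to something like the sketch below. The `check_merge` function and its return labels are made up for illustration; it only compares IDs that have already been read back from HubSpot after the merge and does not call any API.

```python
def check_merge(master_sf_id, non_master_sf_id, surviving_hs_sf_id):
    """Classify the Salesforce ID left on the surviving HubSpot contact."""
    if surviving_hs_sf_id == master_sf_id:
        return "ok"        # link intact: the master's ID survived
    if surviving_hs_sf_id == non_master_sf_id:
        return "wrong-id"  # the losing record's ID survived; sync will break
    return "unknown-id"    # neither ID; needs a manual look


# The two cases from the output above.
worked = check_merge("003b000000YsOrWAAV", "003b000000fZGTRAA4", "003b000000YsOrWAAV")
failed = check_merge("003b000000YsKiyAAF", "003b000000Ys0IIAAZ", "003b000000Ys0IIAAZ")
print(worked, failed)  # ok wrong-id
```

Running this before doing the Salesforce-side merge would at least show which pairs are about to break, even if it doesn't explain why.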

#6

I don't quite have all the pieces, but as a general rule, whichever record is selected as the master in Salesforce is the "winning" record, and those changes sync back to HubSpot. When emails are the same, this shouldn't create any issues.

However, changing the email also changes the association on the HubSpot contact to the Salesforce record. If you had HubSpot contact X syncing with Salesforce record Y, but you edit Salesforce record Z with a duplicate email address, the HubSpot contact is updated with record Z's Salesforce ID after the sync.

It sounds like you're merging records with different email addresses in Salesforce, and those two separate behaviors are probably conflicting here, for reasons I'm not completely sure how to articulate. I'm not really sure how to proceed, but are those behaviors consistent with your tests? If so, the practical solution (undesirable as it is) may be to manually edit the losing record's email address first, then merge - or accomplish something like that with Apex.


#7

Thanks. So, it's true that all our email addresses are different. We don't allow duplicate email addresses in Salesforce, but we do get a lot of people ending up in there with a couple of different versions (you know, bob.smith@acme and robert.smith@acme, both being the same contact), and it's these sorts of things we are aiming to merge.
In hubspot, we'll have the same contacts, and they seem to match up in the two examples I gave (i.e. bob.smith maps to bob.smith and robert.smith maps to robert.smith) - it just seems that it'll randomly decide to keep robert instead of bob even though bob was the master?
I'm rather unsure how to proceed with it - if I could spot the pattern where it picks the wrong one, I'd code around it, but I'm not able to figure that out :confused:


#8

Yeah, I understand where you're coming from. The problem is that effectively two contacts are updated in the same poll job during the merge. In addition to one contact being deleted, the email address (and the association to a Salesforce record) changes in the same job, and I'm not sure which takes precedence, either.

As a test (whether or not it's practical), what happens when you change the email address on the losing record to match the one on the master record, then merge?


#9

You mean on the HubSpot side, right? I can try that, yes (I can't try it in Salesforce; as I said, we're configured in such a way as to not allow duplicates).
Thanks!


#10

Apologies, I glossed right over that. I was speaking about a Salesforce-side edit. I'm not exactly sure how to proceed. Is removing the email from the losing record, prior to the merge, viable for your config? I have no idea if that would standardize the behavior, but it's the only thing I could think of to try next (until the default behavior for this use case is clarified).