GET recently updated contacts is bugged!


#1

We are using the “GET /contacts/v1/lists/recently_updated/contacts/recent” API (https://developers.hubspot.com/docs/methods/contacts/get_recently_updated_contacts). The documentation says: “The response is sorted in descending order by last modified date; the most recently modified record is returned first.” We are not sure whether this sorting is honoured within a single returned JSON, but we certainly expect every record in a JSON to have a last modified date less than or equal to (<=) the earliest last modified date (smallest integer) in the immediately preceding JSON.
We send the first command without a “vid-offset” or “time-offset”, to get the first JSON in the sequence (A). We use the “vid-offset” and “time-offset” from A to get the second JSON (B).
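A minimal sketch of the paging just described, for anyone trying to reproduce it. The `fetch_page` callable is hypothetical (it stands in for whatever HTTP client performs the GET and returns the parsed JSON), and the request parameter names `vidOffset` and `timeOffset` are an assumption that should be checked against the documentation linked above.

```python
from typing import Callable, Dict, Iterator

Page = Dict[str, object]


def iter_recent_pages(fetch_page: Callable[[Dict[str, int]], Page],
                      max_pages: int = 1000) -> Iterator[Page]:
    """Yield pages, feeding each page's offsets into the next request."""
    params: Dict[str, int] = {}  # first request carries no offsets (JSON A)
    for _ in range(max_pages):  # hard cap so a looping sequence cannot run forever
        page = fetch_page(dict(params))
        yield page
        if not page.get("has-more"):
            return
        # Assumed query parameter names; the response fields are hyphenated.
        params = {
            "vidOffset": int(page["vid-offset"]),
            "timeOffset": int(page["time-offset"]),
        }
```

Passing the fetcher in as a parameter keeps the loop testable against captured JSONs without calling the API.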
The problem is that we see a record in JSON B with a last modified date which is more recent (greater) than the “time-offset” from A. We think this is a bug in the API. Can anyone help, please?

JSON A
(snip)
“lastmodifieddate”: {
“value”: “1505718420246”
(snip)
“lastmodifieddate”: {
“value”: “1505718412761”
(snip)
“has-more”: true,
“vid-offset”: 23071647,
“time-offset”: 1505718412761

JSON B
(snip)
“lastmodifieddate”: {
“value”: “1505718412760”
(snip)
“lastmodifieddate”: {
“value”: “1505718423337” <-- greater than the JSON A “time-offset” of 1505718412761
(snip)
“lastmodifieddate”: {
“value”: “1505718409409”
(snip)
“has-more”: true,
“vid-offset”: 12246912,
“time-offset”: 1505718409409


#2

Hi @kbr95

The recently updated endpoint can return multiple entries for the same contact record, depending on how often that record was updated, so in your example you might see the same record in both responses, if that record was updated twice.

The property values included for a record (including the value for lastmodifieddate) are the current values of those properties, not the values at the time the contact was modified. In your example, the second record in the results for JSON B would have been modified twice: it appears in JSON B because it was modified during the period covered by that set of results, but it shows the more recent lastmodifieddate value because it was modified again after that time-offset.

You can check the addedAt field for the record to see when it was modified for each entry it has in the recently updated list.
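
As a rough sketch of that check, the entries can be grouped by vid and each entry's addedAt compared with the lastmodifieddate it reports. The record layout used here (a top-level `vid` and `addedAt`, with `lastmodifieddate` nested under `properties`) is an assumption about the response shape, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def entries_by_vid(contacts: Iterable[dict]) -> Dict[int, List[dict]]:
    """Group list entries by vid and flag those modified again after being added."""
    grouped: Dict[int, List[dict]] = defaultdict(list)
    for contact in contacts:
        last_modified = int(contact["properties"]["lastmodifieddate"]["value"])
        added_at = int(contact["addedAt"])
        grouped[contact["vid"]].append({
            "addedAt": added_at,
            "lastmodifieddate": last_modified,
            # True when the record changed again after this list entry was created.
            "modified_again": last_modified > added_at,
        })
    return dict(grouped)
```

A vid with more than one entry, or an entry with `modified_again` set, would match the behaviour described above.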


#3

Many thanks for responding.

Our goal in this integration is to retrieve as little data as possible in order to maintain a reliable cache of records locally (once the system is primed). We have taken a few days to absorb your response and come up with the following points.

Point 1:
On some occasions when there are a lot of updates going on, the JSONs do not follow a descending time sequence.
{“contacts”:[…],“has-more”:true,“vid-offset”:23071647,“time-offset”:1505718412761}
{“contacts”:[…],“has-more”:true,“vid-offset”:12246912,“time-offset”:1505718409409}
{“contacts”:[…],“has-more”:true,“vid-offset”:24240744,“time-offset”:1505718408307}
{“contacts”:[…],“has-more”:true,“vid-offset”:24247948,“time-offset”:1505718407732}
{“contacts”:[…],“has-more”:true,“vid-offset”:28500204,“time-offset”:1505718424226}
{“contacts”:[…],“has-more”:true,“vid-offset”:23071647,“time-offset”:1505718412761}

The extract above shows six JSONs received in sequence. The fifth has a time-offset later than that of the fourth, and the sixth repeats the offsets of the first, which contradicts the descending order described in the documentation.
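
A small sketch of how such a sequence can be checked automatically, using the six offset pairs from the extract. The function name is hypothetical; it only looks at the “vid-offset” and “time-offset” fields of each page.

```python
from typing import Dict, List, Tuple


def check_offsets(pages: List[Dict[str, int]]) -> Tuple[List[int], List[int]]:
    """Return 0-based indexes of (out-of-order pages, pages repeating earlier offsets)."""
    out_of_order: List[int] = []
    repeated: List[int] = []
    seen = set()
    previous_time = None
    for index, page in enumerate(pages):
        key = (page["vid-offset"], page["time-offset"])
        if key in seen:
            repeated.append(index)  # same offsets as an earlier page: a loop
        seen.add(key)
        if previous_time is not None and page["time-offset"] > previous_time:
            out_of_order.append(index)  # time-offset went up instead of down
        previous_time = page["time-offset"]
    return out_of_order, repeated


pages = [
    {"vid-offset": 23071647, "time-offset": 1505718412761},
    {"vid-offset": 12246912, "time-offset": 1505718409409},
    {"vid-offset": 24240744, "time-offset": 1505718408307},
    {"vid-offset": 24247948, "time-offset": 1505718407732},
    {"vid-offset": 28500204, "time-offset": 1505718424226},
    {"vid-offset": 23071647, "time-offset": 1505718412761},
]
print(check_offsets(pages))  # ([4], [5]): fifth page out of order, sixth repeats the first
```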

“The first request to the endpoint returns the most recent records, and the following requests would contain the next most recent records. The time-offset value in the response is a timestamp that represents the time that the least recent record, in that set of records, was updated or added, so you can use that value to determine when you have all of the changes since the last time you checked for updates.”

Why are the JSONs out of sequence?

Point 2:
“The recently updated endpoint can return multiple entries for the same contact record, depending on how often that record was updated, so in your example you might see the same record in both responses, if that record was updated twice.”

We very rarely see duplicates (except as a result of the looping behaviour in Point 1). When we have seen them, a later download (30 minutes later) covering the same period does not retrieve the duplicates already seen. We are using the default formSubmissionMode. We stop retrieving as soon as possible and well before the 30-day limit.

Why did the duplicate vids either appear in the first place or disappear subsequently?

Point 3:
“The property values that are included for a record (including the value for lastmodifieddate) will be the current value of those properties, not the values at the time the contact was modified, so in your example, the second record in the results for JSON B would have been modified twice, so it appears in JSON B since it was modified during the period covered in that set of results, but it shows the more recent lastmodifieddate value since it was modified again after that time-offset.”

“You can check the addedAt field for the record to see when it was modified for each entry it has in the recently updated list.”

Having checked 6040 JSONs captured when duplicate vids appear (because of looping), spanning a two-week period, we see no cases where the addedAt and lastModifiedDate differ. This includes the original example, which was not the result of duplicate vids. That example did not exhibit the looping behaviour of Point 1, apparently because it was not the last Contact in the file.

Why are the addedAt/lastModifiedDate not in strict descending order?

Point 4:
This is the most important point, but it is informed by understanding the points above. We have experienced cases where an update does not appear in the recently_updated results but the change is visible using all_contacts. We have confirmed that, during a download run, the first record in the JSON was later than the lastModified date of the missing record and the last record in the final JSON was earlier than the missing record. So the run spans the time period covering the missing record.

Why do we not see the record in the recently_updated?
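
For reference, a sketch of the two checks behind this point: whether a download run's time window covers a given record, and which vids in a full listing are newer than the local cache. Both function names are hypothetical, and the inputs are assumed to be plain mappings from vid to lastmodifieddate in milliseconds, built from the recently_updated and all_contacts results respectively.

```python
from typing import Dict, List


def window_covers(newest_ms: int, oldest_ms: int, record_ms: int) -> bool:
    """True if record_ms lies between the oldest and newest timestamps of a run."""
    return oldest_ms <= record_ms <= newest_ms


def stale_vids(cache: Dict[int, int], full_listing: Dict[int, int]) -> List[int]:
    """vids that are absent from the cache or newer in the full listing."""
    return sorted(
        vid for vid, modified in full_listing.items()
        if cache.get(vid, -1) < modified
    )
```

Any vid returned by `stale_vids` whose timestamp also satisfies `window_covers` for a completed run is a record that the run should have picked up but did not.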

Any and all assistance most welcome.


#4

Any help with these issues would be most welcome.