400 Error - Invalid input JSON with special characters

batch
contacts

#1

Currently I’ve got a nightly update script that pushes our user information from our web dashboard to HubSpot in batches of 50. A cron job triggers it by calling an API endpoint I built in Node.js/Express. The first 7 batches go through fine, but the 8th is the first one with a Chinese character in the name field (accented European characters such as ïûá also cause the error), and that batch fails.

What I’m getting is a 400 Error:

{"status":"error","message":"Invalid input JSON on line 1, column 43763: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.ByteArrayInputStream@24478e9e; line: 1, column: 41836])\n at [Source: java.io.ByteArrayInputStream@24478e9e; line: 1, column: 43763]","correlationId":"42fd4f6c-1575-4786-a6de-5a5f6ed313e2","requestId":"2e1540b8fcd66bbe6c29b713e7860d07"}

The thing is, if I log the JSON string I’m sending to the console, it validates fine in JSONLint, and I can even send it via Postman and the update goes through without issue. I’m just wondering if anybody else has run into this, and if so, what they did to fix it.

I’ve tried encoding the characters as Unicode numeric references or HTML entities, and the requests go through, but the values show up in HubSpot as the literal text of those references/entities (e.g. 漢語拼音 or %E6%BC%A2%E8%AA%9E%E6%8B%BC%E9%9F%B3).

Any help would be appreciated, as our biggest markets are China and Russia.


#2

After hours of searching, I finally found the issue, staring me in the face.

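When I set my Content-Length header, I was using the .length property of the JSON string. Unfortunately, .length counts characters (UTF-16 code units), while Content-Length has to be the number of bytes in the body, and non-ASCII characters take two or more bytes each in UTF-8. My header was therefore too small, so the HubSpot API stopped reading before the end of the body and thought I was missing content.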

Setting the length using:

Buffer.byteLength(body)

worked a treat.
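
A minimal sketch of the difference, assuming the body is sent as UTF-8. The helper name `jsonHeaders` and the sample payload are hypothetical and are not taken from the original script or from the HubSpot API:

```javascript
// Hypothetical helper: builds request headers for a JSON body.
// Content-Length is the UTF-8 byte count, not the string's .length.
function jsonHeaders(body) {
  return {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(body, 'utf8'),
  };
}

// Illustrative payload only; the field name is an assumption.
const body = JSON.stringify({ name: '漢語拼音 ïûá' });

// .length counts UTF-16 code units; byteLength counts bytes on the wire.
console.log('characters:', body.length);
console.log('bytes:     ', Buffer.byteLength(body, 'utf8'));
console.log(jsonHeaders(body));
```

For an ASCII-only body the two numbers are equal, which would explain why the first seven batches went through.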


#3

@gamebenchjake Glad you figured it out! I’m going to take a stab at the why from my academic days. Node sends strings as UTF-8 by default, and in UTF-8 only code points up to 7F (0111 1111, the ASCII range) fit in a single byte. Anything from 80 upward spills into additional bytes even though it still represents just one character: U+0080, for example, is encoded as the two bytes 1100 0010 1000 0000. The leading bits of the first byte say how many bytes the character uses, and every following byte starts with 10, which is why you can’t have just 1000 0000 on its own: a lone continuation byte doesn’t map to a character.

Don’t take my memory for law but I hope it helps clarify why you might be running into that issue with special characters!
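
To check how many bytes a given character takes once it is encoded as UTF-8, a quick sketch along these lines can help. The function name `utf8Hex` is made up for illustration:

```javascript
// Hypothetical helper: returns the UTF-8 bytes of a string as hex pairs.
function utf8Hex(text) {
  return [...Buffer.from(text, 'utf8')].map((byte) =>
    byte.toString(16).toUpperCase().padStart(2, '0')
  );
}

// Print each character with its .length, its UTF-8 byte count, and its bytes.
for (const ch of ['a', 'ï', '漢']) {
  console.log(ch, ch.length, Buffer.byteLength(ch, 'utf8'), utf8Hex(ch).join(' '));
}
```

Each of these characters has a .length of 1, but they occupy one, two, and three bytes respectively.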