A quick look at the Unicode table or the ASCII character table shows that not all characters are meant for us humans to read. Some of them are control characters, which tell the computer what to do. Most of the time they go unnoticed, but sometimes they cause problems, especially if you’re not sure how they got into your text in the first place.
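This simple regex strips out the most common control characters (for languages using the Latin alphabet). It leaves the tab character (U+0009) alone, but it does remove line feeds, carriage returns, and the non-breaking space (U+00A0):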
.replace(/[\u0000-\u0008\u000A-\u001F\u007F-\u00A0]+/g, '');
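As a minimal sketch of how this might be used, the call can be wrapped in a small helper. The function name `stripControlChars` and the sample string are made up for illustration. The character class lists its ranges back to back, because a comma inside `[...]` is matched as a literal comma and would be stripped from the text too.

```javascript
// Hypothetical helper name; the ranges are the ones discussed in this post.
function stripControlChars(input) {
  // U+0000-U+0008: C0 control characters before tab (tab, U+0009, is kept)
  // U+000A-U+001F: the remaining C0 controls, including line feed and carriage return
  // U+007F-U+00A0: DEL, the C1 controls, and the non-breaking space
  return input.replace(/[\u0000-\u0008\u000A-\u001F\u007F-\u00A0]+/g, '');
}

// Made-up sample input containing NUL, BEL, a non-breaking space, CR/LF and a tab.
const dirty = 'Hello,\u0000 wor\u0007ld\u00A0!\r\n\tDone';
console.log(JSON.stringify(stripControlChars(dirty)));
// → "Hello, world!\tDone"
```

Note that line breaks are removed along with everything else in the range, so this suits single-line values better than multi-line text.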
You can use the Unicode table to look up character codes and extend the regex if you need to support other alphabets or character sets (or, rather, exclude characters from those sets).
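Extending the character class could look something like the sketch below. The extra ranges are only an assumed example, not part of the original regex: U+200B to U+200F (zero-width space, zero-width joiners, and directional marks) and U+FEFF (the byte order mark). The name `stripInvisibleChars` is hypothetical. Some scripts and emoji sequences rely on the zero-width joiners, so check the table for your own text before removing them.

```javascript
// Hypothetical extension: the ranges from the regex above, plus
// U+200B-U+200F (zero-width and directional marks) and U+FEFF (byte order mark).
const extendedPattern = /[\u0000-\u0008\u000A-\u001F\u007F-\u00A0\u200B-\u200F\uFEFF]+/g;

function stripInvisibleChars(input) {
  return input.replace(extendedPattern, '');
}

// Made-up sample: a leading BOM, a zero-width space, and a trailing NUL.
console.log(JSON.stringify(stripInvisibleChars('\uFEFFzero\u200Bwidth\u0000')));
// → "zerowidth"
```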