How to encode a string to UTF-8 in JavaScript


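UTF-8 can represent characters from every language, so it is the usual choice when text is not plain English. Encoding a string this way can also avoid parsing problems when we load a JSON object. In this post, I would like to show the easiest ways to encode a string to UTF-8. Note that the functions below return a percent-encoded string, where each UTF-8 byte outside the safe set is written as %XX.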

Using encodeURI
Link: MDN encodeURI()
It does not encode: A-Z a-z 0-9 ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) #

function encodeToUTF8(x){
 return encodeURI(x);
}
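
To make the behaviour concrete, here is a small sketch of what encodeURI returns for a full URL. The URL is a made-up sample, not something from a real site.

```javascript
// Hypothetical sample URL, used only for illustration ("\u00e9" is the letter é)
const url = "https://example.com/search?q=caf\u00e9 au lait";

// Reserved URL characters (: / ? =) are kept as they are;
// the spaces and the non-ASCII "é" are percent-encoded as UTF-8 bytes
const encodedUrl = encodeURI(url);
console.log(encodedUrl);
// → https://example.com/search?q=caf%C3%A9%20au%20lait

// decodeURI reverses the encoding
console.log(decodeURI(encodedUrl) === url); // → true
```

Because the URL structure survives, encodeURI fits best when the input is a whole URL rather than a single value.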


Using encodeURIComponent
Link: MDN encodeURIComponent()
It does not encode: A-Z a-z 0-9 - _ . ! ~ * ' ( )

function encodeToUTF8(x){
 return encodeURIComponent(x);
}
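
The difference from encodeURI is easiest to see side by side. The sketch below uses a made-up query-string value that contains several reserved characters.

```javascript
// Hypothetical query-string value, used only for illustration
const value = "rock&roll, 100% = fun?";

// encodeURIComponent escapes reserved characters such as & , = and ?
const asComponent = encodeURIComponent(value);
console.log(asComponent);
// → rock%26roll%2C%20100%25%20%3D%20fun%3F

// encodeURI leaves those reserved characters alone
const asUri = encodeURI(value);
console.log(asUri);
// → rock&roll,%20100%25%20=%20fun?
```

This is why encodeURIComponent is the safer pick for a single parameter value: an unescaped & or = inside the value would otherwise be read as part of the query string.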


Using encodeURIComponent and .replace to encode everything to utf-8
Link: GitHub utf-8 encoder.js

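// Encode everything to utf-8, including the characters encodeURIComponent leaves alone
// Modified source code from https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
function fixedEncodeURIComponent(str) {
 // Match existing %XX escapes first, so their hex digits are not encoded a second time
 return encodeURIComponent(str).replace(/%[0-9A-F]{2}|[\w\-.!~*'()]/g, function(c) {
  // A three-character match is an escape produced by encodeURIComponent; keep it as it is
  if (c.length === 3) return c;
  return '%' + c.charCodeAt(0).toString(16).toUpperCase();
 });
}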

You may find old articles that use escape() for utf-8 encoding. But according to the MDN web docs, escape() is deprecated and you should not use it.
Link: MDN escape()
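
If what you actually need is the raw UTF-8 bytes rather than a percent-encoded string, one option that does not rely on escape() is the built-in TextEncoder. This is not covered in the methods above, so treat it as a minimal sketch; it assumes a modern browser or a recent Node.js version where TextEncoder and TextDecoder are globals.

```javascript
// TextEncoder always produces UTF-8
const encoder = new TextEncoder();

// "\u00e9" is the letter é, which takes two bytes in UTF-8
const bytes = encoder.encode("h\u00e9llo");

// The bytes are 104, 195, 169, 108, 108, 111
console.log(Array.from(bytes));

// TextDecoder turns the bytes back into a string
const decoded = new TextDecoder().decode(bytes);
console.log(decoded === "h\u00e9llo"); // → true
```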


Good luck working with your data!
