Tuesday, August 28, 2012

Python Unicode, encode to utf-8, etc.


http://www.stereoplex.com/blog/python-unicode-and-unicodedecodeerror


Python, bytes and strings

You've probably noticed that there seem to be a couple of ways of writing down strings in Python. One looks like this:
  'this is a string'
Another looks like this:
  u'this is a string'
There's a good chance that you also know that the second one of those is a Unicode string. But what's the first one? And what does it actually mean to 'be a Unicode string'?
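
A quick way to see the difference is to compare the two kinds of value directly. This is only a minimal sketch, not code from the original post. It assumes the b'...' prefix for the byte string so that it behaves the same on Python 2.6+ (where b'...' is the same type as a plain 'this is a string') and on Python 3 (where an unprefixed literal is already Unicode). All variable names are made up for illustration.

```python
byte_string = b'this is a string'     # a sequence of bytes
unicode_string = u'this is a string'  # a sequence of Unicode code points

# The two literals are different types, even though they look alike.
different_types = type(byte_string) is not type(unicode_string)

# A non-ASCII character makes the difference visible:
# one code point, but more than one byte once encoded as UTF-8.
e_acute = u'\xe9'  # LATIN SMALL LETTER E WITH ACUTE
code_points = len(e_acute)
utf8_bytes = len(e_acute.encode('utf-8'))

print(different_types, code_points, utf8_bytes)
```

The point of the sketch is that a Unicode string counts characters, while a byte string counts bytes in some particular encoding.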




It's worth reiterating the terminology, because you'll come across it a lot: the transformation from Unicode to an encoding such as ASCII is called 'encoding'. The transformation from ASCII back to Unicode is called 'decoding'.

    Unicode  ---- encode ----> ASCII
    ASCII    ---- decode ----> Unicode
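
The two directions in the diagram can be tried out with the built-in encode and decode methods. The following is a hedged sketch rather than the original author's example: the sample strings and variable names are assumptions, and UTF-8 is included alongside ASCII only because the post title mentions it. It is written to run unchanged on Python 2.6+ and Python 3.

```python
text = u'hello'

# encode: Unicode -> bytes in a particular encoding
ascii_bytes = text.encode('ascii')

# decode: bytes -> Unicode
decoded = ascii_bytes.decode('ascii')

# ASCII cannot represent every character, so encoding can fail.
accented = u'caf\xe9'
try:
    accented.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# UTF-8 can represent the same text, and it round-trips.
utf8_bytes = accented.encode('utf-8')
round_trip = utf8_bytes.decode('utf-8')

# Decoding bytes with the wrong codec fails in the other direction.
try:
    utf8_bytes.decode('ascii')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

Note that the error names follow the direction of the arrow: UnicodeEncodeError comes from a failed encode, UnicodeDecodeError from a failed decode.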






