Tuesday, January 25, 2011

Python unicode

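Python doesn't strip the byte order mark (BOM) when reading a UTF-8 file with the plain "utf8" codec, so the decoded text starts with U+FEFF. Using unicode.strip() won't remove it from the first line either, because U+FEFF doesn't count as whitespace.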

From the link:

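import codecs
# Drop a leading BOM (U+FEFF) from the decoded text; startswith() is safe on an empty string.
if u.startswith(unicode(codecs.BOM_UTF8, "utf8")):
    u = u[1:]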
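
Another option that may be simpler is to let the codec drop the BOM while decoding. The sketch below uses the standard "utf-8-sig" codec, which skips a single leading BOM if one is present. The names read_text_without_bom and demo are made up for illustration, and the temporary file only stands in for a real input file.

```python
import codecs
import io
import os
import tempfile


def read_text_without_bom(path):
    # "utf-8-sig" decodes UTF-8 and skips one leading BOM if the file has it.
    with io.open(path, "r", encoding="utf-8-sig") as handle:
        return handle.read()


def demo():
    # Hypothetical test file: a BOM followed by one line of text.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(codecs.BOM_UTF8 + b"first line\n")
        # The plain "utf-8" codec keeps the BOM as U+FEFF in the decoded text.
        with io.open(path, "r", encoding="utf-8") as handle:
            plain = handle.read()
        stripped = read_text_without_bom(path)
    finally:
        os.remove(path)
    return plain, stripped
```

Calling demo() returns the text as read with each codec, so the two results can be compared directly. Files without a BOM should read the same way under "utf-8-sig", so it can be used even when it is not known whether the BOM is there.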
