Twenty years ago, Joel Spolsky wrote:

There Ain't No Such Thing As Plain Text.

It does not make sense to have a string without knowing what encoding it uses. You can no longer stick your head in the sand and pretend that “plain” text is ASCII.
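
To make that concrete, here is a minimal Python sketch (not from Spolsky's essay; the sample word and the variable names are just illustrative assumptions) that reads the same bytes under three different encoding guesses:

```python
# The same five bytes, interpreted under three different encoding guesses.
text = "café"
raw = text.encode("utf-8")         # b'caf\xc3\xa9': 5 bytes for 4 characters

as_utf8 = raw.decode("utf-8")      # right guess: the original text comes back
as_latin1 = raw.decode("latin-1")  # wrong guess: no error, just different characters

# Pretending the bytes are ASCII fails outright, because 0xC3 is not ASCII.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(as_utf8)    # café
print(as_latin1)  # cafÃ©
print(ascii_ok)   # False
```

The Latin-1 case is the sneaky one: nothing fails, the text is simply wrong, and nothing in the bytes themselves tells you which reading was intended.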

A lot has changed in 20 years. In 2003, the main question was: what encoding is this?

In 2023, it's no longer a question: with a 98% probability, it's UTF-8. Finally! We can stick our heads in the sand again!

The question now becomes: how do we use UTF-8 correctly? Let's see!

continue reading on tonsky.me

⚠️ This post links to an external website. ⚠️