Unicode and Locale

From Noah.org

Revision as of 16:19, 21 April 2014


Unicode and the shell

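When you start Bash, your LANG environment variable is usually already set for you. On many Linux systems the value comes from a system-wide file such as /etc/environment, although the exact file varies by distribution (Debian and Ubuntu, for example, also use /etc/default/locale).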

Run the `locale` command and notice that it prints LANG first, followed by a list of LC_ variables. The LC_ variables may or may not be set in your environment. Any that is not set takes its value from LANG, and `locale` prints such implied values in double quotes. If LC_ALL is set, it overrides LANG and every other LC_ variable.

locale
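
As a rough sketch of the fallback rule described above, the following runs `locale` three times with controlled settings and keeps only the LC_COLLATE line. The third case adds LC_ALL, which POSIX defines as overriding both LANG and the individual LC_ variables. It assumes a `locale` command that follows the POSIX output format (implied values in double quotes), as the glibc one does; the variable names `implied`, `explicit`, and `forced` are only for illustration.

```shell
# Each command substitution runs in a subshell, so the unset calls
# do not affect the calling shell.

# 1. LC_COLLATE not set: its value is implied from LANG, so locale
#    prints it in double quotes.
implied=$(unset LC_ALL LC_COLLATE; LANG=C locale | grep '^LC_COLLATE=')

# 2. LC_COLLATE set explicitly: it wins over LANG and is printed
#    without quotes.
explicit=$(unset LC_ALL; LANG=C LC_COLLATE=POSIX locale | grep '^LC_COLLATE=')

# 3. LC_ALL set: it overrides LANG and LC_COLLATE, so the value shown
#    for LC_COLLATE is again an implied one.
forced=$(LANG=C LC_COLLATE=POSIX LC_ALL=C locale | grep '^LC_COLLATE=')

printf '%s\n' "$implied" "$explicit" "$forced"
```

Only the C and POSIX locales are used here because they exist on every system, so the sketch does not depend on which other locales are installed.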

Set locale for just a single command

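Many commands behave differently depending on the locale. For example, `grep` interprets range expressions like [a-z] according to the locale, so a regular expression can match characters you did not expect. Most system administration scripts are better off running in the C locale. Note that LANG=C only affects categories that are not already set through an LC_ variable; to force the C locale regardless of the environment, use LC_ALL=C in place of LANG=C.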

LANG=C grep 'Search Text' filename
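
To see the effect on a range expression, here is a small sketch that runs `grep` on made-up sample input. Only the C locale result is predictable; in other locales POSIX leaves the meaning of a range unspecified, so the final comparison runs only if en_US.UTF-8 appears to be installed, and no particular output is claimed for it. The variable names are illustrative.

```shell
# Made-up sample input: two lowercase letters and one uppercase letter.
sample=$(printf '%s\n' a B c)

# Forced C locale: [a-z] covers exactly the ASCII lowercase letters,
# so only "a" and "c" match.
c_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep '[a-z]')
printf 'C locale, range:\n%s\n' "$c_matches"

# A character class states the intent directly and does not depend on
# how the locale orders the characters inside a range.
class_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep '[[:lower:]]')
printf 'C locale, class:\n%s\n' "$class_matches"

# If en_US.UTF-8 is installed, run the same range there for comparison.
if locale -a 2>/dev/null | grep -Eqi '^en_US\.utf-?8$'; then
    printf 'en_US.UTF-8, range:\n'
    printf '%s\n' "$sample" | LC_ALL=en_US.UTF-8 grep '[a-z]'
fi
```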

Fix just the sorting order with collate

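Some shell commands, such as `ls`, use a sort order that many people find annoying when LANG=en_US.UTF-8: on glibc systems, for instance, case and leading dots typically carry little weight, so dotfiles and capitalized names end up interleaved with everything else. You can change just the collation order without changing the other ways the locale is used. For example, run the following commands and notice the difference. The C collation order is plain byte order, which is what people might remember and love from the old ASCII (ANSI_X3.4-1968) days.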

LC_COLLATE="C" ls -la ~
LC_COLLATE="en_US.UTF-8" ls -la ~
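
The difference is easier to see on a tiny directory with known contents. The sketch below builds a throwaway directory (the file names are made up) and lists it with C collation; the en_US.UTF-8 listing is shown only if that locale appears to be installed, and its exact order depends on the system's locale data. LC_ALL is unset inside each subshell because it would otherwise override LC_COLLATE.

```shell
# Throwaway directory with made-up names: a dotfile, a capitalized
# name, and a lowercase name.
demo_dir=$(mktemp -d)
touch "$demo_dir/.hidden" "$demo_dir/Zebra" "$demo_dir/apple"

# C collation is byte order: "." (46) < "Z" (90) < "a" (97).
c_order=$(unset LC_ALL; LC_COLLATE=C ls -A "$demo_dir")
printf 'LC_COLLATE=C:\n%s\n' "$c_order"

# Compare with en_US.UTF-8 if that locale is installed.
if locale -a 2>/dev/null | grep -Eqi '^en_US\.utf-?8$'; then
    printf 'LC_COLLATE=en_US.UTF-8:\n'
    (unset LC_ALL; LC_COLLATE=en_US.UTF-8 ls -A "$demo_dir")
fi

# Clean up the throwaway directory.
rm -rf "$demo_dir"
```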

To make only the collation change permanent, put this in your ~/.bashrc:

export LC_COLLATE="C"