Inputting UTF-8 characters into the Bash CLI

I've got a directory listing with files like:

drwxr-xr-x   2 nobody nogroup       4096 2011-01-11 21:06 Капкан
drwxr-xr-x   3 nobody nogroup       4096 2011-11-17 08:40 СБПЧ

When I copy/paste or directly input such filenames into the prompt, I expect to be able to work with them just like Latin-1 letters.

Instead, I get results like:

# Pasting "Капкан"
$ :?апкан 

You see the first letter is replaced by ":?". Then, I am not able to traverse the characters to the left except by deleting them. Keyboard-based input produces the same results. tmux or screen yank/paste produces the same results.

I'm not sure how to diagnose this situation! This is a pretty old Debian distro ($ uname -a Linux weezy 2.6.37.6.RNx86_32.1.4 #1 Thu Jul 26 04:49:29 PDT 2012 i686 GNU/Linux), but still I'd think that there's a way to get UTF-8 filenames behaving as I'd expect.

I am using OS X's Terminal.app and TERM is set to xterm-color.

I do believe these filenames are encoded using UTF-8. I am using Bash 3.1.17, and here is my output from locale:

$ locale
LANG=
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=en_US.utf8

Here's my output from $ locale -a:

C
de_DE.utf8
en_US.utf8
ja_JP.utf8
ko_KR.utf8
nl_NL.utf8
POSIX
zh_CN.utf8
zh_TW.utf8

I would consider installing ru_RU.UTF-8 from /etc/locale.gen but I'm having trouble with basic Latin characters too, like á.
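
One way to test the belief that these names really are UTF-8 is to look at their raw bytes on the remote machine. This is a minimal sketch, assuming `iconv` and `od` are installed there; `is_utf8` and `show_bytes` are hypothetical helper names, not standard commands. In UTF-8 the Cyrillic letter К is the two bytes d0 9a.

```shell
# Hypothetical helper: succeeds only if the argument is valid UTF-8.
# iconv exits non-zero when it meets an illegal input sequence.
is_utf8() {
  printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Hypothetical helper: print the raw bytes of the argument in hex.
show_bytes() {
  printf '%s' "$1" | od -An -tx1
}

# Report every entry in the current directory.
for name in *; do
  if is_utf8 "$name"; then
    printf 'valid UTF-8:   %s\n' "$name"
  else
    printf 'invalid UTF-8: '
    show_bytes "$name"
  fi
done
```

If every name is reported as valid, the files themselves are fine and the problem lies somewhere on the input path (terminal emulator, readline, or locale).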

Answer 1

As per the comments above, twiddling Terminal.app's "Escape non-ASCII characters" setting appears to have solved the problem, though I am left confused as to the direction in which the twiddling was necessary to make this work correctly.
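
This was not part of the fix reported here, but a related thing to rule out on the remote side is readline's 8-bit handling, since Bash reads its command line through readline. The sketch below prints the three readline variables that keep multibyte input intact. `inputrc_snippet` is a hypothetical helper name, and whether these settings change anything on this particular system is an assumption.

```shell
# Hypothetical helper: print readline settings that pass 8-bit input through.
inputrc_snippet() {
  cat <<'EOF'
set input-meta on
set output-meta on
set convert-meta off
EOF
}

# Review the settings before using them.
inputrc_snippet

# To apply them, append to ~/.inputrc and start a new shell:
# inputrc_snippet >> ~/.inputrc
```

With `convert-meta` on, readline turns bytes that have the high bit set into escape-prefixed key sequences, which corrupts UTF-8 input.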
