I am using the following code to convert escape sequences such as '\u00c0' into the corresponding Unicode characters, such as 'À':
unicode(){ sed -i 's/\\\u00c0/À/g' $1;sed -i 's/\\\u00c1/Á/g' $1;sed -i 's/\\\u00c2/Â/g' $1;sed -i 's/\\\u00c3/Ã/g' $1;sed -i 's/\\\u00c4/Ä/g' $1;sed -i 's/\\\u00c5/Å/g' $1;sed -i 's/\\\u00c6/Æ/g' $1;sed -i 's/\\\u00c7/Ç/g' $1;sed -i 's/\\\u00c8/È/g' $1;sed -i 's/\\\u00c9/É/g' $1;sed -i 's/\\\u00ca/Ê/g' $1;sed -i 's/\\\u00cb/Ë/g' $1;sed -i 's/\\\u00cc/Ì/g' $1;sed -i 's/\\\u00cd/Í/g' $1;sed -i 's/\\\u00ce/Î/g' $1;sed -i 's/\\\u00cf/Ï/g' $1;sed -i 's/\\\u00d0/Ð/g' $1;sed -i 's/\\\u00d1/Ñ/g' $1;sed -i 's/\\\u00d2/Ò/g' $1;sed -i 's/\\\u00d3/Ó/g' $1;sed -i 's/\\\u00d4/Ô/g' $1;sed -i 's/\\\u00d5/Õ/g' $1;sed -i 's/\\\u00d6/Ö/g' $1;sed -i 's/\\\u00d7/×/g' $1;sed -i 's/\\\u00d8/Ø/g' $1;sed -i 's/\\\u00d9/Ù/g' $1;sed -i 's/\\\u00da/Ú/g' $1;sed -i 's/\\\u00db/Û/g' $1;sed -i 's/\\\u00dc/Ü/g' $1;sed -i 's/\\\u00dd/Ý/g' $1;sed -i 's/\\\u00de/Þ/g' $1;sed -i 's/\\\u00df/ß/g' $1;sed -i 's/\\\u00e0/à/g' $1;sed -i 's/\\\u00e1/á/g' $1;sed -i 's/\\\u00e2/â/g' $1;sed -i 's/\\\u00e3/ã/g' $1;sed -i 's/\\\u00e4/ä/g' $1;sed -i 's/\\\u00e5/å/g' $1;sed -i 's/\\\u00e6/æ/g' $1;sed -i 's/\\\u00e7/ç/g' $1;sed -i 's/\\\u00e8/è/g' $1;sed -i 's/\\\u00e9/é/g' $1;sed -i 's/\\\u00ea/ê/g' $1;sed -i 's/\\\u00eb/ë/g' $1;sed -i 's/\\\u00ec/ì/g' $1;sed -i 's/\\\u00ed/í/g' $1;sed -i 's/\\\u00ee/î/g' $1;sed -i 's/\\\u00ef/ï/g' $1;sed -i 's/\\\u00f0/ð/g' $1;sed -i 's/\\\u00f1/ñ/g' $1;sed -i 's/\\\u00f2/ò/g' $1;sed -i 's/\\\u00f3/ó/g' $1;sed -i 's/\\\u00f4/ô/g' $1;sed -i 's/\\\u00f5/õ/g' $1;sed -i 's/\\\u00f6/ö/g' $1;sed -i 's/\\\u00f7/÷/g' $1;sed -i 's/\\\u00f8/ø/g' $1;sed -i 's/\\\u00f9/ù/g' $1;sed -i 's/\\\u00fa/ú/g' $1;sed -i 's/\\\u00fb/û/g' $1;sed -i 's/\\\u00fc/ü/g' $1;sed -i 's/\\\u00fd/ý/g' $1;sed -i 's/\\\u00fe/þ/g' $1;sed -i 's/\\\u00ff/ÿ/g' $1; }
Then I run unicode file.txt to convert the file.
For example, if I have a file named original_text that contains a string like \u00d8rsted, running unicode original_text converts that string to Ørsted.
This works well, but the code feels wrong and, frankly, looks rather ugly.
Is there a better way to perform this conversion, either in the shell or with an existing Unix command?
Answer 1
ascii2uni, from the uni2ascii package, can do that.
$ ./ascii2uni -q -a U <<< '\u00d8rsted'
Ørsted