turakar
Posts: 4
Joined: Fri Dec 09, 2016 8:32 pm

ls unicode characters

Mon Dec 19, 2016 4:20 pm

This is probably an often-asked question, but I cannot get it working.
I'm not using a graphical interface, just the hardware console. I created a file with a Unicode character ('ä') in its name on Windows 7 and saved it to a USB stick. After mounting it on my Pi 3 running Raspbian, I cd'ed to the directory and ran ls; the Unicode character is shown as a question mark '?'. I then generated locales using raspi-config (de_DE.utf8, de_DE.iso88591, de_DE@euro), but whenever I run LANG=de_DE.theencoding ls I still get a question mark.
I want to be able to list the filename with the correct characters in place.

JimmyN
Posts: 1109
Joined: Wed Mar 18, 2015 7:05 pm
Location: Virginia, USA

Re: ls unicode characters

Mon Dec 19, 2016 7:33 pm

If a drive contains unicode characters just add "iocharset=utf8" to the mount command.

scruss
Posts: 2420
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: ls unicode characters

Mon Dec 19, 2016 7:45 pm

Unfortunately, the standard automount on Raspbian sets iocharset=ascii, and I'm not quite sure how one would change those options. They're not in /etc/fstab.

I agree with the OP that it is a little off-putting seeing the file in the file manager as ‘testäéïôü.txt’, but showing up in the shell as ‘test�����.txt’
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.

DougieLawson
Posts: 35838
Joined: Sun Jun 16, 2013 11:19 pm
Location: Basingstoke, UK

Re: ls unicode characters

Mon Dec 19, 2016 7:54 pm

The GUI automounter is part of PCManFM, there may be some tweakable options in that.
Note: Having anything humorous in your signature is completely banned on this forum. Wear a tin-foil hat and you'll get a ban.

Any DMs sent on Twitter will be answered next month.

This is a doctor free zone.

scruss
Posts: 2420
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: ls unicode characters

Tue Dec 20, 2016 1:53 am

Okay, I think I've got a working solution:
  1. sudo apt install udisks-glue
  2. edit /etc/udisks-glue.conf (you'll need to use sudo), and make sure that the post_insertion_command line contains --mount-options sync,iocharset=utf8
The next time you insert a USB stick, it should show UTF-8 characters in the terminal.

I note, though, that the system still thinks this has iocharset=ascii in the mount options. I suspect that something, somewhere is a little confused about the status.
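One way to check which options the kernel actually reports for a mount is to read /proc/mounts and pick out the options field. A minimal Python sketch, assuming the stick is mounted at /media/usb0 (a hypothetical mount point) and using a made-up sample line in place of the real file:

```python
def mount_options(mounts_text, mount_point):
    """Return the options of mount_point as a dict, or None if it is not listed.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = {}
            for item in fields[3].split(","):
                # "iocharset=utf8" -> key "iocharset", value "utf8";
                # flags such as "rw" get an empty string as value.
                key, _, value = item.partition("=")
                options[key] = value
            return options
    return None


# Illustrative sample only; the device, mount point and options are assumptions.
SAMPLE = "/dev/sda1 /media/usb0 vfat rw,nosuid,iocharset=ascii,shortname=mixed 0 0\n"

if __name__ == "__main__":
    opts = mount_options(SAMPLE, "/media/usb0")
    print(opts.get("iocharset") if opts else "not mounted")
```

On a real system the text would come from open("/proc/mounts").read() instead of SAMPLE.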

(and Dougie, I'm pretty sure that udisks2 catches automounts before PCManFM)

DougieLawson
Posts: 35838
Joined: Sun Jun 16, 2013 11:19 pm
Location: Basingstoke, UK

Re: ls unicode characters

Tue Dec 20, 2016 12:01 pm

scruss wrote: (and Dougie, I'm pretty sure that udisks2 catches automounts before PCManFM)
That's useful to know for folks who use automounters (not me).

turakar
Posts: 4
Joined: Fri Dec 09, 2016 8:32 pm

Re: ls unicode characters

Wed Dec 21, 2016 1:51 pm

I tried installing the proposed package with the altered config, and I tried mounting the stick manually with iocharset=utf-8 and iocharset=iso885915, but in both cases I get the question marks. So this does not solve my problem (even if I update my locale to iso...). I'm using auto-mounting to mount inserted USB sticks so that I can show the PDFs on them for presentations. Therefore I want to support Linux and Windows files with Unicode characters in their names. There is a Python library called chardetect which detects the encoding of a given string. However, the Python list function does not deliver escaped characters for the Unicode characters, just a ? from which I can parse nothing. Is it possible to read the raw bytes of a filename? Then I could design my program so that it does not rely on the configured encodings.

I had not properly unmounted my old mount. I was able to configure my auto-mounter (usbmount) to set the options, so thank you for your support.
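On the raw-bytes question: Python's os.listdir returns filenames as bytes when it is given a bytes path, so no locale decoding takes place. A minimal sketch, assuming a hypothetical helper decode_name and a guessed list of candidate encodings. Note that if the mount itself has already replaced the character with a literal '?', the raw bytes will contain that '?' too, so this only helps when the original bytes survive mounting.

```python
import os


def list_raw_names(directory):
    """Return the entries of directory as raw bytes, without any decoding."""
    # A bytes path makes os.listdir return bytes filenames.
    return os.listdir(os.fsencode(directory))


def decode_name(raw, encodings=("utf-8", "iso8859-15")):
    """Try each candidate encoding in order and return the first clean decode.

    The candidate list is an assumption; adjust it to the sticks you expect.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep the ASCII part and mark everything else as U+FFFD.
    return raw.decode("ascii", errors="replace")


if __name__ == "__main__":
    for raw_name in list_raw_names("."):
        print(raw_name, "->", decode_name(raw_name))
```

Trying UTF-8 first is a deliberate choice: it rejects most byte sequences that are not valid UTF-8, whereas a single-byte encoding such as ISO-8859-15 accepts nearly anything and is better kept as the fallback.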
Last edited by turakar on Thu Dec 22, 2016 2:45 pm, edited 1 time in total.

scruss
Posts: 2420
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: ls unicode characters

Wed Dec 21, 2016 4:21 pm

Hmm, maybe I should be looking at usbmount instead. Are you running standard Raspbian, or have you done things to it?

turakar
Posts: 4
Joined: Fri Dec 09, 2016 8:32 pm

Re: ls unicode characters

Thu Dec 22, 2016 2:44 pm

I'm running Raspbian without a graphical interface, but with usbmount installed via apt. It's just a sudo apt-get install usbmount ;). The config is inside /etc/usbmount.
