SSH - -bash: cannot create temp file for here-document:


by Rilhas » Sun Aug 12, 2012 1:59 pm
Hi,

I'm having a problem with my SSH server: after it has been up for a few days, TAB-completion stops working. It still accepts connections and lets me do everything else, but every time I press TAB to complete a file name or path it fails with the message "-bash: cannot create temp file for here-document: No space left on device". My client is PuTTY on Windows.

I executed "df -h" with the following results:

Filesystem Size Used Avail Use% Mounted on
rootfs 15G 2.9G 11G 21% /
/dev/root 15G 2.9G 11G 21% /
tmpfs 19M 216K 19M 2% /run
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 37M 37M 0 100% /tmp
tmpfs 10M 0 10M 0% /dev
tmpfs 37M 4.0K 37M 1% /run/shm
/dev/mmcblk0p1 56M 34M 23M 61% /boot
/dev/sda1 459G 121G 315G 28% /media/RPIDISK

So it seems the “tmpfs” mounted at “/tmp” is full. But why? Who created it? And why is it filling up? Rebooting empties it, but I want to be able to run the Pi for at least a month without rebooting it.

pi@raspberrypi /media/RPIDISK/app $ ls /tmp
ssh-pHfXcEjtkFzS
pi@raspberrypi /media/RPIDISK/app $ ls /tmp/ssh-pHfXcEjtkFzS
agent.1588

Guessing that "agent.1588" might relate to PID 1588:

pi@raspberrypi /media/RPIDISK/app $ ps auxww
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

pi 1588 0.0 0.3 12752 664 ? Ssl 04:17 0:00 /usr/bin/lxsession -s LXDE -e LXDE


Any ideas on how I can solve this? I thought of increasing the size of “/tmp” (so that it lasts at least a month), but I don’t know where it is being mounted (it is not in “fstab”).
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am
by jojopi » Sun Aug 12, 2012 2:23 pm
Rilhas wrote:pi@raspberrypi /media/RPIDISK/app $ ls /tmp
ssh-pHfXcEjtkFzS
pi@raspberrypi /media/RPIDISK/app $ ls /tmp/ssh-pHfXcEjtkFzS
agent.1588
That file is a socket. "ls -l" will show you that it is zero bytes, so it is not the problem. Whatever file(s) are taking up 37M in /tmp are either "hidden" (name starts with a dot) or deleted.

Does "ls -laR /tmp" show any large files?
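
A quick way to check for space that "ls" cannot account for is to compare what the filesystem reports as used with what the visible files add up to. This is only a sketch: it assumes /tmp is its own mount (as in the df output above) and that "df -P" and "du" behave as on a typical Linux system; the variable names are made up for illustration.

```shell
dir=/tmp
# Space the filesystem itself says is in use (third column of POSIX-format df, in KB).
fs_used_kb=$(df -Pk "$dir" | awk 'NR==2 {print $3}')
# Space taken by files that still have a name, hidden ones included (-x: stay on this filesystem).
visible_kb=$(du -skx "$dir" 2>/dev/null | awk '{print $1}')
echo "filesystem reports: ${fs_used_kb} KB used"
echo "visible files:      ${visible_kb} KB"
```

If the first number is far larger than the second, the difference is most likely held by files that have been deleted but are still open in some process.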
Posts: 2013
Joined: Tue Oct 11, 2011 8:38 pm
by Rilhas » Sun Aug 12, 2012 2:35 pm
jojopi wrote:Does "ls -laR /tmp" show any large files?


Thanks for your quick reply. Unfortunately I was getting a little annoyed so I rebooted it, so the problem is no longer present. Anyway, invoking the command you suggested:

pi@raspberrypi /media/RPIDISK/app $ ls -laR /tmp
/tmp:
total 8
drwxrwxrwt 5 root root 160 Aug 12 15:28 .
drwxr-xr-x 26 root root 4096 Aug 7 22:05 ..
drwxrwxrwt 2 root root 40 Aug 12 15:17 .ICE-unix
srwxr-xr-x 1 pi pi 0 Aug 12 15:18 .menu-cached-:0-pi
srwxr-xr-x 1 pi pi 0 Aug 12 15:18 .pcmanfm-socket--0-pi
drwx------ 2 pi pi 60 Aug 12 15:17 ssh-n9FTQr1uCQWh
-r--r--r-- 1 root root 11 Aug 12 15:17 .X0-lock
drwxrwxrwt 2 root root 60 Aug 12 15:17 .X11-unix

/tmp/.ICE-unix:
total 0
drwxrwxrwt 2 root root 40 Aug 12 15:17 .
drwxrwxrwt 5 root root 160 Aug 12 15:28 ..

/tmp/ssh-n9FTQr1uCQWh:
total 0
drwx------ 2 pi pi 60 Aug 12 15:17 .
drwxrwxrwt 5 root root 160 Aug 12 15:28 ..
srw------- 1 pi pi 0 Aug 12 15:17 agent.1591

/tmp/.X11-unix:
total 0
drwxrwxrwt 2 root root 60 Aug 12 15:17 .
drwxrwxrwt 5 root root 160 Aug 12 15:28 ..
srwxrwxrwx 1 root root 0 Aug 12 15:17 X0

I don't really know what to make of that output, but if nothing looks strange to you then I'll wait for the next time the problem manifests itself, issue the command then, and after that come back and post the result.

Thanks!
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am
by Rilhas » Tue Aug 14, 2012 8:18 pm
jojopi wrote:Does "ls -laR /tmp" show any large files?


Hi,

The problem is manifesting itself again.

pi@raspberrypi /media/RPIDISK/app $ ls H<TAB pressed>-bash: cannot create temp file for here-document: No space left on device
-bash: cannot create temp file for here-document: No space left on device

And here is the current disk usage:

pi@raspberrypi /media/RPIDISK/app $ df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           15G  3.3G   11G  25% /
/dev/root        15G  3.3G   11G  25% /
tmpfs            19M  216K   19M   2% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            37M   37M     0 100% /tmp
tmpfs            10M     0   10M   0% /dev
tmpfs            37M  4.0K   37M   1% /run/shm
/dev/mmcblk0p1   56M   34M   23M  61% /boot
/dev/sda1       459G  121G  315G  28% /media/RPIDISK
pi@raspberrypi /media/RPIDISK/app $

I tried the command you mentioned:

pi@raspberrypi /media/RPIDISK/app $ ls -laR /tmp
/tmp:
total 8
drwxrwxrwt  5 root root  160 Aug 14 21:07 .
drwxr-xr-x 26 root root 4096 Aug  7 22:05 ..
drwxrwxrwt  2 root root   40 Aug 12 15:17 .ICE-unix
srwxr-xr-x  1 pi   pi      0 Aug 12 15:18 .menu-cached-:0-pi
srwxr-xr-x  1 pi   pi      0 Aug 12 15:18 .pcmanfm-socket--0-pi
drwx------  2 pi   pi     60 Aug 12 15:17 ssh-n9FTQr1uCQWh
-r--r--r--  1 root root   11 Aug 12 15:17 .X0-lock
drwxrwxrwt  2 root root   60 Aug 12 15:17 .X11-unix

/tmp/.ICE-unix:
total 0
drwxrwxrwt 2 root root  40 Aug 12 15:17 .
drwxrwxrwt 5 root root 160 Aug 14 21:07 ..

/tmp/ssh-n9FTQr1uCQWh:
total 0
drwx------ 2 pi   pi    60 Aug 12 15:17 .
drwxrwxrwt 5 root root 160 Aug 14 21:07 ..
srw------- 1 pi   pi     0 Aug 12 15:17 agent.1591

/tmp/.X11-unix:
total 0
drwxrwxrwt 2 root root  60 Aug 12 15:17 .
drwxrwxrwt 5 root root 160 Aug 14 21:07 ..
srwxrwxrwx 1 root root   0 Aug 12 15:17 X0
pi@raspberrypi /media/RPIDISK/app $

I don't see any large files there... any ideas? (this time I'm not rebooting the PI!)

Thanks!
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am
by ecw » Tue Aug 14, 2012 9:08 pm
Run
lsof -a | grep tmp
Look for any entry on the right that is followed by "(deleted)" in parentheses. It is likely that a process is holding an unlinked file open, and lsof will show this.

Also, try
sudo fuser /tmp
to find processes using resources in /tmp.
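
One caveat with the second command, as far as I know: given a plain directory name, "fuser" only reports processes that have that directory itself open (for example as their working directory), so it can print nothing even when files inside it are in use. The "-m" option asks about the whole mounted filesystem instead. A hedged sketch, assuming the psmisc version of "fuser" found on Debian-based systems:

```shell
# List every process using any file on the filesystem that holds /tmp.
# -m = treat the name as a mounted filesystem, -v = verbose table with user and command.
# Run it with sudo to see other users' processes as well.
if command -v fuser >/dev/null 2>&1; then
    report=$(fuser -vm /tmp 2>&1) || true
else
    report="fuser is not installed"
fi
printf '%s\n' "${report:-no process is using that filesystem}"
```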
Posts: 10
Joined: Tue Jan 10, 2012 1:56 pm
by jojopi » Tue Aug 14, 2012 9:18 pm
Possibly some deleted file in /tmp cannot be freed because a process still has it open. Try the following command (and I apologize for the complexity of this):
sudo sh -c "ls -ld /proc/*/fd/*" 2>&1 |grep /tmp |grep deleted
If this produces any output, such as:
lr-x------ 1 pi   pi   64 Aug 14 21:23 /proc/23628/fd/0 -> /tmp/Z (deleted)
Then look up each of the processes by the process id between /proc/ and /fd/, here 23628, using "ps up 23628":
pi@pi ~ $ ps up 23628
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pi       23628  0.0  0.2   1648   436 pts/0    S    21:22   0:00 sleep 100000
Here we see "/tmp/Z" cannot be freed because "sleep" still has it open. This can be fixed with "kill 23628", but only when we know what the process and file are for.
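
The same situation can be reproduced safely in a scratch directory. This is a minimal sketch for Linux (it relies on /proc); the file name "Z" mirrors the example above and descriptor 3 is an arbitrary choice.

```shell
scratch=$(mktemp -d)
exec 3>"$scratch/Z"                 # keep the file open on descriptor 3
echo "some data" >&3
rm "$scratch/Z"                     # remove the name; the data stays allocated
ls "$scratch"                       # prints nothing: the file is no longer listed
link=$(readlink /proc/self/fd/3)    # readlink inherits descriptor 3 from this shell
echo "$link"                        # the old path followed by " (deleted)"
exec 3>&-                           # closing the last descriptor frees the space
rmdir "$scratch"
```

Until the descriptor is closed (or the process exits), "df" keeps counting the file's blocks even though no directory entry exists for it.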
Posts: 2013
Joined: Tue Oct 11, 2011 8:38 pm
by elatllat » Tue Aug 14, 2012 9:21 pm
du -csh /tmp/*
Posts: 1050
Joined: Sat Dec 17, 2011 5:05 pm
by Rilhas » Tue Aug 14, 2012 10:56 pm
ecw wrote:Run
lsof -a | grep tmp


pi@raspberrypi /media/RPIDISK/app $ lsof -a | grep tmp
-bash: lsof: command not found
pi@raspberrypi /media/RPIDISK/app $ sudo apt-get install laof
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package laof
pi@raspberrypi /media/RPIDISK/app $


I googled "laof":

LAOF
TRANSBA S.A - Medición para localización de fallas. - Reparación de vainas de CAS. - Ensayos de fallas reparadas. - Relevamientos, zanjeos, tapadas ...

Contacto - Productos - Prensa - ClientesUrban Dictionary: laofwww.urbandictionary.com/define.php?...Em cache - Traduzir esta páginaCompartilhar
a chunk of most typed of baked goods, cause for celebration. brings joy to all.

Urban Dictionary: laof t-shirts, mugs and magnetswww.urbandictionary.com/products.ph...Em cache - ... with your own definition! by teffomymina. Definition: a chunk of most typed of baked goods, cause for celebration. brings joy to all. Example: i'm cutting the laof!


... I guess that is not it!

ecw wrote:Run
sudo fuser /tmp


pi@raspberrypi /Temp $ sudo fuser /tmp
pi@raspberrypi /Temp $
pi@raspberrypi /Temp $ df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           15G  3.3G   11G  25% /
/dev/root        15G  3.3G   11G  25% /
tmpfs            19M  216K   19M   2% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            37M   37M     0 100% /tmp
tmpfs            10M     0   10M   0% /dev
tmpfs            37M  4.0K   37M   1% /run/shm
/dev/mmcblk0p1   56M   34M   23M  61% /boot
/dev/sda1       459G  121G  315G  28% /media/RPIDISK
pi@raspberrypi /Temp $


Nothing happened... what did the command do?
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am
by jojopi » Tue Aug 14, 2012 11:05 pm
Rilhas wrote:-bash: lsof: command not found
pi@raspberrypi /media/RPIDISK/app $ sudo apt-get install laof
sudo apt-get install lsof
lsof |grep /tmp |grep deleted
Posts: 2013
Joined: Tue Oct 11, 2011 8:38 pm
by Rilhas » Tue Aug 14, 2012 11:41 pm
jojopi wrote:sudo sh -c "ls -ld /proc/*/fd/*" 2>&1 |grep /tmp |grep deleted

pi@raspberrypi /Temp $ sudo sh -c "ls -ld /proc/*/fd/*" 2>&1 |grep /tmp |grep deleted
lrwx------ 1 pi   pi   64 Aug 14 23:53 /proc/1652/fd/2 -> /tmp/tmpfM6EBoC (deleted)
lrwx------ 1 pi   pi   64 Aug 14 23:53 /proc/1654/fd/2 -> /tmp/tmpfM6EBoC (deleted)
lrwx------ 1 pi   pi   64 Aug 14 23:53 /proc/1666/fd/2 -> /tmp/tmpfM6EBoC (deleted)
lrwx------ 1 pi   pi   64 Aug 14 23:53 /proc/1680/fd/2 -> /tmp/tmpfM6EBoC (deleted)
lrwx------ 1 pi   pi   64 Aug 14 23:53 /proc/24802/fd/2 -> /tmp/tmpfM6EBoC (deleted)
pi@raspberrypi /Temp $ ps up 1652
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pi        1652  0.0  0.3   4220   716 ?        S    Aug12   0:00 /bin/bash HTTP_Relay.sh
pi@raspberrypi /Temp $ ps up 1654
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pi        1654  0.2  1.5 138172  2824 ?        Sl   Aug12   7:10 ./HTTP_Relay_RPI_CON.exe
pi@raspberrypi /Temp $ ps up 1666
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pi        1666  0.0  0.4   4240   780 ?        S    Aug12   0:00 /bin/bash HTML_Server.sh
pi@raspberrypi /Temp $ ps up 1680
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pi        1680  4.7  0.5   4280  1080 ?        S    Aug12 162:40 /bin/bash daily_scheduler.sh
pi@raspberrypi /Temp $ ps up 24802
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pi       24802  1.2  9.3 140968 17524 ?        Sl   Aug12  42:48 ./HTML_Server_RPI_CON.exe
pi@raspberrypi /Temp $


The processes are all mine. But I don't create any files in "/tmp", so why are my processes filling it up?
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am
by Rilhas » Tue Aug 14, 2012 11:50 pm
Rilhas wrote:
pi@raspberrypi /media/RPIDISK/app $ lsof -a | grep tmp
-bash: lsof: command not found
pi@raspberrypi /media/RPIDISK/app $ sudo apt-get install laof
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package laof
pi@raspberrypi /media/RPIDISK/app $



Detected the typo and tried again:

pi@raspberrypi /Temp $ sudo apt-get install lsof
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  lsof
0 upgraded, 1 newly installed, 0 to remove and 28 not upgraded.
Need to get 321 kB of archives.
After this operation, 474 kB of additional disk space will be used.
Get:1 http://mirrordirector.raspbian.org/raspbian/ wheezy/main lsof armhf 4.86+dfsg-1 [321 kB]
Fetched 321 kB in 0s (410 kB/s)
Selecting previously unselected package lsof.
(Reading database ... 54573 files and directories currently installed.)
Unpacking lsof (from .../lsof_4.86+dfsg-1_armhf.deb) ...
Processing triggers for man-db ...
Setting up lsof (4.86+dfsg-1) ...
pi@raspberrypi /Temp $ lsof -a | grep tmp
lsof: no select options to AND via -a
lsof 4.86
 latest revision: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/
 latest FAQ: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/FAQ
 latest man page: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/lsof_man
 usage: [-?abhKlnNoOPRtUvVX] [+|-c c] [+|-d s] [+D D] [+|-f[gG]] [+|-e s]
 [-F [f]] [-g [s]] [-i [i]] [+|-L [l]] [+m [m]] [+|-M] [-o [o]] [-p s]
[+|-r [t]] [-s [p:s]] [-S [t]] [-T [t]] [-u s] [+|-w] [-x [fl]] [--] [names]
Use the ``-h'' option to get more help information.
pi@raspberrypi /Temp $


I did try the help, but I was unable to extract any meaning from it given that I don't even know what it is supposed to do... as far as I can understand I think "-a" is a valid option, but it seems to want to AND that with something...
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am
by jojopi » Wed Aug 15, 2012 12:28 am
Rilhas wrote:The processes are all mine. But I don't create any files in "/tmp", so why are my processes filling it up?
Interesting. They are all using the same deleted file for fd/2, which is stderr, the stream that UNIX programs conventionally write their error messages to. To see what they have written to the file, try "less /proc/1652/fd/2" where 1652 is any of the processes that still exist.

But the problem is not so much that they have written a lot of error messages, but rather that they have been writing to a file that is deleted. That means that sooner or later the filesystem has got to fill up, even if the errors come very slowly.

The fact that all these processes share the same randomly-named file as stderr means it must have been opened (and most likely deleted) by the same parent process. Can you determine a single parent service that has started all of the problem processes?
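
If the contents of such a file are not needed, its space can usually be reclaimed without killing the writers by truncating it through the same /proc path. This is a sketch under assumptions, not something tried on the system in this thread: it is Linux-specific, it discards the data, and if the writers did not open the file in append mode their next write lands at the old offset and leaves a sparse file. The scratch file and descriptor 3 below are stand-ins for the real /proc/1652/fd/2.

```shell
scratch=$(mktemp -d)
exec 3>>"$scratch/log"                 # stand-in for a process's stderr, opened in append mode
echo "error line 1" >&3
echo "error line 2" >&3
rm "$scratch/log"                      # deleted, but still open on descriptor 3
before=$(wc -c < /proc/self/fd/3)      # the data can still be read via /proc
: > /proc/self/fd/3                    # truncate the deleted file in place
after=$(wc -c < /proc/self/fd/3)
echo "bytes before: $before, after: $after"
exec 3>&-
rmdir "$scratch"
```

For another process, the path would be /proc/PID/fd/N (with sudo if the process belongs to a different user).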
Posts: 2013
Joined: Tue Oct 11, 2011 8:38 pm
by Rilhas » Wed Aug 15, 2012 12:57 am
jojopi wrote:Interesting. They are all using the same deleted file for fd/2, which is stderr, the stream that UNIX programs conventionally write their error messages to. To see what they have written to the file, try "less /proc/1652/fd/2" where 1652 is any of the processes that still exist.


Wow!!!

What happens is that I have an “autoexec.sh” script start from cron on boot. What that does is start a number of other scripts, with the output/error of each redirected to a log file (a different log file for every script). The log files are created in the scripts’ directory, and I can see they are there and that they are being written to.

One of the scripts seems to be out of control and keeps on writing to its log file (redirected output/error), which is now 55 MB. I’ll have to fix it, because this script is supposed to execute only once, at 4 A.M., and it is not waiting for that time at all.

But how does that affect bash’s ability to function properly? Can redirection of output/error streams cause problems for other processes in Linux? Should I not redirect the output? ... I guess fixing the out-of-control script would stop the problem, but, conceptually, is redirecting something that I should not do? This architecture was inherited from the Pi’s predecessor, a computer with far fewer resources where the only way for me to know what was happening was through verbose log files. I just copied all the scripts and the philosophy to the Pi’s disk.

jojopi wrote:But the problem is not that they have written a lot of error messages, rather that they have been writing to a file that is deleted. That means that sooner or later the filesystem has got to fill up, even if the errors come very slowly.


I don’t know who deleted them, and I don’t understand how they get stuck after deletion. I imagine that something deleted them but while the scripts are running and their outputs/errors are being redirected then the actual files are not purged from the file system and remain in existence, thus still taking up space. Do I understand correctly? If so, then the “deletor” must be some system cleanup process because I sure didn’t delete them!

jojopi wrote:The fact that all these processes share the same randomly-named file as stderr means it must have been opened (and most likely deleted) by the same parent process. Can you determine a single parent service that has started all of the problem processes?


I think you got it right, this all comes from the initial “autoexec.sh” script. However, I don’t really fully understand because each started script redirects to its own file. The files are there for each script, do you mean there is a “copy” of them taking up space in the “/tmp” directory? Or do you mean “stderr” as in them all having the same name even though they are different files on disk?

If I fix the runaway script, I guess the amount of information generated daily will not come close to the 37 MB of space that “/tmp” has available over my targeted 1 month of uptime (between reboots) for the Pi. But I only feel comfortable doing this if the technique of redirecting output/error is conceptually sound. Is it?

Thanks,
Rogerio
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am
by elatllat » Tue Aug 21, 2012 5:12 pm
Linux will fail (even to boot) if the main partitions are full, which is why ext4 reserves some blocks by default so that ordinary users cannot fill the disk completely.

A file's space is only released once its (hard) link count is 0 and no process still has it open. This applies to tmpfs as well as ext4.

It's generally bad practice to keep a file open longer than needed, for this and other reasons.

If you append
>/dev/null 2>&1

to your script commands it should silence them.
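
Note that the order of the two redirections matters when output should go to a file rather than /dev/null. A small sketch, where "noisy" is a made-up function standing in for one of the scripts:

```shell
good=$(mktemp); bad=$(mktemp); leaked=$(mktemp)
noisy() { echo "normal output"; echo "error output" >&2; }

# stdout goes to the file first, then stderr is pointed at the same place.
noisy >"$good" 2>&1

# Reversed order: stderr is pointed at the *old* stdout (here captured in
# $leaked), and only afterwards is stdout moved to the file.
{ noisy 2>&1 >"$bad"; } >"$leaked"

good_lines=$(wc -l < "$good")
bad_lines=$(wc -l < "$bad")
leaked_text=$(cat "$leaked")
echo "good: $good_lines lines, bad: $bad_lines lines, escaped: $leaked_text"
rm -f "$good" "$bad" "$leaked"
```

With the reversed order the error stream never reaches the log file and ends up wherever stdout pointed before it was redirected.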

Is autoexec.sh some kind of hacky replacement for crontab?
Posts: 1050
Joined: Sat Dec 17, 2011 5:05 pm
by Rilhas » Tue Aug 21, 2012 5:55 pm
Thanks for the explanation!

elatllat wrote:Is autoexec.sh some kind of hacky replacement for crontab?


It is not really intended as a hack. Its primary purpose is to provide an OS-independent startup sequence. Since I started messing with startup sequences back in the old MS-DOS days, what better name than autoexec? :)

The operative word is "sequence". The Pi's predecessor had very few resources, so I had to make sure that startup tasks would not overlap. I don't know of any way for cron to start multiple tasks in sequence, making sure task N has finished before starting task N+1, and as far as I understand it, doing so would actually subvert the whole point of cron. Having a script start things in sequence was very intuitive for me and easy to implement.

Now with the Pi I could start everything at once, but I still like the idea of having one centralized script where everything starts, because it helps when restoring a fully functional system from scratch by just copying files, so I kept it. Now cron just starts one single task at reboot, the autoexec.sh.
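
A sequential launcher of the kind described could look roughly like this. It is only a sketch: the step names and commands are placeholders, not the contents of the real autoexec.sh, and each step gets its own log with both output streams redirected.

```shell
logdir=$(mktemp -d)     # placeholder; a real script would use a fixed directory

run_step() {
    name=$1; shift
    status=0
    # Runs in the foreground, so the next step starts only after this one exits.
    "$@" >>"$logdir/$name.log" 2>&1 || status=$?
    echo "$name exited with status $status" >>"$logdir/sequence.log"
}

run_step first  sh -c 'echo one'
run_step second sh -c 'echo two >&2'
run_step third  false

summary=$(cat "$logdir/sequence.log")
second_log=$(cat "$logdir/second.log")
echo "$summary"
rm -rf "$logdir"
```

A long-running service would have to be started in the background with "&", at which point the script no longer waits for it.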
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am
by elatllat » Thu Aug 23, 2012 10:42 pm
Rilhas wrote:...autoexec.sh .. primary purpose is to provide an OS-independent startup sequence


As I don't know what's in said file, I can only say that it should probably exist as a set of services in /etc/init.d/, or be called from /etc/rc.local.
man update-rc.d

Unless it's written in a purely interpreted language (Java, Perl, Python, etc.), it's not "OS-independent".

service example:
viewtopic.php?p=155377#p155377
Posts: 1050
Joined: Sat Dec 17, 2011 5:05 pm
by Rilhas » Sun Aug 26, 2012 1:52 pm
elatllat wrote:As I don't know what's in said file I can only say that It should likely exist as a bunch of services in /etc/init.d/, or called from /etc/rc.local.


The autoexec file just contains a sequence of "start this", "start that", "wait a little", "start that other thing", etc. So, at a first glance it might seem that putting all this in "init.d" would be more appropriate.

However, even getting autoexec to run on startup was a problem of its own, because the disk where everything lives is connected through USB, so I spent about 3 days trying all shapes and sizes of "init.d" variations to get things launched at boot time (http://www.raspberrypi.org/phpBB3/viewtopic.php?p=142242#p142242). In the process I realized that using "init.d" extensively would not be as portable as using a script, because script interpretation on Linux has far fewer variants than "init.d". So I concluded that scripts are more "OS independent" than "init.d". What I mean is that if my setup depends heavily on which version of Linux it is running on, then it is not really OS independent.

An additional problem of using "init.d" is the already mentioned inability to make sure that things don't get started simultaneously. I like the idea that I can go back to less powerful computers than the Pi, as it was with the Pi's predecessor. Starting things from "init.d" would make this benefit disappear without any obvious advantage over "autoexec.sh".

The other point is the second important thing: restoring a system. It is much easier to restore a system where you just need to configure it to call one "autoexec.sh" file than one where you need to configure it to call 10 things. Not to mention how much easier it is to edit the sequence, adding something or disabling something else. Well, at least that is how it feels from the perspective of a newb like myself! :)
Posts: 20
Joined: Sat Aug 04, 2012 7:52 am