Writing to a deleted file in Linux
Consider this scenario:
1. You have a Java app that opens a BufferedWriter on a file and keeps it open for the app's lifetime. Periodically it writes a record to the buffer, then immediately flushes it.
2. While this program is running, you delete the file from the command line.
3. What happens inside your app the next time it tries to write to the file?
If you guessed, "Absolutely nothing at all", congratulations!
It seems that the JVM should do one of the following instead:
a) Throw an IOException (that's what I would expect); or
b) Quietly re-create the file
I've seen this using Java 1.5 on SUSE 10.0 and a very old version of Red Hat. This appears to be a bug in Sun's JVM. Has anybody else seen this problem?
Here's a bit of code to repro the problem:
import java.io.*;

public class FileBug {
    public static void main(String[] args) throws Exception {
        // open the file, write a record
        BufferedWriter bw = openFile();
        writeRec(bw, "This is the first record...");
        // wait here 'til the file is deleted
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String str;
        System.out.print("Hit any key to continue...");
        str = in.readLine();
        // write another record
        writeRec(bw, "And this is the second record...");
        bw.close();
    }

    static BufferedWriter openFile() throws Exception {
        BufferedWriter bw = null;
        File f = new File("./test.junk.file");
        bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(f, true), "UTF-8"));
        return bw;
    }

    static void writeRec(BufferedWriter bw, String rec) throws Exception {
        bw.write(rec);
        bw.newLine();
        bw.flush();
    }
}
[2951 byte] By [mrplanea] at [2007-11-27 1:00:15]

# 1
I assume you know the expected behavior on Unix/Linux, right?
The expected behavior on Unix and Linux when process A writes to a file,
and process B deletes the file, is that A will be able to continue to write to the file...
which will in fact continue to take up more and more disk space...
And when process A closes the file or dies, the disk space will be released (provided no other hard link to the file remains).
This is at the Operating System level. So there's nothing JVM can do about it.
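To see this from inside Java, here is a minimal sketch. It assumes Linux (or another Unix-like system) and Java 7 or later; the class name DeletedFileDemo and the temp-file name are made up for illustration. It writes to a file, deletes the path, writes again, and then asks the still-open channel for its size.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DeletedFileDemo {
    // Returns the size reported by the open channel after its path has been deleted.
    static long sizeAfterDelete() throws IOException {
        Path p = Files.createTempFile("deleted-demo", ".txt"); // hypothetical scratch file
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap("first\n".getBytes(StandardCharsets.UTF_8)));
            Files.delete(p); // removes the directory entry; the open file itself stays
            ch.write(ByteBuffer.wrap("second\n".getBytes(StandardCharsets.UTF_8)));
            return ch.size(); // asks the OS about the open file, not about the path
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes in unlinked file: " + sizeAfterDelete());
    }
}
```

On Linux the second write should succeed without an exception and the reported size should cover both records (13 bytes here), even though the name is already gone from the directory.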
--
If you're wondering why Unix and Linux treat it this way,
then you need to read up on hard links in Unix and Linux.
Namely, a file inode can be referenced by multiple links.
For example: file1 and file2 may both be hard links to the same file.
At the OS level, you cannot distinguish which is the "main" link and which are extra links.
(Unlike "soft links" or "symbolic links" or "shortcuts" which do have
the distinction of file vs link)
So when process A opens "file1" and writes to it,
process B can delete "file1" and that's okay.
=> The additional contents that process A writes are still readable via "file2".
Now, suppose process C deletes "file2". That's still okay.
=> The additional contents that process A writes are still readable by process A itself (by using file seek)
or can be passed to another process via file-descriptor-passing.
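The file1/file2 case can be tried from a single Java process. This is only a sketch, assuming Java 7+ and a file system that supports hard links; the class name HardLinkDemo and the temp-directory name are invented for the example. Files.createLink plays the role of process B creating "file2".

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkDemo {
    // Writes via "file1", deletes "file1" mid-way, then reads everything back via "file2".
    static String readViaSecondLink() throws IOException {
        Path dir = Files.createTempDirectory("hardlink-demo"); // hypothetical scratch dir
        Path file1 = dir.resolve("file1");
        Path file2 = dir.resolve("file2");
        try (BufferedWriter bw = Files.newBufferedWriter(file1, StandardCharsets.UTF_8)) {
            bw.write("before delete\n");
            bw.flush();
            Files.createLink(file2, file1); // file2 becomes a second name for the same inode
            Files.delete(file1);            // the writer's name is gone, the inode is not
            bw.write("after delete\n");
            bw.flush();
        }
        String contents = new String(Files.readAllBytes(file2), StandardCharsets.UTF_8);
        Files.delete(file2);
        Files.delete(dir);
        return contents;
    }

    public static void main(String[] args) throws IOException {
        System.out.print(readViaSecondLink());
    }
}
```

Both records should come back through "file2", including the one written after "file1" was deleted.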
# 2
I have no problem with how the O/S handles this. My concern is that the JVM does not inform my application that all subsequent writes to the file will go to the bit bucket.
> This is at the Operating System level. So there's nothing JVM can do about it.
Not sure I buy this assertion. If I place a call to java.io.File#exists() immediately prior to writing the second record, it returns false. At that point the JVM does know that the file is gone. Yes, my application code could just always call File#exists() before each write, but that seems a bit harsh.
# 3
> I have no problem with how the O/S handles this. My
> concern is that the JVM does not inform my
> application that all subsequent writes to the file
> will go to the bit bucket.
But the file open is on the inode, not on the filename.
On UNIX/Linux, the filename is used initially to locate
the inode; once the inode is found, it is what the OS uses
for the rest of the process life time; the filename can be deleted/renamed/...
And, as I said, the contents do not disappear.
They can be read by the same process, or by another process
reading it via a different hard link.
> Not sure I buy this assertion. If I place a call to
> java.io.File#exists() immediately prior to writing
> the second record, it returns false.
I see. So you are only talking about the filename entry.
Note that this is not always correct either.
I can delete the file, then create a new blank file with the same name.
Then the JVM will once again be oblivious to the difference.
Eg, on Unix/Linux, we can do this:
1) Process A opens file1 and starts writing to it
2) Process B hardlinks file1 to file2
3) Process C deletes "file1", and creates a new blank file "file1"
4) Process A finishes writing 1 megabyte of text.
5) Now, if you try to dump "file1", you get 0 bytes. But if you try to dump "file2", you get the full 1 megabyte.
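Step 3 is the case File#exists() cannot catch, because the name exists again. One possible way for an application to notice it is to remember the file's identity when it opens the file and compare it later. The sketch below assumes Java 7+ and a platform where BasicFileAttributes.fileKey() is non-null (on Linux it typically wraps the device and inode numbers; elsewhere it may be null). InodeWatch, keyOf and stillSameFile are hypothetical names, not part of any library.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

public class InodeWatch {
    // Identity of whatever the path names right now, or null if the path is gone.
    static Object keyOf(Path p) throws IOException {
        try {
            return Files.readAttributes(p, BasicFileAttributes.class).fileKey();
        } catch (NoSuchFileException e) {
            return null;
        }
    }

    // True only if the path still names the file whose key was recorded at open time.
    static boolean stillSameFile(Path p, Object keyAtOpen) throws IOException {
        return keyAtOpen != null && keyAtOpen.equals(keyOf(p));
    }

    // Replays step 3: delete "file1" and re-create it while a stream is still open.
    static boolean[] demo() throws IOException {
        Path dir = Files.createTempDirectory("inode-watch"); // hypothetical scratch dir
        Path file1 = dir.resolve("file1");
        try (OutputStream out = Files.newOutputStream(file1)) {
            Object keyAtOpen = keyOf(file1); // small race: taken just after the open
            boolean before = stillSameFile(file1, keyAtOpen);
            Files.delete(file1);
            Files.createFile(file1); // same name, but a different inode
            boolean after = stillSameFile(file1, keyAtOpen);
            return new boolean[] { before, after };
        } finally {
            Files.deleteIfExists(file1);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = demo();
        System.out.println("same file before replace: " + r[0]);
        System.out.println("same file after replace:  " + r[1]);
    }
}
```

A writer could run such a check periodically and reopen the path when it fails. It is a heuristic rather than a guarantee: the key is read from the path, not from the open stream, so there is a short window between opening and recording it.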
# 4
I see what you're saying. Once the final link to the file's inode is gone and no process still has the file open, the file system will reclaim that inode and its data blocks. This seems to be a fairly nasty problem if you're writing critical/sensitive data that simply can't be lost. I guess the obvious solution is, don't delete the file (or at least set its permissions so that it's more difficult to delete). I suppose in a production environment you'd be wise to write to a mirrored file system.
Thanks for the help.