NIO Question
Hi all,
I'm writing a utility class to handle files using the java.nio package. I have a question about how to manipulate large files.
The method below reads the binary content of a file.
public byte[] readContent(File file) throws IOException, FileNotFoundException {
FileChannel channel = new FileInputStream(file).getChannel();
ByteBuffer buffer = ByteBuffer.allocate((int)file.length());
channel.read(buffer);
buffer.flip();
channel.close();
return buffer.array();
}
My question is about the ByteBuffer.allocate method. Can I have problems when the length of the file exceeds the int limit? I think that the cast operation can be very dangerous...
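To make the concern about the cast concrete, here is a minimal sketch of what happens when a long length is narrowed to int. The class name CastDemo and the helper checkedLength are hypothetical names used only for illustration; they are not part of java.nio.

```java
import java.nio.ByteBuffer;

class CastDemo {
    // Hypothetical helper: returns the length as an int, or fails loudly
    // instead of silently truncating it.
    static int checkedLength(long length) {
        if (length < 0 || length > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("Length does not fit in an int: " + length);
        }
        return (int) length;
    }

    public static void main(String[] args) {
        long threeBillion = 3_000_000_000L;   // larger than Integer.MAX_VALUE
        long overFourGiB = (1L << 32) + 10;   // 4 GiB plus 10 bytes

        // The cast keeps only the low 32 bits of the long.
        System.out.println((int) threeBillion); // prints -1294967296
        System.out.println((int) overFourGiB);  // prints 10

        try {
            // A negative capacity is rejected by allocate().
            ByteBuffer.allocate((int) threeBillion);
        } catch (IllegalArgumentException e) {
            System.out.println("allocate rejected the negative capacity");
        }

        try {
            checkedLength(overFourGiB);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second case is arguably the more dangerous one: the cast yields a small positive number, so allocate succeeds and the method quietly returns only a fraction of the file.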
How can I write effective code to handle large files?
I tried this:
ByteBuffer buffer = ByteBuffer.allocate(1024);
while (reader.read(buffer) != -1){
...
But that feels like java.io rather than java.nio, more like streams than blocks.
How should I proceed?
Thank you all.
[1365 byte] By [Bjornna] at [2007-11-26 12:22:13]

# 1
> Can I have problems when the length of the file
> exceeds the int limit?
Of course. You'll also have problems when the file size exceeds available VM. If the files are arbitrarily large you shouldn't be looking to put them all into memory at all. That's why they are files.
Have a look at FileChannel.map() for another alternative.
Sooner or later you're going to have to deal with buffers smaller than the file.
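As an illustration of working with a buffer smaller than the file, here is a minimal sketch that passes the whole file through one small reusable ByteBuffer and computes a CRC32 along the way. The class name ChunkedRead, the method checksum and the 8192-byte buffer size are assumptions made for the example; the checksum is only a stand-in for whatever processing is really needed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

class ChunkedRead {
    // Hypothetical helper: reads the file piece by piece and returns its CRC32.
    // bufferSize must be positive, otherwise read() could never make progress.
    static long checksum(Path path, int bufferSize) throws IOException {
        CRC32 crc = new CRC32();
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
            while (channel.read(buffer) != -1) {
                buffer.flip();      // switch from filling to draining
                crc.update(buffer); // consumes everything between position and limit
                buffer.clear();     // make the buffer ready for the next read
            }
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("chunked", ".dat");
        try {
            Files.write(path, new byte[100_000]);
            System.out.println(Long.toHexString(checksum(path, 8192)));
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

Memory use stays at the buffer size regardless of the file length, and the file size never has to be cast to int.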
ejpa at 2007-7-7 15:15:06 >

# 2
Has anybody ever needed to handle large files? Could somebody post some sample code?
I'm trying to use FileChannel.map, but my progress is very slow, and I would like to compare notes with others so that I do the "right thing".
Is the approach I showed above wrong? What would happen if I handled files that way? Why is FileChannel.map better? Or worse?
Thanks.
# 3
FileChannel.map() is better because you can map part of the file, and because all the I/O is implicit - you just access the ByteBuffer and it all happens.
It is worse because there's no agreed mechanism for releasing the memory, so you can't handle a large number of MappedByteBuffers in a single run of your program.
There are many other approaches but if the files are large they must all revolve around reading (and writing) parts of the file. Any design where an entire file is read into memory is automatically suspect in my book and needs a correctness proof or an adequacy proof (such as a known limit on the size of the file).
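A minimal sketch of mapping only part of the file at a time might look like the following. The class name WindowedMap, the method sumBytes and the window size are hypothetical choices for illustration; summing the bytes is just a placeholder for real work. Each window must be positive and no larger than Integer.MAX_VALUE.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class WindowedMap {
    // Hypothetical helper: maps the file one window at a time and sums its bytes.
    static long sumBytes(Path path, long windowSize) throws IOException {
        long sum = 0;
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = channel.size();
            for (long position = 0; position < size; position += windowSize) {
                // The last window may be shorter than the others.
                long length = Math.min(windowSize, size - position);
                MappedByteBuffer window =
                        channel.map(FileChannel.MapMode.READ_ONLY, position, length);
                while (window.hasRemaining()) {
                    sum += window.get() & 0xFF; // treat each byte as unsigned
                }
                // There is no explicit unmap: the window is released only
                // when the buffer is garbage collected.
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("windowed", ".dat");
        path.toFile().deleteOnExit();
        byte[] data = new byte[10_000];
        java.util.Arrays.fill(data, (byte) 1);
        Files.write(path, data);
        System.out.println(sumBytes(path, 4096)); // prints 10000
    }
}
```

Because position and size are both long, the same loop works for files well beyond 2 GB, although the caveat above about releasing mapped memory applies to every window created.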
You can see my book http://www.telekinesis.com.au/wipv3_6/FundamentalNetworkingInJava.A21 for an exposition of NIO.
ejpa at 2007-7-7 15:15:06 >

# 4
Hi everybody,
I have a similar question.
I'm testing the java.io and java.nio packages for reading a large file. I need to read the entire file (not load the entire file into memory!).
I noticed that java.nio is 10-15% faster than java.io ( fairly ;) ).
But when I read a very large file (more than 2 gigabytes) I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
This is the offending call:
MappedByteBuffer mapBuffer = fileChannel.map(MapMode.READ_ONLY, 0, file.length());
Documentation says:
public abstract MappedByteBuffer map(FileChannel.MapMode mode, long position, long size) throws IOException
while the parameter description says:
size - The size of the region to be mapped; must be non-negative and no greater than Integer.MAX_VALUE
What is the sense of this?
Do I have to fragment the file to read it?
Thank you all.
# 5
The sense of all this is that you are trying to map the entire file into memory even though you said you don't want to do that. See reply #3.
Message was edited by: ejp
ejpa at 2007-7-7 15:15:06 >

# 6
I said I don't want to load it, but I need to map it, because I have to read it entirely.
I asked about the sense of accepting a long parameter when it can't have a value greater than an int...
# 7
hi,
I'm facing a similar problem.
I want to operate on huge files whose size can run to gigabytes.
I'm using io for this and have never used nio up till now.
I would like to know which would be faster for operating on such huge files, io or nio?
And how?
thankx
rudza at 2007-7-7 15:15:06 >

# 8
> hi,
> I'm facing a similar problem..
> I want operate on huge files whose size can go in
> GB's.
> I'm using io for this and have never used nio uptil
> now..
> I would like to know which would be faster for
> operating on such huge files io or nio?
> And how?
>
> thankx
Operate = read / write?
Try this:
File f = new File("HugeFile.dat");
RandomAccessFile raf = new RandomAccessFile(f, "rw");
FileChannel fileChannel = raf.getChannel();
// Note: map() rejects sizes above Integer.MAX_VALUE, so this works only for files up to 2 GB.
MappedByteBuffer mappedByteBuffer = fileChannel.map(MapMode.READ_WRITE, 0, f.length());
# 9
> I said I don't want to load it, but I need to map it, because I have to read it entirely.
That doesn't follow. You can read it entirely without using mapping at all.
> I asked about the sense of accept a long parameter,
> but it can't have a value greater than an int...
Because there is only a finite amount of memory available for file mapping.
You need to investigate reading your file a piece at a time. Whether this is 'slower' is irrelevant if the file can be large enough to prevent you using any other technique.
ejpa at 2007-7-7 15:15:06 >

# 10
Thanks ejp,
I understand about reading my file a piece at a time.
But again ( 8^P ) ... it accepts a long, but really it wants an int... Why not accept an int directly?
# 11
Good question. I would say that Sun wanted to give themselves the option of raising the limit some time without breaking the binary compatibility of the method signature.
ejpa at 2007-7-7 15:15:06 >

# 12
Ok. I like to believe so ;)
Thanks a lot!
# 13
You can try this:
FileInputStream fis = new FileInputStream(filename);
FileChannel fc = fis.getChannel();
// Map the whole file read-only, then decode its bytes into a CharBuffer
ByteBuffer bbuf = fc.map(FileChannel.MapMode.READ_ONLY, 0, (int)fc.size());
CharBuffer cbuf = Charset.forName("8859_1").newDecoder().decode(bbuf);
return cbuf;
======================================================
Use cbuf instead of buffer.array(); it may help.
# 14
Who said anything about chars?
And that code still maps the entire file, which is what everybody in this old thread has been trying to get away from. Bad idea.
ejpa at 2007-7-7 15:15:06 >
