Looping around an ArrayList of strings to find matches (more complicated!)
Help developing an algorithm
Posted: 03-05-2006 07:11 AM
Hi,
I have been bashing my head with this for a while, so any help is much appreciated!
I basically have ONE ArrayList which contains filenames as strings (e.g. file.txt, file1.xml, file2.doc, file2.xml, file2.txt).
The ArrayList basically contains document names together with their associated metadata document names. For example, file1.doc (document file) and file1.xml (metadata file).
There will be situations where, for example, file1.doc may not have an associated file1.xml file. And there may be situations where there are 2 documents and only one associated metadata file (e.g. file.doc, file1.txt, file1.xml).
I basically need to loop around the ArrayList, identifying only file names that:
- have an associated metadata file (once identified, put the document name in an ArrayList of matches)
- this ArrayList cannot contain duplicate documents with an associated file name (e.g. file.doc, file1.txt, file1.xml); if this particular situation occurs I want to put these filenames in an ArrayList of errors.
Please help! Thanks in advance.
[1180 byte] By [delboya] at [2007-10-2 14:00:57]

I'm a little lost as to how
>(e.g. file.doc, file1.txt, file1.xml)
contains duplicate documents with an associated filename. Either the .txt and .xml are both metadata files, and you are saying that there should never exist two metadata documents of the same name, or else you meant to say
>(e.g. file1.doc, file1.txt, file1.xml)
in which case you meant that two data files (the .doc and the .txt) with the same name (file1) cannot exist together if there exists a metadata .xml of the same name.
Assuming the latter is true, try something like this: (hot off the grill ;)
import java.util.*;

public class Blah {
    public static void main(String[] args) {
        List<String> fileNames = Arrays.asList( new String[] {
            "File1.txt", "File2.txt", "File1.xml", "File1.doc", "File1.blah", "File3.doc",
            "File4.xml", "File4.txt", "File5.doc", "File6.doc", "File6.xml", "File7.csv",
            "File7.xsl", "File7.xml", "File8.xml", "File9.doc", "File90.txt"
        } );
        String metadataExtension = "xml";

        // base name -> extensions of the non-metadata files sharing that name
        Map<String, Set<String>> pendingFileNames = new TreeMap<String, Set<String>>();
        // base names for which a metadata file was seen
        Set<String> metadataNamesFound = new TreeSet<String>();

        // result buckets
        Map<String, String> verifiedMetadataFileNamePair = new TreeMap<String, String>();
        Map<String, Set<String>> verifiedBadFileNames = new TreeMap<String, Set<String>>();
        Set<String> verifiedFileNamesWithoutMetadata = new TreeSet<String>();
        Set<String> verifiedMetadataWithoutFileName = new TreeSet<String>();

        // Pass 1: group every file name by its base name.
        for (String fileName: fileNames) {
            // Split on the last dot; split("\\.") would fail on names without a dot
            // and would mis-split names that contain more than one dot.
            int dot = fileName.lastIndexOf('.');
            if (dot < 0) continue; // no extension, so it cannot be classified
            String nameOnly = fileName.substring(0, dot);
            String extension = fileName.substring(dot + 1);
            System.out.println(nameOnly+" "+extension);

            Set<String> associatedExtensions = pendingFileNames.get(nameOnly);
            if (associatedExtensions == null) {
                associatedExtensions = new TreeSet<String>();
                pendingFileNames.put(nameOnly, associatedExtensions);
            }
            if (extension.equals(metadataExtension)) {
                metadataNamesFound.add(nameOnly);
            } else {
                associatedExtensions.add(extension);
            }
        }

        // Pass 2: classify each base name by how many documents it has
        // and whether a metadata file exists for it.
        for (String name: pendingFileNames.keySet()) {
            Set<String> fileExtensions = pendingFileNames.get(name);
            boolean metaNameWasFound = metadataNamesFound.contains(name);
            String fullMetaName = name+"."+metadataExtension;
            if (fileExtensions == null || fileExtensions.size() == 0) {
                // metadata file with no document
                if (metaNameWasFound)
                    verifiedMetadataWithoutFileName.add( fullMetaName );
            } else if (fileExtensions.size() == 1) {
                // exactly one document: a match if metadata exists
                String fullFileName = name+"."+fileExtensions.iterator().next();
                if (metaNameWasFound)
                    verifiedMetadataFileNamePair.put(fullFileName, fullMetaName);
                else
                    verifiedFileNamesWithoutMetadata.add(fullFileName);
            } else {
                // several documents: an error if they share one metadata file
                if (metaNameWasFound)
                    verifiedBadFileNames.put( name, fileExtensions );
                else {
                    for (String extName: fileExtensions) {
                        verifiedFileNamesWithoutMetadata.add(name+"."+extName);
                    }
                }
            }
        }

        System.out.println();
        System.out.println();
        // System.out.println(pendingFileNames);
        System.out.println("Verified one-to-one file/metadata pairs: \n"+verifiedMetadataFileNamePair);
        System.out.println("\nVerified errors (more than one document to one metadata): \n"+verifiedBadFileNames);
        System.out.println("\nVerified metadataless files: \n"+verifiedFileNamesWithoutMetadata);
        System.out.println("\nVerified fileless metadata: \n"+verifiedMetadataWithoutFileName);
    }
}
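
If all you need are the two lists from the original question (matches and errors), the same group-by-base-name idea can be condensed. This is only a rough sketch under a few assumptions: the class name MatchFinder and the method classify are made up for illustration, "xml" is taken to be the metadata extension, the errors list receives the document names only (not the .xml name), and documents without any metadata file are left out of both lists.

```java
import java.util.*;

public class MatchFinder {
    // Hypothetical helper: appends to 'matches' every document that is the only
    // document for its metadata file, and to 'errors' every document that shares
    // a metadata file with at least one other document.
    public static void classify(List<String> fileNames, String metaExt,
                                List<String> matches, List<String> errors) {
        // base name -> document file names (everything that is not metadata)
        Map<String, Set<String>> docsByName = new LinkedHashMap<String, Set<String>>();
        // base names for which a metadata file exists
        Set<String> metaNames = new HashSet<String>();

        for (String fileName : fileNames) {
            int dot = fileName.lastIndexOf('.');
            if (dot < 0) continue; // no extension: ignored in this sketch
            String base = fileName.substring(0, dot);
            String ext = fileName.substring(dot + 1);
            if (ext.equals(metaExt)) {
                metaNames.add(base);
            } else {
                Set<String> docs = docsByName.get(base);
                if (docs == null) {
                    docs = new LinkedHashSet<String>();
                    docsByName.put(base, docs);
                }
                docs.add(fileName);
            }
        }

        for (Map.Entry<String, Set<String>> entry : docsByName.entrySet()) {
            if (!metaNames.contains(entry.getKey())) continue; // no metadata at all
            Set<String> docs = entry.getValue();
            if (docs.size() == 1) {
                matches.add(docs.iterator().next());
            } else {
                errors.addAll(docs);
            }
        }
    }

    public static void main(String[] args) {
        List<String> matches = new ArrayList<String>();
        List<String> errors = new ArrayList<String>();
        classify(Arrays.asList("file.txt", "file1.xml", "file1.doc",
                               "file2.doc", "file2.xml", "file2.txt"),
                 "xml", matches, errors);
        System.out.println("matches: " + matches); // matches: [file1.doc]
        System.out.println("errors: " + errors);   // errors: [file2.doc, file2.txt]
    }
}
```

The LinkedHashMap and LinkedHashSet keep the files in the order they were first seen and drop exact duplicate names; swap in TreeMap and TreeSet if you would rather have sorted output, as in the longer version.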