Which is better ListIterator or Simple list.size() in for loop

Hi everybody

I have one scenario in which, I need to delete the duplicate values from the list (Note : Duplicate values are coming in sequence).

I tried to approaches as follows

1) Using ListIterator

I iterated all values and if I found any duplicate then I called remove() method of Iterator.

2) I made another ArrrayList object, and iterated the old list using size() and for loop, and if I found unique value then I added that value to the new ArrayList.

I also created one test java file to find out the performance in both cases. But I am not pretty sure that this test is correct or not. Please suggest me which approach is correct.

code For Test

publicclass TestReadonly{

publicstaticvoid main(String[] args){

List list =new ArrayList();

long beforeHeap = 0,afterHeap = 0;

long beforeTime = 0,afterTime = 0;

addElementsToList(list);

Collections.sort(list);

callGC();

beforeHeap = Runtime.getRuntime().freeMemory();

beforeTime = System.currentTimeMillis();

System.out.println(" Before "+System.currentTimeMillis()+" List Size "+list.size()+" heap Size "+beforeHeap);

new TestReadonly().deleteDuplicated1(list);

afterHeap = Runtime.getRuntime().freeMemory();

afterTime = System.currentTimeMillis();

System.out.println(" After "+System.currentTimeMillis()+" List Size "+list.size()+" heap Size "+afterHeap);

System.out.println(" Time Differance "+(afterTime-beforeTime)+" Heap Differance "+(afterHeap-beforeHeap));

list.clear();

addElementsToList(list);

Collections.sort(list);

callGC();

beforeHeap = Runtime.getRuntime().freeMemory();

beforeTime = System.currentTimeMillis();

System.out.println(" Before "+System.currentTimeMillis()+" List Size "+list.size()+" heap Size "+beforeHeap);

list =new TestReadonly().deleteDuplicated2(list);

afterHeap = Runtime.getRuntime().freeMemory();

afterTime = System.currentTimeMillis();

System.out.println(" After "+System.currentTimeMillis()+" List Size "+list.size()+" heap Size "+afterHeap);

System.out.println(" Time Differance "+(afterTime-beforeTime)+" Heap Differance "+(afterHeap-beforeHeap));

}

/**

* @param list

*/

privatestaticvoid addElementsToList(List list){

for(int i=0;i<1000000;i++){

list.add("List Object"+i);

}

for(int i=0;i<10000;i++){

list.add("List Object"+i);

}

}

privatestaticvoid callGC(){

Runtime.getRuntime().gc();

Runtime.getRuntime().gc();

Runtime.getRuntime().gc();

Runtime.getRuntime().gc();

Runtime.getRuntime().gc();

}

privatevoid deleteDuplicated1(List employeeList){

String BLANK ="";

String currentEmployeeNumber =null;

String previousEmployeeNumber =null;

ListIterator iterator = employeeList.listIterator();

while (iterator.hasNext()){

previousEmployeeNumber = currentEmployeeNumber;

currentEmployeeNumber = (String) iterator.next();

if ((currentEmployeeNumber.equals(previousEmployeeNumber))

|| ((BLANK.equals(currentEmployeeNumber) && BLANK

.equals(previousEmployeeNumber)))){

iterator.remove();

}

}

}

private List deleteDuplicated2(List employeeList){

String BLANK ="";

String currentEmployeeNumber =null;

String previousEmployeeNumber =null;

List l1 =new ArrayList(employeeList.size());

for (int i =0;i<employeeList.size();i++){

previousEmployeeNumber = currentEmployeeNumber;

currentEmployeeNumber = (String) employeeList.get(i);

if (!(currentEmployeeNumber.equals(previousEmployeeNumber))

|| ((BLANK.equals(currentEmployeeNumber) && BLANK

.equals(previousEmployeeNumber)))){

l1.add(currentEmployeeNumber);

}

}

return l1;

}

}

Output

Before 1179384331873 List Size 1010000 heap Size 60739664

After 1179384365545 List Size 1000000 heap Size 60737600

Time Differance 33672 Heap Differance -2064

Before 1179384367545 List Size 1010000 heap Size 60739584

After 1179384367639 List Size 1000000 heap Size 56697504

Time Differance 94 Heap Differance -4042080>

[6649 byte] By [shinde_yogesha] at [2007-11-27 4:38:29]
# 1
Both approaches are correct. Have you checked how your algorithms behave when you pass a LinkedList instead of an ArrayList?You didn't try the obvious solution: using a Set instead of a List. Sets remove duplicates automatically.
jsalonena at 2007-7-12 9:48:58 > top of Java-index,Java Essentials,Java Programming...
# 2

I think, you test is ok like that. Although I would have tested with two different applications, just to be shure that the heap is clean. You never know what gc() actually does.

Still, your results show what is expected:

Approach 1 (List iterator) takes virtually no extra memory, but takes a lot of time, since the list has to be rebuild after each remove.

Approach 2 is much faster, but takes a lot of extra memory, since you need a "copy" of the original list.

Basically, both approaches are valid. You have to decide depending on your requirements.

Approach 1 can be optimized by using a LinkedList instead of an ArrayList. When you remove an element from an ArrayList, all following elements have to be shifted which takes a lot of time. A LinkedList should behave better.

Finally, instead of searching for duplicates, consider to check for duplicates when filling the list or even use a Map.

Tobias

tobias.winterhaltera at 2007-7-12 9:48:58 > top of Java-index,Java Essentials,Java Programming...
# 3
Yes I know, Ste can be used, but in my case the Arraylist would not contain only string but the bean with attributes name and ID. so I am comparing the both. And some times the name is blank or null and set can hold only one blank or null.Thanks for your help.
shinde_yogesha at 2007-7-12 9:48:58 > top of Java-index,Java Essentials,Java Programming...