Finding All subsets of a set
Hi,
I tried to find all the subsets of a set. But it is taking a long time. If the number of elements is 9, it is taking around 2 minutes. if it is increased to 13, it is taking a lot more time (I waited for more than 4-5 minutes, but no output yet). Is there any efficient algorithm to find All subsets of a set.
I am using java 1.5 with generics in the algorithm.
Thanks,
Phani
# 1
> Is there any efficient algorithm to find All subsets of a set.No. The number of possible combinations explodes exponentially.Welcome to NP, O(2^n) O(n^n) O(n!)
# 2
Indeed, the number of combinations is quite big. A good algorithm for generating combinations can be found here: http://en.wikipedia.org/wiki/Combinadic. You just have to adapt it for subsets.The complexity of the algorithm is not influenced by the Java version or the usage of
# 3
> Hi,
>
> I tried to find all the subsets of a set. But it is
> taking a long time. If the number of elements is 9,
> it is taking around 2 minutes. if it is increased to
> 13, it is taking a lot more time (I waited for more
> than 4-5 minutes, but no output yet). Is there any
> efficient algorithm to find All subsets of a set.
>
> I am using java 1.5 with generics in the algorithm.
>
> Thanks,
> Phani
Phani,
I developed an algorithm that computes the power set (set of all subsets) excluding null set. For a set of 9 elements (Integer) it takes less than a second.
I benchmarked it for a set of 15 elements (Integer) and it takes about 330 seconds.
Would you want to share your code?
import java.util.*;
/*
* @author Manoj Dhanji
*/
public class TestQ1
{
final static java.text.DecimalFormat df = new java.text.DecimalFormat("00");
final static double HUNDRED = 100.0d;
public static void populateArray(Object[] o){
for (int i = 0;i<o.length; i++){
o[i] = new Integer(i);
}
}
public static String displayArray(Object[] o){
StringBuffer buffer = new StringBuffer();
buffer.append("[");
int i = 0;
for (;i><o.length-1; i++)
buffer.append(o[i]+",");
buffer.append(o[i]);
buffer.append("]");
return buffer.toString();
}
public static void main(String[] args)
{
if(args.length>0){
try{
int size = Integer.parseInt(args[0]);
Object [] o = new Object[size];
populateArray(o);
Collection coll = new HashSet();
Date d1 = new Date();
System.out.println("@@Determing subsets for: "+displayArray(o));
System.out.println("@@Start time: "+d1.toLocaleString());
getSubsets(o, coll, 1);
Date d2 = new Date();
System.out.println("@@End time: "+d2.toLocaleString());
System.out.println("@@Power Set: "+coll);
System.out.println("@@Size: "+coll.size());
System.out.println("@@Time: "+df.format((d2.getTime() - d1.getTime())/1000)+" seconds");
}
catch(NumberFormatException nfe){
nfe.printStackTrace(System.err);
}
}
else
System.err.println("@@Usage: com.sheridan.prog38081.TestQ1 arg");
}
public static void getSubsets(Object[] o, Collection coll, int n){
for (int i = 0;i<o.length ;i++ ){
Set s = new HashSet();
if(n==1){
s.add(o[i]);
coll.add(s);
}else if(n>1){
if(n>o.length)
return;
generateHigherOrderSubSets(coll, n, o);
}
}
getSubsets(o, coll, ++n);
}
public static void generateHigherOrderSubSets(Collection coll, int n, Object[] o){
List list1 = getList(coll, 1);
List list2 = getList(coll, n-1);
String pMessage=new String("@@Completed :"+ df.format(coll.size()*HUNDRED/Math.pow(2, o.length))+"%");
for (int x = 0;x<list2.size() ;x++ ){
String nMessage=new String("@@Completed :"+ df.format(coll.size()*HUNDRED/Math.pow(2, o.length))+"%");
if(!pMessage.equals(nMessage)){
pMessage=nMessage;
System.out.println(pMessage);
}
for (int y= 0;y<list1.size() ;y++ ){
Set s = new HashSet();
if(!getListOfObjectsFromASetContainedInAList(list2, x).contains(getListOfObjectsFromASetContainedInAList(list1, y).get(0))){
List l1 = getListOfObjectsFromASetContainedInAList(list2, x);
l1.add(getListOfObjectsFromASetContainedInAList(list1, y).get(0));
for (int z=0;z<l1.size() ;z++ )
s.add(l1.get(z));
if(!coll.contains(s))
coll.add(s);
}
}
}
}
//From a list that that contains Sets
//Copy the Set to an Object array
//transform the array into a List
public static List getListOfObjectsFromASetContainedInAList(List list, int x){
Object [] o = (Object[])((Set)list.get(x)).toArray(new Object[0]);
List l = new ArrayList();
for (int i = 0 ;i><o.length ;i++ )
l.add(o[i]);
return l;
}
//From a Collection
//get a List of Sets of a certain size
public static List getList(Collection coll, int size){
List list = new ArrayList();
for (Iterator iter = coll.iterator();iter.hasNext(); ){
Set s = (Set)iter.next();
if(s.size()==size){
list.add(s);
}
}
return list;
}
}
>
# 4
I realized that there was a flaw in the getSubsets method. I was recursively calling the method and initiating the loop. Sorry, my bad. Here's a clean up. A set of 16 members costs only 58 seconds on a lousy machine.
import java.util.*;
import java.text.DecimalFormat;
/*
* Manoj Dhanji
*/
public class SetUtility{
public Collection getCollection(){
return this.coll;
}
private DecimalFormat df;
private Collection coll;
private Object[] o;
public SetUtility(Set set){
this(new java.text.DecimalFormat("00"), set);
}
public SetUtility(DecimalFormat df, Set set){
super();
this.df=df;
this.coll=Collections.synchronizedSet(new LinkedHashSet());
this.o=fromSetToArray(set);
System.out.println(display());
}
private static Object[] fromSetToArray(Set set){
return set.toArray(new Object[0]);
}
private String display(){
StringBuffer buffer = new StringBuffer();
buffer.append("[");
int i = 0;
for (;i<this.o.length-1; i++)
buffer.append(this.o[i]+",");
buffer.append(this.o[i]);
buffer.append("]");
return buffer.toString();
}
public void getSubsets(){
for (int i = 0;i<this.o.length ;i++ ){
Set s = new LinkedHashSet();
s.add(o[i]);
coll.add(s);
}
getSubsets(1);
}
private void getSubsets(int n){
if(n><1)
return;
else if(n>1){
if(n>this.o.length)
return;
generateHigherOrderSubSets(n);
}
getSubsets(++n);
}
private String getCompleteMessage(int a, int b){
return new String("@@Completed :"+ this.df.format(a*100.0/Math.pow(2, b))+"%");
}
private void generateHigherOrderSubSets(int n){
List list1 = getList(coll, 1);
List list2 = getList(coll, n-1);
String pMessage=getCompleteMessage(coll.size(), this.o.length);
for (int x = 0;x<list2.size() ;x++ ){
String nMessage=getCompleteMessage(coll.size(), this.o.length);
if(!pMessage.equals(nMessage)){
pMessage=nMessage;
System.out.println(pMessage);
}
for (int y= 0;y<list1.size() ;y++ ){
Set s = new LinkedHashSet();
if(!getListOfObjectsFromASetContainedInAList(list2, x).contains(getListOfObjectsFromASetContainedInAList(list1, y).get(0))){
List l1 = getListOfObjectsFromASetContainedInAList(list2, x);
l1.add(getListOfObjectsFromASetContainedInAList(list1, y).get(0));
for (int z=0;z<l1.size() ;z++ )
s.add(l1.get(z));
if(!coll.contains(s))
coll.add(s);
}
}
}
}
//From a list that that contains Sets
//Copy the Set to an Object array
//transform the array into a List
private List getListOfObjectsFromASetContainedInAList(List list, int x){
Object [] localObjectArray = (Object[])((Set)list.get(x)).toArray(new Object[0]);
List l = new ArrayList();
for (int i = 0 ;i><localObjectArray.length ;i++ )
l.add(localObjectArray[i]);
return l;
}
//From a Collection
//get a List of Sets of a certain size
private static List getList(Collection coll, int size){
List list = new ArrayList();
for (Iterator iter = coll.iterator();iter.hasNext(); ){
Set s = (Set)iter.next();
if(s.size()==size){
list.add(s);
}
}
return list;
}
}
--
Driver program
--
import java.util.*;
public class DriverSetUtility
{
public static void main(String[] args)
{
if(args.length>0){
try{
final java.text.DecimalFormat df = new java.text.DecimalFormat("00");
int size = Integer.parseInt(args[0]);
Set s = new LinkedHashSet();
for (int i = 0;i<size;i++ )
s.add(new Integer(i));
System.out.println("@@Determing subsets for: ");
SetUtility setUtility = new SetUtility(s);
Date d1 = new Date();
System.out.println("@@Start time: "+d1.toLocaleString());
setUtility.getSubsets();
Date d2 = new Date();
System.out.println("@@End time: "+d2.toLocaleString());
System.out.println("@@Power Set: "+setUtility.getCollection());
System.out.println("@@Size: "+setUtility.getCollection().size());
System.out.println("@@Time: "+df.format((d2.getTime() - d1.getTime())/1000)+" seconds");
}
catch(NumberFormatException nfe){
nfe.printStackTrace(System.err);
}
}
else
System.err.println("@@Usage: com.sheridan.prog38081.DriverSetUtillity arg");
}
}
>
# 5
You can exploit the correspondence of the members of the power set to the numbers 0..2^n by creating your own Set implementations which store a reference to the original elements and a bitmask to indicate which elements are present:
class PowerSet<T> extends AbstractCollection<Set><T>> implements Collection<Set><T>> {
final T[] elts;
final intsize;
final int hashCode;
final intn;
PowerSet (Set<T> source) {
this.n = source.size();
this.elts = (T[])source.toArray();
this.size = 1 << n;
this.hashCode = (1 << (n-1)) * Arrays.hashCode(this.elts);
}
public int hashCode () { return this.hashCode; }
public boolean equals (Object e) {
return false;
}
public int size () {
return size;
}
class BitMaskSet extends AbstractCollection<T> implements Set<T> {
final int mask;
BitMaskSet (int mask) {
this.mask = mask;
}
public int hashCode () {
int hashCode = 0;
for (int i = 0, mask = this.mask; mask > 0; mask >>>= 1, ++i) {
if ((mask&1)==1) hashCode += elts[i].hashCode();
}
return hashCode;
}
public int size () {
int size = 0;
for (int mask = this.mask; mask > 0; mask >>>= 1) {
size += mask&1;
}
return size;
}
public Iterator<T> iterator () {
return new Iterator<T> () {
int i = 0;
int mask = BitMaskSet.this.mask;
public T next () {
while ((mask&1)==0) {
++i;
mask >>>= 1;
}
final T next = elts[i];
++i;
mask >>>= 1;
return next;
}
public boolean hasNext () {
return mask != 0;
}
public void remove () {
throw new UnsupportedOperationException();
}
};
}
}
public Iterator<Set><T>> iterator () {
return new Iterator<Set><T>> () {
int i = 0;
public Set<T> next () {
return new BitMaskSet(i++);
}
public boolean hasNext () {
return i < size;
}
public void remove () {
throw new UnsupportedOperationException();
}
};
}
};
Changing ManojD 's driver to iterate over each element of each set in the powerset instead of printing the whole thing out (which ensures that all sets are created, but doesn't mean you're waiting for thousands of sets to stream by) and moving the construction of d2to after that loop so it's included in the time, to give a fair benchmark for the lazy iterators:
final Collection<Set><Integer>> p = setUtility.getCollection();
if (size < 18) System.out.println("@@Power Set: " + p);
int hash = 0;
for (Set<Integer> e : p) {
for (Integer i : e) {
hash += i.hashCode();
}
}
System.out.println("@@Size: "+p.size());
System.out.println("@@TotalHash: "+hash);
Date d2 = new Date();
System.out.println("@@End time: "+d2.toLocaleString());
This gives around 8s to create and iterate over the power set of a set of 25 integers (PZ25), on a machine where ManojD's version took 328 s to output PZ18 (24s to run PZ16).
Usually you can change algorithms to use iterators, and using immutable lightweight objects rather than storing the elements in the general purpose classes gets better results, in this case around an 1000 times throughput. It also can iterate through PZ30 (263s), which would run a 32bit machine out of memory if you were to store all the results; for >30 you'd use longs rather than ints for the mask, though very soon you'd hit a limit where the time to iterate over all combinations becomes prohibitive.
Pete
# 6
Hi Pete,
Thanks for your reply.
I think there can be no other fastest way to find the powerset as you are using a brilliant technique of masking the bits.
Once again Hatsoff for your logic.
Now I can move forward with my petty project.
@Manoj
Your Algorithm also working fine. Thanks for your reply.
-Phani
# 7
HiI have an algorithm but it works till 19 elements after it it will crash.
# 8
There is a 3 line recursive algorithm at:
http://en.wikipedia.org/wiki/Power_set
19 is pretty good... the real limitation for this algorithm is going to be the amount of memory available. 19 is 2^19 or 524,000 sets...since java has a lot of memory overhead thats probably the best your going to do. Once you write into VM your application is going to die cause VM writes are soooo much slower than regular memory.
Depending upon your application, you could probably modify your algorithm to enumerate all the sets in the power set rather than return the entire set. That way you generate a set in the power set, perform an operation on the set, then toss the set. Thus you take O(1) memory at any given time(but still O(2^n) over the course of the algorithm).
Something like:
Enumeration e = new PowerSet(Set s)
while(e.hasMoreElements())
{
someClass.doSomething(e.nextElement())
}
The enumeration then generates the set in the power set as needed and not all at once.
# 9
the problem is, my algorithm generates new elements depending on old ones I wrote my code in other topic http://forum.java.sun.com/thread.jspa?threadID=5189207
# 10
> ...
> Once you write into VM your
> application is going to die cause VM writes are soooo
> much slower than regular memory.
That's a bit old school-thinking:
http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html
Also check out the references in that article.
# 11
Hi!I think the bottleneck is that you are using integers, if you want to overcome this, you can use the class BigInteger. The usage is more complex than the primitives values since all the operators are methods.Best regards