When I worked on name mangling in SlimIt I needed a function that would generate subsets of a character string in lexicographic order.

For example, if I had a character sequence *‘abc’* I would expect to call my function and get items in the following order: *‘a’, ‘b’, ‘c’, ‘ab’, ‘ac’, ‘bc’, ‘abc’*. The produced sequence is called a power set and binary counting is a well known algorithm for generating such power sets .

The idea is that we represent our character sequence as a binary string of ** n** bits, then start counting from zero to

*2**n – 1*and if a bit

**is set to**

*k***1**then we would put

**element from the character sequence into an output set.**

*k*-thLet’s say we have a sequence ‘abc’, then counting to

*2**3 – 1*would produce binary numbers 000, 001, 010, 011, 100, …, which correspond to ”, ‘c’, ‘b’, ‘bc’, ‘a’, …, in the character sequence.

Code for binary counting looks like this (credit for original implementation goes to Tim Peters):

def powerset(s): """powerset('abc') -> a b ab c ac bc abc""" pairs = [(2**index, char) for index, char in enumerate(s)] for index in xrange(1, 2**len(s)): yield ''.join(char for mask, char in pairs if mask & index)

And while it generates all necessary subsets they are not in lexicographic order.

So I decided to look at *itertools* module and check closer some properties of the *combinations* function available there and see if it could fit for my purposes and – wouldn’t you know it? – at the very bottom of the module’s documentation page I found a powerset recipe that did just what I needed using *combinations* function.

Here is a little bit adapted version of it:

import itertools def powerset(s): """powerset('abc') -> a b c ab ac bc abc""" for chars in itertools.chain.from_iterable( itertools.combinations(s, r) for r in range(1, len(s)+1) ): yield ''.join(chars)

I’ve been using Python for a long time now and still continue to bump into elegant solutions in Python standard library and that’s what I call the joy of Python

{ 2 comments… read them below or add one }

Interesting – but lexicographic order would be (a ab abc ac b bc c) [ie. as they appear in a dictionary] – also, it seems that neither solution generates the empty set which is always a member of a power set.

Hi Martin,

1. As for lexicographic order I should’ve been more specific by saying: “… in lexicographic order for subsets of the same length”.

I.e. a < b < c, ab < ac < bc, and so on.

2. Because I didn't need an empty subset I started to count from 1. If you need that empty subset start counting from 0 and modify

xrangeandrangecorrespondingly in the code snippets.P.S. The details of the post definitely wouldn’t pass a rigorous academic analysis, but that was not the goal