Mark Gritter (markgritter) wrote,
Mark Gritter

For your further geeky delectation

"A New Method to Index and Query Sets" by J. Hoffmann and J. Koehler addresses a restricted instance of a problem that I've encountered in quite a few different areas. Given a poset P and some set S of elements from P, how can you efficiently determine whether a new element is less than (or greater than) some element of P?

This paper answers that problem for sets (ordered by inclusion) and makes use of the sets' internal structure. Their idea is to build a forest encoding all the set by using a total ordering of the elements within the sets. So if you're working with sets of the natural numbers, you have a tree containing all the sets that have a 1 in them, a three with all the sets that have 2 (but not 1), etc. The trees corresponding the minimal elements of your ordering will probably be much more populated than their peers, unfortunately, so this could pose some performance problems.

I ran into a similar problem building a structure for anagram (and word subset/superset) search. If you convert each word to a 'signature' by sorting the letters, then a trie doesn't work as nicely any more--- the 'A' branch has tons of words and the 'W' branch has just a few. I should have just used a hash table since prefix searching on signatures isn't particularly meaningful.

However, this algorithm could still be very useful in a lot of problem domains where the typical set is a small fraction of the total number of elements. The Klee implementation uses it to cache known satisfiability problem instances, where each set contains just a few of the enormous universe of possible clauses. (Which is how I got here.)

For further followup: the references in The Daskalakis et al paper referenced there uses a representation that decomposes P into chains but from a quick skimming of the paper I'm not 100% sure how to build such decompositions.
Tags: algorithm, programming
  • Post a new comment


    default userpic

    Your reply will be screened

    Your IP address will be recorded 

    When you submit the form an invisible reCAPTCHA check will be performed.
    You must follow the Privacy Policy and Google Terms of use.