Sal's profileWindows Live spaceBlogNetwork Tools Help

Blog


    December 24

    N-Way Exchanges, Part 2

    Computational Requirements for an n-Way Exchange, Part I

     

    The underlying matching problems that describe an n-Way exchange can be solved with a variety of similar algorithms on a variety of different architectural platforms. For instance, the primary problem, optimal matching, can be solved in most programming languages, in memory, or on a database, on a single node or across a distributed cluster, given adequate CPU, memory, and bandwidth.

     

    Large scale solutions lend themselves very well to distributed memory architectures, in particular, Java Spaces (ala GigaSpaces), Distributed HashMaps (ala Oracle Coherence), even database clusters. Why? Here is the problem- n-Way exchanges are essentially large matching engines; more so, they are large multi-way matching engines. Simple matching engines, like a reconciliation system, perform 1->1 matching. N-Way exchanges may perform n->n matching, or set matching. This is far more computationally intensive.

     

    Essentially, set matching is a combination problem. In mathematics, a combination is an un-ordered collection of unique sizes. For example, given a set of integers, [1, 2], all combinations include [1],[2], [1,2]. [1,2] and [2,1] are equivalent here. A permutation is an ordered collection. For the same set of integers, all permutations include [1],[2], [1,2] and [2,1]. Here the ordering is significant and the last two sets are considered unique. The optimal solution to matching on an n-Way exchanges involves examining combinations for the optimal conditions.

     

    Problems of this nature are considered by Computational Complexity Theory to be NP-HARD. That is, they take non-deterministic polynomial time to compute, a potentially long time depending on the size of the problem, but are fairly rapid to prove. In our case, the size is the size of input sets (the buy side set and the sell side set). For other algorithms, like quicksort, the compute time is mainly driven by the size of sort input. It's called Polynomial Time because the runtime can be expressed as a polynomial function where the input is the size of the problem. For quicksort it is on average, n Log n comparisons where n is the number of items to sort, and at most n^2 comparisons.

     

    Whether the answer to a match request is optimal is rapid to prove relative to compute, which is required for NP-Hard problems. Once all the combinations are computed, a proof is simply checking each set for the criterion established to be optimum. In our case, the criterion might be minimizing the cost of shipping to the buyers or something more complex like effect on inventory, or a combination of factors. However, the cost of computing the cost each set is considered to be baked into the cost of computing the set - Yes, you may want to read that again :) - which is a rare convenience for our definition of NP-Hard.

     

    Let's look at a simple but practical example, an aggregating exchange where sellers are looking to sell to a pool of buyers. In this example a seller offers 10 HDTV sets (units) for $1000 each, a great deal. This offer is all-or-nothing, that is, exactly 10 units, not less or more. This somewhat realistic, since the units may come in pallets of 10 that the seller will not break if it results in a remainder. This is the state of the exchange at this moment:

     

    image

     

    Without optimization there are three solutions (combinations) that work, (501, 1001), (501, 320, 321), and (1001, 320, 321). Each combination results in 10 units sold. The exchange selects the winning set by selecting the optimal set. Again, this might be the set that minimizes shipping costs.

    A bit about exchanges as business entities- in almost all cases exchanges fulfill matches at the lowest possible price, defined as the least the buyer is willing to take. In this example the seller is willing to take $1000. All matches will be fulfilled at that price regardless of a higher bid. Now, that doesn't mean the exchange will ignore the bid price. The exchange may want to reward a buyer for being willing to pay more even he doesn't.

     

    However, in most cases where physical goods are involved shipping is a concern. Assuming that all the buyers are relatively the same distance/cost away from the fulfillment center, the next logical optimization is to minimize the shipping destinations. In this case that often means choosing the smallest set, (501, 1001), only two shipping destinations. The pallet is shipped to a fulfillment center, the pallet is broken down by a 3rd party fulfillment service, and two shipments are sent out.

     

    This is a simple example. A real example could have computed a whole set of costs for each set, shipping, fees, even probability of rejection at the destination, for instance, for damage, based on previous deliveries to that buyer or region. Some algorithms may not consider the smallest set optimal because it put too many eggs in each delivery basket in case of a rejection or payment problem. If two bids are exactly the same, but only one can be used, timestamp is usually important, first come, first serve. What is important is that once (or as) the possible combinations are generated each exchange will use a set of arbitrary tests to pick the optimal combination.

     

    So what is the hard computational part? To demonstrate, just take this example and increase the size of offer to 100 @ some price, all-or-nothing, and increase the number of bids to maybe 100 viable bids (over, or at, the ask price) at various sizes, and try crunching that on single node. It can take a while. Without algorithmic optimization, the system will have to generate many combinations and examine each one.

      

    How many combinations will the system have to look at? The number of possible combinations of a system n values is defined as (n! +1). Where "!" indicates a factorial. That is 3! = 1x2x3 and 5! = 1x2x3x4x5. 0! = 1. For us n is 100, because the total number of viable bids is 100. That is 100! + 1 or 9.33262154439441E+157 combinations.

     

    Next: Algorithmic Optimization

     


    Comments

    Please wait...
    Sorry, the comment you entered is too long. Please shorten it.
    You didn't enter anything. Please try again.
    Sorry, we can't add your comment right now. Please try again later.
    To add a comment, you need permission from your parent. Ask for permission
    Your parent has turned off comments.
    Sorry, we can't delete your comment right now. Please try again later.
    You've exceeded the maximum number of comments that can be left in one day. Please try again in 24 hours.
    Your account has had the ability to leave comments disabled because our systems indicate that you may be spamming other users. If you believe that your account has been disabled in error please contact Windows Live support.
    Complete the security check below to finish leaving your comment.
    The characters you type in the security check must match the characters in the picture or audio.

    To add a comment, sign in with your Windows Live ID (if you use Hotmail, Messenger, or Xbox LIVE, you have a Windows Live ID). Sign in


    Don't have a Windows Live ID? Sign up

    Trackbacks

    The trackback URL for this entry is:
    http://livinginthelab.spaces.live.com/blog/cns!9D3FF9467EE93B4D!957.trak
    Weblogs that reference this entry
    • None