2009-12-18

Merge sort in Scala and Processing (2/3)

(Part 1 2 3)


For my merge sort demonstration, I wanted a factory object producing snapshots of progressive sorting in an array of integers. It would essentially be a (partial) function with the signature
(Array[Int]) ⇒ RepeatingIterator[Int].


The argument to the factory is an array of integers that represents the initial state, before sorting.


The class RepeatingIterator is derived from Iterator. It repeats infinitely by yielding the first item after the last, and so on.


  object SortStates {
    def apply (init: Array[Int]) = {
       Create stages 

       Define conversion function 

      new RepeatingIterator(Seq(init) ++
        (stages map convFn))
    }
  }


Create stages


The states of partial to complete ordering are mapped from an instance of Stages (which is a structure that will be described in part 3; suffice to say that it provides a method def map (f: List[Int] ⇒ Array[Int])).


      val size   = init.size
      val stages = new Stages(size)

Define conversion function


The function passed to map is an object whose apply method takes two arguments: a copy of the initial state, and one of the stages mapped over.


      val convFn = StateFunc(init take size) _

The StateFunc object


This object is a function (Array[Int])(List[Int]) ⇒ Array[Int] that takes as its first argument the previous state of the values. The second argument is a list that describes one stage in the merging process. After building a state with buildState, it is written back to the data parameter and finally returned.

object StateFunc {
   Define merge method 

   Define splitMerge method 

   Define buildState method 

  def apply (data: Array[Int])(stage: List[Int]) = {
    val state =
      buildState(data, stage).flatten.toArray

    for (i ← 0 until data.size)
      data(i) = state(i)

    state
  }
}

Define buildState method


The buildState method builds a new state by merging the sorted lists of the previous state. One merge operation is done for each element of the stage data structure. During each merge operation, the following constants are defined:


n
the length of the resulting sublist
s
the total length of the sublists that are already merged
r
a list of n values past the s first values of data

If the sublist length (n) is 1, it is already sorted and r is simply added to the result. Otherwise, r is passed to the method splitMerge.


  private def buildState (data: Array[Int], stage: List[Int]) = {
    for {
      i ← 0 until stage.size
      n = stage(i)
      s = (stage take i sum)
      r = (data drop s take n).toList
    } yield if (n == 1) r else splitMerge(r)
  }

Define splitMerge method


The splitMerge method splits its parameter (which has n values) into two sublists with n / 2 and n - n / 2 values, respectively. The subroutine merge is then applied to the two lists.


  private def splitMerge (r: List[Int]) = {
    val n   = r.size
    val n_2 = n / 2
    val xs1 = r take n_2
    val xs2 = r drop n_2 take (n - n_2)
    merge(xs1, xs2)
  }


Define merge method


The merge method is the same as in the mergeSort function in part 1.


  private def merge (a: List[Int], b: List[Int]): List[Int] =
    (a, b) match {
      case (_, Nil) ⇒ a
      case (Nil, _) ⇒ b
      case (ah :: at, bh :: _) if (ah <= bh) ⇒
        ah :: merge(at, b)
      case (_, bh :: bt) ⇒ bh :: merge(a, bt)
    }

And that's it for SortStates and StateFunc. In the final part I will show Stages and RepeatingIterator.

No comments:

Post a Comment