gemini - kennedy.gemi.dev

💾 Archived View for gemini.spam.works › mirrors › textfiles › programming › sort.txt captured on 2023-01-29 at 11:30:22.
-=-=-=-=-=-=-
Note: At the end of the document is a discussion of the various techniques,
      a short evaluation of the goods and bads, and also a small enhancement
      to the Bubble-Sort algorithm called Back-Track (in C).
                - Amit Margalit (9-MAY-1992)

Information
???????????

By definition, sort routines place un-ordered numeric or string data into
ascending or descending order. Many sort algorithms have been discovered,
and delineating their characteristics is a favorite pastime of Pascal
tutorials.

Bubble Sort

The bubble sort algorithm derives its name from the way it systematically
floats the largest values in an array to the top, like bubbles in a bottle
of soda.

The algorithm goes like this: starting at the bottom of the array, compare
each adjacent pair of values; if the lower member of the pair is greater
than the upper member, swap. Once you've worked your way to the top in this
fashion, theArray[ARRAYSIZE] is known to be the largest number in the
array. Now repeat the process, but this time stop one element short of the
top (since we already know this value is the largest). Repeat this process
until finally on the last pass you examine only the bottom two members of
the array.

   procedure BubbleSort (var theArray: arrayType);
   var
     last, current : integer;

   begin
     for last := ARRAYSIZE downto 2 do
       for current := 1 to last-1 do
         if theArray[current] > theArray[current+1] then
           Swap(theArray[current], theArray[current+1]);
   end; { BubbleSort }

Insertion Sort

The insertion sort is on the simple end of the scale, and resembles the way
people sort a bridge hand. Working from one end of the array to the other,
each value is placed in its correct position relative to the values that
have already been sorted.

The routines shown here are designed to process arrays of integers into
ascending order (i.e., theArray[1] gets the lowest value;
theArray[ARRAYSIZE] gets the highest), although with only trivial changes
the algorithms could easily work with string data and/or descending order.

   const
     ARRAYSIZE = 100;

   type
     arrayType = array[1..ARRAYSIZE] of integer;

   procedure InsertionSort (var theArray: arrayType );
   var
     newValue, newPos, currentPos: integer;
   begin
     for newPos := 2 to ARRAYSIZE do
     begin
       newValue := theArray[newPos];
       currentPos := newPos;
       while (currentPos > 1) and (theArray[currentPos-1] > newValue) do
       begin
         theArray[currentPos] := theArray[currentPos-1];
         currentPos := currentPos - 1;
       end;
       theArray[currentPos] := newValue;
     end;
   end; { InsertionSort }

Shell Sort

The Shell Sort was invented by a computer scientist named Donald Shell.  His
sorting method compares and swaps numbers the greatest distance from each
other first, shrinking the distance between numbers until you return to the
ones in closest proximity to each other.

   procedure ShellSort(var List: ListType; ListMax: integer);
   label
     ExitLoop;

   var
     Indx, Jndx, TheVal, Incr : integer;
   begin
     Incr := 1;
     repeat
       Incr := 3 * Incr + 1;
     until Incr > ListMax;
     repeat
       Incr := Incr div 3;
       for Indx := Incr + 1 to ListMax do
       begin
         TheVal := List[Indx];
         Jndx := Indx;
         while List[Jndx - Incr] > TheVal do
         begin
           List[Jndx] := List[Jndx - Incr];
           Jndx := Jndx - Incr;
           if Jndx <= Incr then
             goto ExitLoop;
         end;
   ExitLoop:
         List[Jndx] := TheVal;
       end;
     until Incr = 1;
   end; { ShellSort }

Quicksort

King of the sorting algorithms is Quicksort. The gist of Quicksort is
summarized here (in general it is a recursive algorithm):

-- Pick a pivot point in the center of the array

-- Partition the array:
   Exchange equal or larger elements (working from the left) with equal or
   smaller elements (working from the right). Repeat this process until the
   array has been organized such that every value left of the pivot point is
   smaller than every value right of the pivot point, and the value at the
   pivot point itself is in exactly the right position.

-- If there's more than one element left of the pivot point,
   call Quicksort recursively on the left-hand part of the array

-- If there's more than one element right of the pivot point,
   call Quicksort recursively on the right-hand part of the array

   procedure Quicksort(start, finish: integer);
   { sorts global variable theArray }
   var
     pivot, left, right : integer;
   begin
     { set pivot point }
     left := start;
     right := finish;
     pivot := theArray[(start + finish) div 2];
     { partition }
     repeat
       while theArray[left] < pivot do
         left := left + 1;
       while pivot < theArray[right] do
         right := right - 1;
       if left <= right then
       begin
         Swap( theArray[left], theArray[right] );
         left := left + 1;
         right := right - 1;
       end;
     until right <= left;
     { sort right and left halves }
     if start < right then QuickSort(start, right);
     if left < finish then QuickSort(left, finish);
   end; { QuickSort }

------------------------------------------------------------------------------
Following are a few additions by Amit Margalit
------------------------------------------------------------------------------

BackTrack - An enhancement of the Bubble-Sort.

The idea of BackTrack is simple: You run only once on the entire array, and
every time you find an element out of place, you search backwards through
the array to find where is the right place for it.

/* Assumes that we are sorting int's */

void backtrack(int* array, int nelem)
{
  int idx,temp;
  void track(int* arr,int cidx);

  if(nelem == 1)
    return;
  for(idx=1;idx < nelem; idx++) {
    if(*(array+idx) < *(array+idx-1)) {
      track(array,idx);
    }
  }
}

void track(int* array, int idx)
{
  do {
    swap(*(array+idx) , *(array+idx-1));        /* swap elements */
    idx--;                                      /* decrease index */
    } while ( (idx > 0) && (*(array+idx) < *(array+idx-1)) );
        /* repeat while index is 1 or more, and element still not in place */
}

swap(int* v1, int* v2)
{
  int temp;

  temp = *v1;
  *v1  = *v2;
  *v2  = temp;
}


Merge-Sort
----------
Actually, this is the most efficient of all algorithms (where speed is
concerned...). It needs twice as much storage space. Here are the basics:

Note: when you have two already-sorted arrays and you want to merge them into
 a single sorted array, you simply read the first elements, compare them. The
 one that fits to be before the other is sent to the output stream, and a new
 one from its array is read, and the comparison goes on again and again until
 the arrays are merged.

 - Divide the array into 2 separate ones.

 - If the number of elements on the left side is more than one, call
   merge-sort in recursion giving it the left array as an array to sort.

 - Do the same about the right side.

 - When both side have finished and returned, merge the array on the left
   with the one on the right.

Here is a C source (to sort an array of ints):


/* assume lower memory addresses on the left going up to the right */

mergesort(int* array, int length)
{
  int leftsize=length/2;
  int rightsize=length-leftsize;
  int lidx,ridx,nidx;
  int *newarray;

  if(leftsize>1)
    mergesort(array,leftsize);
  if(rightsize>1)
    mergesort(array+leftsize,rightsize);
  /* both sides are now sorted arrays, so we can merge the two sides */

  lidx=ridx=nidx=0;
  if((newarray=(int*)malloc(length*sizeof(int)))==NULL)
    error("Out of memory.");                  /* and terminate */
  for(;;) {
    if(*(array+leftsize+ridx) < *(array+lidx)) {
      *(newarray+nidx) = *(array+leftsize+ridx);
      nidx++; ridx++;
      }
    else {
      *(newarray+nidx) = *(array+lidx);
      nidx++; lidx++;
      }
    if(lidx>leftsize)                           /* no more? copy the rest */
      for(;ridx<=rightsize;ridx++,nidx++)
        *(newarray+nidx) = *(array+leftsize+ridx);
    if(ridx>leftsize)
      for(;lidx<=rightsize;lidx++,nidx++)
        *(newarray+nidx) = *(array+lidx);
    }
  for(lidx=0;lidx<length;lidx++)               /* copy the new array back */
    *(array+lidx) = *(newarray+lidx);
  free(newarray);                               /* and free the memory */
}

Comparing the algorithms:
-------------------------

Here is the list of the algorithms included in this article with a short note:

Bubble-Sort     Very inefficient, slow, but very easy to implement.
Back-Track      An improvement of the Bubble Sort.
Insertion-Sort  Also easy to implement, easiest to understand.
Shell-Sort      More efficient, a little hard to understand.
QuickSort       The best. A bit hard to understand and requires a large stack.
Merge-Sort      Always better than QuickSort, but needs twice the memory.

In detail:

BubbleSort
----------
Runs a long time, in little efficiency. Good if you wish to sort a small
number of elements (up to 20 maybe), don't care much about the time it
takes, or don't wanna start writing a more sophisticated sort routine. It's
the simplest one to implement, and has no flow-problems. (i.e. having to
exit from inside a loop or something like that).
 Time of execution: Depends only on length of array.

Back-Track
----------
Just a bit better. Runs only once on the entire array, and swaps things to
their right place. Useful also if you don't wanna start writing a more
sophisticated sort routine, are sorting a small array, and do care little
about the execution time.
 Time of execution: Depends on length of array, and its randomness (the more
                    the array is already sorted, the less it takes).

Insertion-Sort
--------------
Is good if you are sorting a linked-list. You really can't do it many other
ways... You copy the first node of the list to a new location, and read
through the original list, inserting the nodes in proper order to the new
list, by just keeping a temporary copy of the node, setting the appropriate
links of the node in the new list and the node to be inserted correctly.
  Time of execution: Depends on length of array, and the complexity of
                     changing the links in the list.

Shell-Sort
----------
The Shell-Sort use a kind of self-recursion. It runs on the large scale, then
descends into smaller regions of the array. Each time reducing the gap between
the elements it is sorting.

Quick-Sort
----------
Is not simple to implement, but is the most efficient. It is recursive and so
needs a large stack for a large array to be sorted. The idea is to divide the
array into 2 halves, picking the value of the element in the middle as a
pivot-value. Then you go all the way from the left and the right edges, and
swap any values that ought to be on opposite sides. When you get to the
middle point, you check to see if there are more than one elements on each
side, and if there are you call the routine again, sending it the half of the
array as the whole new array for it to sort. This is done in recursion.

Merge-Sort
----------
As has been said before, it needs memory twice the size of the array, to hold
both the array, and the sorted one (for the last pass). Its efficiency is the
best! Its execution time grows only with the growth of the array size, and is
not affected by the level of sortedness of the array preceding the sort.

                                        Amit Margalit, 11-May-1992