2014年11月15日星期六

[Data Structure] Find the kth largest or kth smallest elements in a given array

Given an array, which is not ordered. You are asked to find all kth largest or kth smallest elements in it.

This question is quite popular. A very naive method is to sort the array first, then we find all kth largest or kth smallest elements. The time complexity is typically O(nlogn).
However, we could use a data structure called heap to improve the time complexity.

We first handle the kth largest case. In this case, we could use a small-end heap with capacity k. Then we sweep the array, for each element, if the heap is not full, we put it in, other wise we compare it with the top element of the heap, if it is larger than the top, we throw the current top away and put the element into heap. At the end, the heap will filled with kth largest elements, with the smallest one exist on the top.

Why this work? We consider the situation when the heap is full. The top would be the smallest one among all k elements. When a new element is less than or equal to the top, we know that it could not be selected, since there already exist k elements large or equal to it. What if a new element is larger than top? we know top could be eliminate, since we have a new candidate, by removing the top, we get a group of better k large elements. So we remove the top and put the new element in. The process continue until we have examine all elements in the array. The time complexity is O(nlogk), so if k << n, it will much faster than O(nlogn).

Similarly, for the kth smallest case, we use big-end heap with capacity k. 

The code is below, we use PriorityQueue in Java as the heap and pass a comparator to it. We assume the input would be valid.


 public static void FindKthSmall(int[] A, int k) {
  int n = A.length;
  Comparator<Integer> myComparator = new Comparator<Integer>() {
   public int compare(Integer a, Integer b) {
    return b - a;
   }
  };
  PriorityQueue<Integer> pq = new PriorityQueue<Integer>(k, myComparator);
  for(int i = 0; i < n; i++) {
   if(pq.size() == k) {
    if(A[i] < pq.peek()) {
     pq.remove();
     pq.add(A[i]);
    }
   }
   else {
    pq.add(A[i]);
   }
  }
  while(pq.size() != 0) {
   System.out.print(pq.peek() + " ");
   pq.remove();
  }
  System.out.println("");
 }


 public static void FindKthLarge(int[] A, int k) {
  int n = A.length;
  PriorityQueue<Integer> pq = new PriorityQueue<Integer>(k);
  for(int i = 0; i < n; i++) {
   if(pq.size() == k) {
    if(A[i] > pq.peek()) {
     pq.remove();
     pq.add(A[i]);
    }
   }
   else {
    pq.add(A[i]);
   }
  }
  while(pq.size() != 0) {
   System.out.print(pq.peek() + " ");
   pq.remove();
  }
  System.out.println("");
 }

1 条评论: