QuickSort(快速排序)
The basic algorithm(基本算法)
归并排序将数组分为两个子数组分别排序,并将有序的子数组归并使得整个数组排序;
快速排序通过一个切分元素将数组分为两个子数组,左子数组小于等于切分元素,右子数组大于等于切分元素,将这两个子数组排序也就将整个数组排序了。1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46public class Quick<T extends Comparable<T>> extends Sort<T> {
public void sort(T[] a){
shuffe(a); //消除对输入的依赖
sort(a,0,a.length-1);
}
private void sort(T[] a, int lo, int hi) {
if(hi<=lo){
return;
}
int j=partition(a,lo,hi); //切分
sort(a,lo,j-1); //将左半边排序
sort(a,j+1,hi); //将右半边排序
}
//切分方法
private int partition(T[] a, int lo, int hi) {
int i=lo;
int j=hi+1;
T v=a[lo]; //选取lo处的元素作为分界元素
while(true){
//从左向右扫描找到比v大的元素
while(less(a[++i],v)){
if(i==hi){
break;
}
}
//从右向左扫描找到比v小的元素
while(less(v,a[--j])){
if(j==lo){
break;
}
}
if(i>=j){
break;
}
exch(a,i,j);y
}
exch(a,lo,j);
return j;
}
//打乱数组
private void shuffe(T[] a){
List<T> list=Arrays.asList(a);
Collections.shuffle(list);
list.toArray(a);
}
}
Performance characteristics(性能分析)
A: advantage
a: it is in-place (uses only a small auxiliary stack) (原地排序)
b: it requires time proportional to N log N on the average to sort an array of length N (时间复杂度T(N)=O(NlgN))
c: the inner loop of quicksort (in the partitioning method) increments an index and compares an array entry against a fixed value.(内循环短小)
B: The best case for quicksort is when each partitioning stage divides the array exactlyin half. This circumstance would make the number of compares used by quicksort satisfy the divide-and-conquer recurrence CN = 2CN/2 + N,this recurrence has the solution CN ~ N lg N
C: Quicksort uses ~ 2N ln N compares (and one-sixth that many exchanges)on the average to sort an array of length N with distinct keys.
D: Quicksort uses ~ N * N/2 compares in the worst case, but random shuffling protects against this case.(最坏情况)
Algorithmic improvements(算法改进)
Cutoff to insertion sort (切换到插入排序)
对小数组排序时插入排序的效率比快速排序的效率高,以为快速排序递归在小数组中调用自己耗费时间
改进方法:
将
if(hi<=lo) return;
改为
if(hi<=lo+M){
Insertion(a,lo,hi);
return;
}
The optimum value of the cutoff M is system-dependent, but any value between 5 and 15 is likely to work well in most situations
这种方法主要用于提高对小型数组排序的速度问题
Median-of-three partitioning(三取样切分)
Entropy-optimal sorting(三向切分的快速排序)
这种算法是为了提高有大量重复元素的数组的排序效率,通过选定特定的切分元素v将数组切分为三个部分,小于v的元素、大于v的元素和小于v的元素1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33public class Quick3way<T extends Comparable<T>> extends Sort<T> {
public void sort(T[] a){
shuffe(a);
sort(a,0,a.length-1);
}
private void sort(T[]a,int lo,int hi){
if(hi<=lo){
return;
}
int lt=lo;
int i=lo+1;
int gt=hi;
T v=a[lo];
while(i<=gt){
int tep=a[i].compareTo(v);
if(tep<0){
exch(a,i++,lt++);
}else if(tep>0){
exch(a,i,gt--);
}else{
i++;
}
}
sort(a,lo,lt-1);
sort(a,gt+1,hi);
}
private void shuffe(T[] a){
List<T> list=Arrays.asList(a);
Collections.shuffle(list);
list.toArray(a);
}
}
算法分析:
- 对于包含大量重复元素的数组,三向切分的快速算法比标准的快速算法效率高的多,排序时间从线性对数级降低到了线性级别。
对于一个数组,它的所有主键的香农信息量 H=-(p1lgp1+p2lgp2 +…+pnlgpn)- 大小为N的数组,三向切分需要~(2ln2)NH次比较
- 三向切分最坏的情况是当所有的主键不重复时,H=lgN,时间复杂度为NlgN,是线性对数级别的;一般情况下,三向切分的运行时间和输入的信息量的N倍成正比,是线性级别的。