我试图从这个网站了解快速排序算法,保罗的实施速度与stl ::
sort(在大范围内快速排序,在较小范围内插入排序)。
我将保罗的实施与我的比较,我的实施慢了3倍。通过分析我们的代码,我发现主要的不确定性是分区。
以下是保罗代码的摘录:
void Partition(int data[], int low , int high){
int pivot = data[low];
int fromLow = low;
int fromHigh = high;
int ptr = low;
goto QStart;
while(true){
do {
fromHigh--;
if(fromLow >= fromHigh)
goto QEnd;
QStart:;
}while(data[fromHigh] > pivot)
data[ptr] = data[fromHigh];
ptr = fromHigh;
do{
fromLow++;
if(fromLow >= fromHigh){
ptr = fromHigh;
goto QEnd;
}
}while(data[fromLow] < pivot)
data[ptr] = data[fromLow];
ptr = fromLow;
}
QEnd:;
data[ptr] = pivot;
}
以下是我的:
void MyPartition(int data[], int low , int high){
int pivot = data[low];
int fromLow = low;
int fromHigh = high;
int ptr = low;
while(fromLow != fromHigh){
if(data[fromHigh] >= pivot)
fromHigh--;
else{
data[ptr] = data[fromHigh];
ptr = fromHigh;
while(data[fromLow] <= pivot && fromLow != fromHigh)
fromLow ++;
data[ptr] = data[fromLow];
ptr = fromLow;
}
}
data[ptr] = pivot;
}
这两个函数实现了相同的算法,我相信它们具有相同的BigO:
- first, scanning array from high end to low end (right =>
left) for finding the first value is less than pivot.
- then, scanning array from low end to high end (left =>
right) for finding the first value is greater than pivot.
- In either scan, if anything is found, then we "swap it with
pivot value", this is logical swap,because pivot value is
cached with variable pivot, we can regard variable
ptr as the current pivot value
position.
有人知道为什么保罗的执行速度比我快吗?
UPDATE:
int PartitionData2(int * data, int low, int high){
int pivot = data[low];
int fromLow = low;
int fromHigh = high;
int ptr = low;
while(fromLow < fromHigh){
if(data[fromHigh] > pivot) //'>=' ==> '>'
fromHigh--;
else{
data[ptr] =data[fromHigh] ;
ptr = fromHigh;
while(data[++fromLow] < pivot && //'<=' ==> '<'
fromLow != fromHigh);
data[ptr] = data[fromLow];
ptr = fromLow;
fromHigh--; //antti.huima's idea
}
}
data[ptr] = pivot;
return ptr;
}
我只是根据antti.huima的想法更新代码,这使我的代码与paul的代码一样快。
它让我感到困惑,因为它看起来像交换值等于枢轴。
UPDATE2: I understand the reason why we need
move element which equals to pivot, because if we don't, the two
new partitions will be uneven, e.g. there should be one is much
larger than another. it eventually go to O(n^2) case.
see this PDF