11-18 01:57 阅读 256

React 算法应用之堆排序(实现堆排序算法)

二叉堆

二叉堆是一种特殊的堆，二叉堆是完全二叉树或者是近似完全二叉树。二叉堆有两种：最大堆和最小堆。

最大堆：父结点的键值总是大于或等于任何一个子节点的键值；

最小堆：父结点的键值总是小于或等于任何一个子节点的键值。

堆排序

堆排序（Heapsort）是指利用堆这种数据结构所设计的一种排序算法。堆积是一个近似完全二叉树的结构，并同时满足堆积的性质：即子结点的键值或索引总是小于（或者大于）它的父节点。堆排序可以说是一种利用堆的概念来排序的选择排序。分为两种方法：

大顶堆：每个节点的值都大于或等于其子节点的值，在堆排序算法中用于升序排列；
小顶堆：每个节点的值都小于或等于其子节点的值，在堆排序算法中用于降序排列；

我们对上图中的每个数进行标记，上面的结构映射成数组就变成了下面的样子

React 中的使用场景

在 React 中，并发任务经过 Scheduler 调度时，会被 Scheduler 包装成自己的task，这个task会被保存到最小堆timerQueue 或 taskQueue 中，以便可以在O(1) 的时间复杂度获取到优先级最高的 task。

// packages/scheduler/src/forks/Scheduler.js

if (startTime > currentTime) {
  // This is a delayed task.
  newTask.sortIndex = startTime;
  // timerQueue 是一个二叉堆结构，以最小堆的形式存储 task
  // 向二叉堆中添加节点
  push(timerQueue, newTask);
  
  // peek 查看堆的顶点, 也就是优先级最高的`task`
  if (peek(taskQueue) === null && newTask === peek(timerQueue)) {
    // ...
  }
} else {
  newTask.sortIndex = expirationTime;
  // taskQueue 是一个二叉堆结构，以最小堆的形式存储 task
  push(taskQueue, newTask);
  // ...
}复制代码

最小堆的源码实现在SchedulerMinHeap.js文件：

// src/react/packages/scheduler/src/SchedulerMinHeap.js

// 二叉堆数据结构
// 主要存储taskQueue 和 timerQueue，它们都是以 最小堆的形式存储
// 可以保证以O(1)的时间复杂度, 取到数组顶端的对象(优先级最高的 task).

type Heap = Array<Node>;
type Node = {|
  id: number,
    sortIndex: number,
|};

// 插入节点，一般是在数组的末尾插入新节点
// 然后调用 siftUp函数向上调整堆
export function push(heap: Heap, node: Node): void {
  const index = heap.length;
  heap.push(node);
  siftUp(heap, node, index);
}

// 查看 堆的顶点，也就是优先级最高的 task
export function peek(heap: Heap): Node | null {
  return heap.length === 0 ? null : heap[0];
}

// 删除节点，即删除堆顶节点，也就是取出最高优先级的 task
// 删除顶点之后，需要调用 siftDown 函数向下调整堆
export function pop(heap: Heap): Node | null {
  if (heap.length === 0) {
    return null;
  }
  const first = heap[0];
  const last = heap.pop();
  if (last !== first) {
    heap[0] = last;
    siftDown(heap, last, 0);
  }
  return first;
}

// 当插入节点之后, 需要向上调整堆结构, 保证数组是一个最小堆.
function siftUp(heap, node, i) {
  let index = i;
  while (index > 0) {
    const parentIndex = (index - 1) >>> 1;
    const parent = heap[parentIndex];
    if (compare(parent, node) > 0) {
      // The parent is larger. Swap positions.
      heap[parentIndex] = node;
      heap[index] = parent;
      index = parentIndex;
    } else {
      // The parent is smaller. Exit.
      return;
    }
  }
}

// 向下调整堆结构, 保证数组是一个最小堆.
function siftDown(heap, node, i) {
  let index = i;
  const length = heap.length;
  const halfLength = length >>> 1;
  while (index < halfLength) {
    const leftIndex = (index + 1) * 2 - 1;
    const left = heap[leftIndex];
    const rightIndex = leftIndex + 1;
    const right = heap[rightIndex];

    // If the left or right node is smaller, swap with the smaller of those.
    if (compare(left, node) < 0) {
      if (rightIndex < length && compare(right, left) < 0) {
        heap[index] = right;
        heap[rightIndex] = node;
        index = rightIndex;
      } else {
        heap[index] = left;
        heap[leftIndex] = node;
        index = leftIndex;
      }
    } else if (rightIndex < length && compare(right, node) < 0) {
      heap[index] = right;
      heap[rightIndex] = node;
      index = rightIndex;
    } else {
      // Neither child is smaller. Exit.
      return;
    }
  }
}

function compare(a, b) {
  // Compare sort index first, then task id.
  const diff = a.sortIndex - b.sortIndex;
  return diff !== 0 ? diff : a.id - b.id;
}复制代码

push 函数：插入新节点，将新节点插入到数组的末尾，然后调用 siftUp函数向上调整堆
peek 函数：查看堆的顶点，也就是优先级最高的 task
pop 函数：删除堆顶节点，也就是取出最高优先级的 task，然后需要调用 siftDown 函数向下调整堆
siftUp 函数：当插入新节点之后, 需要向上调整堆结构, 保证数组是一个最小堆
siftDown 函数：向下调整堆结构, 保证数组是一个最小堆
compare 函数：比较两个节点值的大小

总结

React 在实现任务调度时，使用最小堆存储 Scheduler 的 task，使得在任务调度的过程中可以在O(1)的时间获取到优先级最高的task，提高了任务调度的效率。这种设计思想值得我们借鉴学习。

作者：紫圣
链接：https://juejin.cn/post/7031680114236751886