Last Updated on March 2, 2023 by Prepbytes
Heap is a complete binary tree data structure with the property that the value of a parent node is either greater than or equal to (max heap) or less than or equal to (min-heap) the values of its children. Heaps are used for efficient algorithms in sorting, selection, and as a priority queue.
Heap sort is a comparison-based sorting algorithm that sorts an array by creating a binary heap data structure. It has an average and worst-case time complexity of O(n log n)
What is Heap Sort in Data Structure?
Heap sort is a comparison-based sorting algorithm that sorts an array in ascending or descending order by transforming it into a binary heap data structure. It uses the concepts of max-heap and min-heap, where in max-heap, the largest element is the root node, and in min-heap, the smallest element is the root node. The algorithm uses the binary heap structure to efficiently sort the elements.
Properties of Binary Heap:
- Complete Binary Tree: All levels except the last level are completely filled and the last level is filled from left to right.
- Shape Property: The value of each node is less than or equal to the value of its parent.
- Heap Property: The value of the parent node is always greater than the value of its child node in a max heap, and the value of the parent node is always less than the value of its child node in a min-heap.
Implementation of Heap Sort Algorithm:
Now we will discuss the implementation of the heap sort algorithm, but first, we discuss heapify
What is Heapify?
Heapify is the process of converting a binary tree into a valid heap data structure. This is usually achieved by starting from the last internal node (i.e., the parent of the rightmost leaf node) and working towards the root node, swapping the node with its largest (or smallest, depending on whether it is a max or min heap) child node until the heap property is satisfied. The process is repeated for the affected subtrees until the entire tree is a valid heap. The time complexity of heapify is O(log n), where n is the number of nodes in the tree.
Now we take an example, in the example, we have an unsorted array {1,3,5,4,6,13,10,9,8,15,17}.
Now if we observe total nodes are 11 and 5 are non-leaf nodes and 6 are leaf nodes and so the last non-leaf index is 4 and here [6 is our last leaf node ] so to build the heap,heapify only the nodes :[1,3,5,4,6] in reverse order.
So this is heapify, we use heapify in our heap sort algorithm
Heap Sort Algorithm
Here are the steps of the Heap Sort algorithm:
- Build a max-heap from the input data: The first step is to build a binary max-heap from the input data. This is done by transforming the array into a complete binary tree where the largest value is at the root node.
- Swap the first and last elements of the array: After building the max heap, the largest value will be at the root node. Swap this value with the last element of the array. This will place the largest value at the end of the array, which is its final position in the sorted array.
- Heapify the remaining data: After swapping the first and last elements, the heap size is reduced by one. The root node may now violate the max-heap property, so the algorithm needs to reheapify the data to restore the max-heap.
- Repeat steps 2 and 3 until the heap size is reduced to 1: The algorithm continues to repeat steps 2 and 3 until the heap size is reduced to 1. This means that the entire array has been sorted in ascending order.
- Return the sorted array: After the heap size has been reduced to 1, the array is sorted and can be returned as the final result.
Implementation of these steps:
#include <iostream> using namespace std; // To heapify a subtree rooted with node i // which is an index in arr[]. // n is size of heap void heapify(int arr[], int N, int i) { // Initialize largest as root int largest = i; // left = 2*i + 1 int l = 2 * i + 1; // right = 2*i + 2 int r = 2 * i + 2; // If left child is larger than root if (l < N && arr[l] > arr[largest]) largest = l; // If right child is larger than largest // so far if (r < N && arr[r] > arr[largest]) largest = r; // If largest is not root if (largest != i) { swap(arr[i], arr[largest]); // Recursively heapify the affected // sub-tree heapify(arr, N, largest); } } // Main function to do heap sort void heapSort(int arr[], int N) { // Build heap (rearrange array) for (int i = N / 2 - 1; i >= 0; i--) heapify(arr, N, i); // One by one extract an element // from heap for (int i = N - 1; i > 0; i--) { // Move current root to end swap(arr[0], arr[i]); // call max heapify on the reduced heap heapify(arr, i, 0); } } // A utility function to print array of size n void printArray(int arr[], int N) { for (int i = 0; i < N; ++i) cout << arr[i] << " "; cout << "\n"; } // Driver's code int main() { int arr[] = { 7, 10, 5, 3, 8 }; int N = sizeof(arr) / sizeof(arr[0]); // Function call heapSort(arr, N); cout << " Sorted array after heap sort \n"; printArray(arr, N); }
class HeapSort { public void sort(int arr[]) { int N = arr.length; // Build heap (rearrange array) for (int i = N / 2 - 1; i >= 0; i--) heapify(arr, N, i); // One by one extract an element from heap for (int i = N - 1; i > 0; i--) { // Move current root to end int temp = arr[0]; arr[0] = arr[i]; arr[i] = temp; // call max heapify on the reduced heap heapify(arr, i, 0); } } // To heapify a subtree rooted with node i which is // an index in arr[]. n is size of heap void heapify(int arr[], int N, int i) { int largest = i; // Initialize largest as root int l = 2 * i + 1; // left = 2*i + 1 int r = 2 * i + 2; // right = 2*i + 2 // If left child is larger than root if (l < N && arr[l] > arr[largest]) largest = l; // If right child is larger than largest so far if (r < N && arr[r] > arr[largest]) largest = r; // If largest is not root if (largest != i) { int swap = arr[i]; arr[i] = arr[largest]; arr[largest] = swap; // Recursively heapify the affected sub-tree heapify(arr, N, largest); } } /* A utility function to print array of size n */ static void printArray(int arr[]) { int N = arr.length; for (int i = 0; i < N; ++i) System.out.print(arr[i] + " "); System.out.println(); } // Driver's code public static void main(String args[]) { int arr[] = { 7,5,10,3,8 }; int N = arr.length; // Function call HeapSort ob = new HeapSort(); ob.sort(arr); System.out.println("Sorted array after heap sort"); printArray(arr); } }
# Python program for implementation of heap Sort # To heapify subtree rooted at index i. # n is size of heap def heapify(arr, N, i): largest = i # Initialize largest as root l = 2 * i + 1 # left = 2*i + 1 r = 2 * i + 2 # right = 2*i + 2 # See if left child of root exists and is # greater than root if l < N and arr[largest] < arr[l]: largest = l # See if right child of root exists and is # greater than root if r < N and arr[largest] < arr[r]: largest = r # Change root, if needed if largest != i: arr[i], arr[largest] = arr[largest], arr[i] # swap # Heapify the root. heapify(arr, N, largest) # The main function to sort an array of given size def heapSort(arr): N = len(arr) # Build a maxheap. for i in range(N//2 - 1, -1, -1): heapify(arr, N, i) # One by one extract elements for i in range(N-1, 0, -1): arr[i], arr[0] = arr[0], arr[i] # swap heapify(arr, i, 0) # Driver's code if __name__ == '__main__': arr = [7, 10, 5, 3, 8] # Function call heapSort(arr) N = len(arr) print("Sorted array after heap sort") for i in range(N): print("%d" % arr[i], end=" ") # This code is contributed by Mohit Kumra # your code goes here
Output:
Sorted array after heap sort
3 5 7 8 10
Explanation
In the above example we have an unsorted array which is {7,10,5,3,8} and we apply heapsort algorithm and we got the array {3,5,7,8,10} which is a sorted array.
Time Complexity: O(N log N)
Auxiliary Space: O(1)
Optimizing Heap Sort:
Below are some of the methods by which we can optimize heap sort in some special cases –
- Using Min Heap instead of Max Heap: By using a min-heap instead of a max heap, the heap sort algorithm can be optimized for situations where the data is already partially sorted in ascending order.
- Building Heap Bottom-Up: Instead of building the heap from the top down, the heap can be built from the bottom up. This approach can reduce the number of swaps and comparisons required to build the heap, resulting in faster performance.
- Using Iterative Approach: Instead of using the recursive approach to heapify the heap, an iterative approach can be used. This can reduce the overhead of function calls and improve the performance of the heap sort algorithm.
- Using Multi-Threading: Heap sort can be parallelized to take advantage of multi-core processors and improve performance. In a multi-threaded implementation, each thread can be responsible for sorting a portion of the input array, and the results can be merged at the end.
- Using Hybrid Approach: A hybrid approach that combines the strengths of other sorting algorithms, such as quicksort and merge sort, with heap sort can result in improved performance. For example, heap sort can be used as the final sorting step after a preliminary quicksort or merge sort.
It’s important to note that optimizing heap sort depends on the specific requirements and constraints of a project. The best optimization strategy will vary depending on the size and structure of the input data, the memory and CPU resources available, and other factors.
When to Use Heap Sort:
Heap sort is often used when the data size is large and memory is limited. Since heap sort is an in-place sorting algorithm, it is more memory-efficient than other sorting algorithms like quicksort and merge sort. Heap sort is also used when stability is important, such as when sorting a list of records based on multiple fields.
Heap sort can also be used as an alternative to quicksort when the data is already partially sorted or the data has a lot of repeated values. In these cases, quicksort can become slow due to its worst-case time complexity of O(n^2). Heap sort, on the other hand, has a worst-case time complexity of O(nlogn), making it more reliable in these cases.
Heap Sort vs Other Sorting Algorithms:
Heap sort is often compared to other sorting algorithms like quicksort, merge sort, and bubble sort. Heap sort has a time complexity of O(nlogn), making it more efficient than bubble sort and selection sort, which have a time complexity of O(n^2). However, quicksort and merge sort also have a time complexity of O(nlogn) and are generally faster than heap sort in practice.
Heap sort is also an in-place sorting algorithm, which means that it sorts the elements within the input array without using any extra memory. This makes it more memory-efficient than quicksort, which requires extra memory for recursive calls, and merges sort, which requires extra memory to merge the subarrays.
Another advantage of heap sort is that it is a stable sorting algorithm, meaning the relative order of elements with equal values is preserved. Quicksort is not always stable, and bubble sort and selection sort are also not stable. Merge sort, on the other hand, is a stable sorting algorithm.
Advantages of Heap Sort:
- Efficient: Heap sort has a time complexity of O(nlogn), making it one of the most efficient sorting algorithms.
- In-Place Sorting: Heap sort is an in-place sorting algorithm, which means it sorts the elements within the input array without using any extra memory. This makes it more memory-efficient than other sorting algorithms like quicksort and merge sort.
- Stable Sorting: Heap sort is a stable sorting algorithm, which means that the relative order of elements with equal values is preserved. This is important in many applications where the elements are records with multiple fields and the sorting is based on multiple criteria.
- Easy to Implement: Heap sort is easy to implement and understand, making it a good choice for students and beginners who are learning data structures and algorithms.
Disadvantages of Heap Sort:
- Not the Fastest: Although heap sort is efficient, it is not the fastest sorting algorithm. Quicksort and merge sort are generally faster than heap sort in practice.
- Not Adaptive: Heap sort is not adaptive, which means it does not perform well on partially sorted arrays or arrays with many repeated values. In these cases, quicksort is a better choice.
- Space Complexity: Although heap sort is an in-place sorting algorithm, its space complexity is O(1), which is the same as other sorting algorithms like quicksort and merge sort. However, it is important to note that the space complexity of an algorithm refers to the amount of extra memory used by the algorithm and not the amount of memory used by the input data.
Applications of Heap Sort:
Below are some of the applications of Heap Sort –
- Sorting Large Datasets: Heap sort is a good choice for sorting large datasets due to its time complexity of O(nlogn) and its ability to sort elements in place. This makes it ideal for applications where memory is limited, such as embedded systems or mobile devices.
- Priority Queues: Heap sort is used in the implementation of priority queues, which are data structures that store elements with priorities and allow efficient access to the element with the highest priority.
- Graph Algorithms: Heap sort is used in graph algorithms, such as Dijkstra’s shortest path algorithm, to efficiently find the shortest path between two nodes in a graph.
- External Sorting: Heap sort is used in external sorting algorithms, which are used to sort large datasets that cannot fit in memory. In external sorting, the input dataset is divided into smaller chunks, each of which is sorted using an internal sorting algorithm. Heap sort is often used as the internal sorting algorithm.
- File Systems: Heap sort is used in file systems, such as NTFS, to sort directory entries and implement file indexing.
Conclusion:
Heap sort is a sorting algorithm that is based on the binary heap data structure. It has a time complexity of O(nlogn) and is considered a stable sorting algorithm, meaning that the relative order of equal elements is preserved in the sorted output. Heap sort is widely used due to its versatility, performance, and low memory requirements.
The algorithm consists of two main steps: building a max heap from the input data, successively removing the maximum element, and adjusting the heap until the input data is fully sorted. Heap sort has several important applications in various fields, including computer science, engineering, and finance, and can be optimized for specific use cases by using techniques such as building the heap bottom-up, using a min-heap instead of a max heap, and using a hybrid approach that combines heap sort with other sorting algorithms.
In summary, heap sort is a valuable tool for sorting large datasets and is an important algorithm to understand for anyone studying data structures and algorithms.
Frequently Asked Questions:
1)How does Heap Sort work?
Heap Sort works by building a binary heap (max-heap or min-heap) from the input array, and then repeatedly extracting the root node (the largest or smallest element) and swapping it with the last node in the heap. This process continues until all nodes have been removed from the heap and placed in the sorted array.
2) What is a binary heap?
A binary heap is a complete binary tree that satisfies the property of either a max-heap (where the parent node is greater than its children) or a min-heap (where the parent node is less than its children).
3)What is the time complexity of Heap Sort?
The time complexity of Heap Sort is O(n log n), where n is the number of elements in the input array.
4) What are the advantages of Heap Sort?
Heap Sort is efficient and has a relatively low overhead, as it only requires O(n) additional memory to store the binary heap. It is also easy to implement and understand.
5) What are the disadvantages of Heap Sort?
Heap Sort is not a stable sorting algorithm, meaning that equal elements may not retain their relative order in the sorted array. It also requires a lot of swaps and comparisons, making it slower than other sorting algorithms like Quick Sort.