Last Updated on August 21, 2023 by Mayank Dham
This article covers a well-known problem, Merge K Sorted Arrays, which comes up frequently in technical interviews.
How to Merge K Sorted Arrays
Given K sorted arrays of size N each, the task is to merge them into a single output array that contains all N * K elements in sorted order.
Examples:
Input:
K = 3, N = 4,
arr1: {1, 3, 5, 7},
arr2: {2, 4, 6, 8},
arr3: {0, 9, 10, 11}
Output:
0 1 2 3 4 5 6 7 8 9 10 11
Explanation: The output array is a sorted array and it contains all the elements of the given input matrix.
Input:
K = 3, N = 4,
arr1: {13, 15, 16, 17},
arr2: {2, 4, 6, 8},
arr3: {0, 9, 10, 11}
Output:
0 2 4 6 8 9 10 11 13 15 16 17
Explanation: The output array is a sorted array and it contains all the elements of the given input matrix.
Approach 1:
Naive Approach for Merging k sorted arrays:
Algorithm:
Create an output array of size N * K, copy all elements into it, then sort it.
Step 1: Create an output array of size N * K.
Step 2: Traverse the matrix from beginning to end and insert every element into the output array.
Step 3: Sort and print the output array.
Code of the above approach:-
// C++ program to merge K sorted arrays of size N each
// by copying every element and sorting the result.
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

#define N 4

// A utility function to print array elements
void printArray(const vector<int>& arr)
{
    for (int x : arr)
        cout << x << " ";
}

// This function takes an array of K arrays as an argument,
// copies every element into one output array of size N * K,
// and sorts that array.
vector<int> mergeKArrays(int arr[][N], int K)
{
    // Step 1: create the output array
    vector<int> output;
    output.reserve(N * K);

    // Step 2: traverse the matrix and copy every element
    for (int i = 0; i < K; i++)
        for (int j = 0; j < N; j++)
            output.push_back(arr[i][j]);

    // Step 3: sort the output array
    sort(output.begin(), output.end());
    return output;
}

// Driver's code
int main()
{
    // Change N at the top to change number of elements
    // in an array
    int arr[][N] = { { 2, 6, 12, 34 },
                     { 1, 9, 20, 1000 },
                     { 23, 34, 90, 2000 } };
    int K = sizeof(arr) / sizeof(arr[0]);

    // Function call
    vector<int> output = mergeKArrays(arr, K);

    cout << "Merged array is " << endl;
    printArray(output);
    return 0;
}
Input
arr1: { 2, 6, 12, 34 },
arr2: { 1, 9, 20, 1000 },
arr3: { 23, 34, 90, 2000 }
Output
Merged array is
1 2 6 9 12 20 23 34 34 90 1000 2000
Time Complexity: O(N K log (NK)), Since the final array is of size NK.
Space Complexity: O(N K), The output array is of size N K.
Approach 2: Merge K Sorted Arrays using Merging:
This approach is similar to merge sort. Split the K arrays into two groups of roughly K/2 arrays each, and keep splitting each group until it contains only one or two arrays. Then merge the results from the bottom up: merging the arrays pairwise leaves K/2 arrays after the first round, K/4 after the second, and so on, until a single sorted array remains.
Follow the given steps to solve the problem:
Step 1: Create a recursive function that takes K arrays and returns the output array.
Step 2: In the recursive function, if the value of K is 1 then return the array else if the value of K is 2 then merge the two arrays in linear time and return the array.
Step 3: If the value of K is greater than 2, divide the group of K arrays into two halves and call the function recursively on each half (the first half in one recursive call and the second half in another), then merge the two results in linear time.
Step 4: Print the output array.
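The same idea can also be written bottom-up, without recursion. The following is a minimal sketch, not the article's implementation: it assumes the arrays are held in std::vector (so they may have different lengths), and mergeByPairs is a hypothetical helper name. Each round merges neighbouring pairs with std::merge, so the number of arrays is roughly halved until one remains.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper (not from the article): merges K sorted vectors by
// repeatedly merging neighbouring pairs, so each round halves the count.
std::vector<int> mergeByPairs(std::vector<std::vector<int>> arrays) {
    if (arrays.empty()) return {};
    for (const auto& a : arrays)
        assert(std::is_sorted(a.begin(), a.end()));  // precondition

    while (arrays.size() > 1) {
        std::vector<std::vector<int>> next;
        for (std::size_t i = 0; i + 1 < arrays.size(); i += 2) {
            std::vector<int> merged;
            merged.reserve(arrays[i].size() + arrays[i + 1].size());
            // Linear-time merge of two sorted ranges
            std::merge(arrays[i].begin(), arrays[i].end(),
                       arrays[i + 1].begin(), arrays[i + 1].end(),
                       std::back_inserter(merged));
            next.push_back(std::move(merged));
        }
        // An odd array out is carried into the next round unchanged.
        if (arrays.size() % 2 == 1) next.push_back(std::move(arrays.back()));
        arrays = std::move(next);
    }
    return std::move(arrays.front());
}
```

Calling mergeByPairs on the three arrays from the example input should give the same merged sequence as the recursive version. Empty input and empty inner arrays are handled without special cases.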
Code of the above approach:-
// C++ program to merge K sorted arrays of size N each.
#include <iostream>
#include <vector>
using namespace std;

#define N 4

// Merge arr1[0..N1-1] and arr2[0..N2-1] into
// arr3[0..N1+N2-1]
void mergeArrays(const int arr1[], const int arr2[], int N1, int N2,
                 int arr3[])
{
    int i = 0, j = 0, k = 0;

    // Traverse both arrays
    while (i < N1 && j < N2) {
        // If the current element of the first array is smaller
        // than the current element of the second array, store it
        // and advance the first index. Otherwise do the same
        // with the second array.
        if (arr1[i] < arr2[j])
            arr3[k++] = arr1[i++];
        else
            arr3[k++] = arr2[j++];
    }

    // Store remaining elements of first array
    while (i < N1)
        arr3[k++] = arr1[i++];

    // Store remaining elements of second array
    while (j < N2)
        arr3[k++] = arr2[j++];
}

// A utility function to print array elements
void printArray(const int arr[], int size)
{
    for (int i = 0; i < size; i++)
        cout << arr[i] << " ";
}

// This function takes an array of arrays as an argument.
// All arrays are assumed to be sorted. It merges arrays
// i..j together and writes the sorted result into output.
void mergeKArrays(int arr[][N], int i, int j, int output[])
{
    // If only one array is in range, copy it
    if (i == j) {
        for (int p = 0; p < N; p++)
            output[p] = arr[i][p];
        return;
    }

    // If only two arrays are left, merge them
    if (j - i == 1) {
        mergeArrays(arr[i], arr[j], N, N, output);
        return;
    }

    int mid = (i + j) / 2;

    // Temporary output arrays for the two halves
    // (std::vector replaces the non-standard variable-length arrays)
    vector<int> out1(N * (mid - i + 1)), out2(N * (j - mid));

    // Divide the arrays into halves
    mergeKArrays(arr, i, mid, out1.data());
    mergeKArrays(arr, mid + 1, j, out2.data());

    // Merge the two halves
    mergeArrays(out1.data(), out2.data(), N * (mid - i + 1),
                N * (j - mid), output);
}

// Driver's code
int main()
{
    // Change N at the top to change number of elements
    // in an array
    int arr[][N] = { { 2, 6, 12, 34 },
                     { 1, 9, 20, 1000 },
                     { 23, 34, 90, 2000 } };
    int K = sizeof(arr) / sizeof(arr[0]);
    vector<int> output(N * K);

    // Function call
    mergeKArrays(arr, 0, K - 1, output.data());

    cout << "Merged array is " << endl;
    printArray(output.data(), N * K);
    return 0;
}
Input:
arr1: { 2, 6, 12, 34 },
arr2: { 1, 9, 20, 1000 },
arr3: { 23, 34, 90, 2000 }
Output
Merged array is
1 2 6 9 12 20 23 34 34 90 1000 2000
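Time Complexity: O(N K log K). There are log K levels of recursion, since the group of arrays is halved at each level, and the merges on each level touch all N K elements once.
Space Complexity: O(N K). The temporary buffers on the active recursion path have sizes N K, N K / 2, N K / 4, and so on, which sum to less than 2 N K, in addition to the output array of size N K.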
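The time bound can be checked with a short sum. The derivation below is a sketch under the simplifying assumption that K is a power of two; the level index \ell (0 for the top-level call) is notation introduced here, not in the article.

```latex
% Assumption: K = 2^m, recursion levels \ell = 0, 1, ..., m - 1
\begin{align*}
\text{merges at level } \ell &= 2^{\ell}, \\
\text{elements written per merge} &= \frac{NK}{2^{\ell}}, \\
\text{work at level } \ell &= 2^{\ell} \cdot \frac{NK}{2^{\ell}} = NK, \\
T(N, K) &= \sum_{\ell=0}^{\log_2 K - 1} NK = NK \log_2 K .
\end{align*}
```

For example, with K = 4 and N = 4 the top level performs one merge that writes 16 elements and the next level performs two merges that write 8 elements each, for 32 = 16 * 2 element writes in total.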
Approach 3: Merge K sorted arrays using Min-Heap:
Brief about Min-Heap
A min-heap is a min priority queue. It is a complete binary tree in which every node's value is less than or equal to the values of its children, so the root always holds the minimum.
        2
       / \
      4   5
     / \
   11   6
Min-Heap
We will use a min-heap to get the current minimum value.
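In C++ a min-heap does not have to be written by hand: std::priority_queue with std::greater as the comparator behaves like one. The sketch below illustrates this; drainMinHeap is a hypothetical helper name chosen for the example. It pushes a handful of values and pops them back in ascending order.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: pushes every value into a min-heap and pops them
// back out, which yields the values in ascending order.
std::vector<int> drainMinHeap(const std::vector<int>& values) {
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap(
        values.begin(), values.end());

    std::vector<int> out;
    while (!heap.empty()) {
        // The current root is never smaller than anything already removed.
        assert(out.empty() || out.back() <= heap.top());
        out.push_back(heap.top());  // read the minimum
        heap.pop();                 // remove it in O(log n)
    }
    return out;
}
```

Passing the values from the tree above in any order, for example {4, 11, 2, 6, 5}, returns 2 4 5 6 11.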
The idea is to use a min-heap. The time complexity of this solution is the same as that of Approach 2, O(N K log K). Start by creating a min-heap and inserting the first element of each of the K arrays into it. The root of the min-heap is now the smallest of all elements. Remove the root, append it to the output array, and insert the next element from the array that the removed element came from. Repeat until the heap has no more elements to offer.
Follow the given steps to solve the problem:
Step 1: Create a min Heap and insert the first element of all the K arrays.
Step 2: Run a loop until all N * K elements have been written to the output. In each iteration, remove the top element of the min-heap and append it to the output array. Then insert the next element from the array the removed element belonged to. If that array has no more elements, replace the root with infinity (INT_MAX in the code). After replacing the root, heapify the tree.
Step 3: Return the output array
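The steps above can also be sketched with std::priority_queue instead of a hand-written heap class. This is an illustrative variant under stated assumptions, not the article's code: HeapEntry, GreaterValue, and mergeWithHeap are hypothetical names, the input is a vector of vectors, and exhausted arrays are simply not re-inserted, so no INT_MAX sentinel is needed and arrays of different lengths work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical node type: a value plus where it came from.
struct HeapEntry {
    int value;
    std::size_t array;  // index of the source array
    std::size_t next;   // index of the next unread element in that array
};

// Comparator that puts the entry with the smallest value on top.
struct GreaterValue {
    bool operator()(const HeapEntry& a, const HeapEntry& b) const {
        return a.value > b.value;
    }
};

std::vector<int> mergeWithHeap(const std::vector<std::vector<int>>& arrays) {
    std::priority_queue<HeapEntry, std::vector<HeapEntry>, GreaterValue> heap;
    std::size_t total = 0;

    // Step 1: insert the first element of every non-empty array.
    for (std::size_t i = 0; i < arrays.size(); ++i) {
        assert(std::is_sorted(arrays[i].begin(), arrays[i].end()));
        total += arrays[i].size();
        if (!arrays[i].empty()) heap.push({arrays[i][0], i, 1});
    }

    std::vector<int> output;
    output.reserve(total);

    // Step 2: pop the minimum, then push its successor from the same array.
    while (!heap.empty()) {
        HeapEntry smallest = heap.top();
        heap.pop();
        output.push_back(smallest.value);

        const std::vector<int>& source = arrays[smallest.array];
        if (smallest.next < source.size())
            heap.push({source[smallest.next], smallest.array, smallest.next + 1});
    }

    // Step 3: return the output array.
    return output;
}
```

The heap never holds more than one entry per input array, so its size stays at most K throughout the merge.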
Code of the above approach:-
#include <climits>
#include <iostream>
using namespace std;

#define N 4

// A min-heap node
struct MinHeapNode {
    // The element to be stored
    int element;
    // Index of the array from which the element is taken
    int i;
    // Index of the next element to be picked from the array
    int j;
};

// Prototype of a utility function to swap two min-heap nodes
void swap(MinHeapNode* x, MinHeapNode* y);

// A class for Min Heap
class MinHeap {
    // Pointer to array of elements in heap
    MinHeapNode* harr;
    // Size of min heap
    int heap_size;

public:
    // Constructor: creates a min heap of given size
    MinHeap(MinHeapNode a[], int size);

    // To heapify a subtree with root at given index
    void MinHeapify(int);

    // To get index of left child of node at index i
    int left(int i) { return (2 * i + 1); }

    // To get index of right child of node at index i
    int right(int i) { return (2 * i + 2); }

    // To get the root
    MinHeapNode getMin() { return harr[0]; }

    // To replace root with new node x and heapify the new root
    void replaceMin(MinHeapNode x)
    {
        harr[0] = x;
        MinHeapify(0);
    }
};

// This function takes an array of arrays as an argument.
// All arrays are assumed to be sorted. It merges them
// together and returns the final sorted output, which the
// caller must release with delete[].
int* mergeKArrays(int arr[][N], int K)
{
    // To store output array
    int* output = new int[N * K];

    // Create a min heap with K heap nodes.
    // Every heap node has the first element of an array
    MinHeapNode* harr = new MinHeapNode[K];
    for (int i = 0; i < K; i++) {
        // Store the first element
        harr[i].element = arr[i][0];
        // Index of array
        harr[i].i = i;
        // Index of next element to be stored from the array
        harr[i].j = 1;
    }

    // Create the heap
    MinHeap hp(harr, K);

    // Now one by one get the minimum element from the min
    // heap and replace it with the next element of its array
    for (int count = 0; count < N * K; count++) {
        // Get the minimum element and store it in output
        MinHeapNode root = hp.getMin();
        output[count] = root.element;

        // Find the next element that will replace the current
        // root of the heap. The next element belongs to the same
        // array as the current root.
        if (root.j < N) {
            root.element = arr[root.i][root.j];
            root.j += 1;
        }
        // If root was the last element of its array,
        // INT_MAX stands in for infinity
        else
            root.element = INT_MAX;

        // Replace root with next element of array
        hp.replaceMin(root);
    }

    // The heap storage is no longer needed
    delete[] harr;
    return output;
}

// FOLLOWING ARE IMPLEMENTATIONS OF
// STANDARD MIN HEAP METHODS FROM CORMEN BOOK

// Constructor: builds a heap from a given
// array a[] of given size
MinHeap::MinHeap(MinHeapNode a[], int size)
{
    heap_size = size;
    harr = a; // store address of array
    int i = (heap_size - 1) / 2;
    while (i >= 0) {
        MinHeapify(i);
        i--;
    }
}

// A recursive method to heapify a
// subtree with root at given index.
// This method assumes that the subtrees
// are already heapified
void MinHeap::MinHeapify(int i)
{
    int l = left(i);
    int r = right(i);
    int smallest = i;

    if (l < heap_size && harr[l].element < harr[i].element)
        smallest = l;

    if (r < heap_size
        && harr[r].element < harr[smallest].element)
        smallest = r;

    if (smallest != i) {
        swap(&harr[i], &harr[smallest]);
        MinHeapify(smallest);
    }
}

// A utility function to swap two elements
void swap(MinHeapNode* x, MinHeapNode* y)
{
    MinHeapNode temp = *x;
    *x = *y;
    *y = temp;
}

// A utility function to print array elements
void printArray(int arr[], int size)
{
    for (int i = 0; i < size; i++)
        cout << arr[i] << " ";
}

// Driver's code
int main()
{
    // Change N at the top to change number of elements
    // in an array
    int arr[][N] = { { 2, 6, 12, 34 },
                     { 1, 9, 20, 1000 },
                     { 23, 34, 90, 2000 } };
    int K = sizeof(arr) / sizeof(arr[0]);

    // Function call
    int* output = mergeKArrays(arr, K);

    cout << "Merged array is " << endl;
    printArray(output, N * K);

    // Release the output array allocated by mergeKArrays
    delete[] output;
    return 0;
}
Input:
arr1: { 2, 6, 12, 34 },
arr2: { 1, 9, 20, 1000 },
arr3: { 23, 34, 90, 2000 }
Output:
Merged array is
1 2 6 9 12 20 23 34 34 90 1000 2000
Time Complexity: O(N K log K). Each of the N K elements passes through a min-heap of size K, and every insertion or removal takes O(log K) time.
Space Complexity: O(K) auxiliary space for the min-heap of K nodes. If the output is stored rather than printed, the output array needs a further O(N K).
Conclusion:
In conclusion, merging K sorted arrays is a problem often encountered in technical interviews. The naive approach copies every element into one array and sorts it in O(N K log(N K)) time. The merge-sort-style approach, which splits the arrays into groups and merges them from the bottom up, and the min-heap approach both reduce this to O(N K log K), and the min-heap needs only O(K) extra space beyond the output.
Frequently Asked Questions (FAQs) related to merging K sorted arrays
Below are some FAQs related to Merge K Sorted Arrays:
1. Why is the problem of merging K sorted arrays significant?
Merging K sorted arrays is a fundamental problem in computer science and has practical applications in various domains, including data analysis, database management, and efficient sorting algorithms.
2. What is the time complexity of merging K sorted arrays using the described approach?
The time complexity of the divide-and-conquer approach is O(N K log K), where N is the length of each array (or the average length, if the arrays differ in size). The log K factor is the number of levels at which the arrays are divided into groups.
3. Are there alternative methods for merging K sorted arrays?
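Yes, there are other techniques, such as using a priority queue (min-heap). It achieves the same O(N K log K) time complexity while needing only O(K) auxiliary space beyond the output array. Written in terms of the total number of elements M = N K, that bound is O(M log K).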
4. Can the described approach be extended to merging more than K sorted arrays?
Yes. K is only a parameter, so the same approach handles any number of sorted arrays: keep dividing the arrays into groups and merging them until a single sorted array remains.
5. What are the main challenges when implementing the merging of K sorted arrays?
The main challenges include managing indices while iterating through the arrays, efficiently selecting the minimum element for merging, and handling edge cases when arrays have different lengths.
6. Is there a trade-off between time complexity and memory usage in merging K sorted arrays?
Yes, some advanced techniques may optimize for time complexity but require more memory, while others optimize memory usage at the cost of slightly higher time complexity. The choice depends on the specific constraints of the problem.