Retrieving maximum value from a range in unsorted array

I have an unsorted array. I have queries in which I give a range and then the maximum value from that range has to returned.
For example:

<code>array[]={23,17,9,45,78,2,4,6,90,1};

query(both inclusive): 2 6

answer: 78

</code>

<code>array[]={23,17,9,45,78,2,4,6,90,1}; query(both inclusive): 2 6 answer: 78 </code>

array[]={23,17,9,45,78,2,4,6,90,1};
query(both inclusive): 2 6
answer: 78

Which algorithm or data structure do I construct to quickly retrieve maximum value from any range. (There are a lot of queries)

EDIT:
This is indeed a simple version of the actual problem. I can have the array size as large as 100000 and the number of queries upto 100000. So I definitely require some preprocessing which’ll facilitate a fast query response.

I think you could construct some kind of binary tree where each node represents the maximum value its children:

<code> 78

45 78

23 45 78 6

23 17 9 45 78 2 4 6

</code>

            78           
     45            78     
  23    45     78      6  
23 17  9 45   78 2    4 6

Then you only need to find a way to determine which nodes you minimally need to check to find the maximum value in the range queried. In this example, to get the maximum value in the index range [2, 6] (inclusive) you would have max(45, 78, 4) instead of max(9, 45, 78, 2, 4). As the tree grows, the gain will be larger.

To complement ngoaho91’s answer.

The best way to solve this problem is using the Segment Tree data structure. This allows you to answer such queries in O(log(n)), that would mean the total complexity of your algorithm would be O(Qlogn) where Q is the number of queries. If you used the naive algorithm, the total complexity would be O(Qn) which is obviouslly slower.

There is, however, a drawback of the usage of Segment Trees. It takes up a lot of memory, but a lot of times you care less about memory than about speed.

I will briefly describe the algorithms used by this DS:

The segment tree is just an special case of a Binary Search Tree, where every node holds the value of the range it is assigned to. The root node, is assigned the range [0, n]. The left child is assigned the range [0, (0+n)/2] and the right child [(0+n)/2+1, n]. This way the tree will be built.

Create Tree:

<code>/*

A[] -> array of original values

tree[] -> Segment Tree Data Structure.

node -> the node we are actually in: remember left child is 2*node, right child is 2*node+1

a, b -> The limits of the actual array. This is used because we are dealing

with a recursive function.

int tree[SIZE];

void build_tree(vector<int> A, int node, int a, int b) {

if (a == b) { // We get to a simple element

tree[node] = A[a]; // This node stores the only value

}

else {

int leftChild, rightChild, middle;

leftChild = 2*node;

rightChild = 2*node+1; // Or leftChild+1

middle = (a+b) / 2;

build_tree(A, leftChild, a, middle); // Recursively build the tree in the left child

build_tree(A, rightChild, middle+1, b); // Recursively build the tree in the right child

tree[node] = max(tree[leftChild], tree[rightChild]); // The Value of the actual node,

//is the max of both of the children.

}

</code>

<code>/* A[] -> array of original values tree[] -> Segment Tree Data Structure. node -> the node we are actually in: remember left child is 2*node, right child is 2*node+1 a, b -> The limits of the actual array. This is used because we are dealing with a recursive function. */ int tree[SIZE]; void build_tree(vector<int> A, int node, int a, int b) { if (a == b) { // We get to a simple element tree[node] = A[a]; // This node stores the only value } else { int leftChild, rightChild, middle; leftChild = 2*node; rightChild = 2*node+1; // Or leftChild+1 middle = (a+b) / 2; build_tree(A, leftChild, a, middle); // Recursively build the tree in the left child build_tree(A, rightChild, middle+1, b); // Recursively build the tree in the right child tree[node] = max(tree[leftChild], tree[rightChild]); // The Value of the actual node, //is the max of both of the children. } } </code>

/*
    A[] -> array of original values
    tree[] -> Segment Tree Data Structure.
    node -> the node we are actually in: remember left child is 2*node, right child is 2*node+1
    a, b -> The limits of the actual array. This is used because we are dealing
                with a recursive function.
*/

int tree[SIZE];

void build_tree(vector<int> A, int node, int a, int b) {
    if (a == b) { // We get to a simple element
        tree[node] = A[a]; // This node stores the only value
    }
    else {
        int leftChild, rightChild, middle;
        leftChild = 2*node;
        rightChild = 2*node+1; // Or leftChild+1
        middle = (a+b) / 2;
        build_tree(A, leftChild, a, middle); // Recursively build the tree in the left child
        build_tree(A, rightChild, middle+1, b); // Recursively build the tree in the right child

        tree[node] = max(tree[leftChild], tree[rightChild]); // The Value of the actual node, 
                                                            //is the max of both of the children.
    }
}

Query Tree

<code>int query(int node, int a, int b, int p, int q) {

if (b < p || a > q) // The actual range is outside this range

return -INF; // Return a negative big number. Can you figure out why?

else if (p >= a && b >= q) // Query inside the range

return tree[node];

int l, r, m;

l = 2*node;

r = l+1;

m = (a+b) / 2;

return max(query(l, a, m, p, q), query(r, m+1, b, p, q)); // Return the max of querying both children.

}

</code>

<code>int query(int node, int a, int b, int p, int q) { if (b < p || a > q) // The actual range is outside this range return -INF; // Return a negative big number. Can you figure out why? else if (p >= a && b >= q) // Query inside the range return tree[node]; int l, r, m; l = 2*node; r = l+1; m = (a+b) / 2; return max(query(l, a, m, p, q), query(r, m+1, b, p, q)); // Return the max of querying both children. } </code>

int query(int node, int a, int b, int p, int q) {
    if (b < p || a > q) // The actual range is outside this range
        return -INF; // Return a negative big number. Can you figure out why?
    else if (p >= a && b >= q) // Query inside the range
        return tree[node];
    int l, r, m;
    l = 2*node;
    r = l+1;
    m = (a+b) / 2;
    return max(query(l, a, m, p, q), query(r, m+1, b, p, q)); // Return the max of querying both children.
}

If you need further explanation, just let me know.

BTW, Segment Tree also supports update of a single element or a range of elements in O(log n)

The best algorithm would be in O(n) time as below
let start, end be the index of the bounds of range

<code>int findMax(int[] a, start, end) {

max = Integer.MIN; // initialize to minimum Integer

for(int i=start; i <= end; i++)

if ( a[i] > max )

max = a[i];

return max;

}

</code>

<code>int findMax(int[] a, start, end) { max = Integer.MIN; // initialize to minimum Integer for(int i=start; i <= end; i++) if ( a[i] > max ) max = a[i]; return max; } </code>

int findMax(int[] a, start, end) {
   max = Integer.MIN; // initialize to minimum Integer

   for(int i=start; i <= end; i++) 
      if ( a[i] > max )
         max = a[i];

   return max; 
}

The binary tree/segment tree-based solutions are indeed pointing in the right direction. One might object that they require a lot of extra memory, however. There are two solutions to these problems:

Use an implicit data structure instead of a binary tree
Use an M-ary tree instead of a binary tree

The first point is that because the tree is highly structured, you can use a heap-like structure to implicitly define the tree rather than representing the tree with nodes, left and right pointers, intervals etc.. That saves a lot of memory with essentially no performance hit – you do need to perform a little more pointer arithmetic.

The second point is that, at the cost of a little more work during evaluation, you can use an M-ary tree rather than a binary tree. For instance if you use a 3-ary tree you will compute the max of 3 elements at a time, then 9 elements at a time, then 27, etc. The extra storage required is then N/(M-1) – you can prove using the geometric series formula. If you choose M = 11, for example, you will require 1/10th the storage of the binary tree method.

You can verify that these naive and optimized implementations in Python give the same results:

<code>class RangeQuerier(object):

#The naive way

def __init__(self):

pass

def set_array(self,arr):

#Set, and preprocess

self.arr = arr

def query(self,l,r):

try:

return max(self.arr[l:r])

except ValueError:

return None

</code>

<code>class RangeQuerier(object): #The naive way def __init__(self): pass def set_array(self,arr): #Set, and preprocess self.arr = arr def query(self,l,r): try: return max(self.arr[l:r]) except ValueError: return None </code>

class RangeQuerier(object):
    #The naive way
    def __init__(self):
        pass

    def set_array(self,arr):
        #Set, and preprocess
        self.arr = arr

    def query(self,l,r):
        try:
            return max(self.arr[l:r])
        except ValueError:
            return None

vs.

<code>class RangeQuerierMultiLevel(object):

def __init__(self):

self.arrs = []

self.sub_factor = 3

self.len_ = 0

def set_array(self,arr):

#Set, and preprocess

tgt = arr

self.len_ = len(tgt)

self.arrs.append(arr)

while len(tgt) > 1:

tgt = self.maxify_one_array(tgt)

self.arrs.append(tgt)

def maxify_one_array(self,arr):

sub_arr = []

themax = float('-inf')

for i,el in enumerate(arr):

themax = max(el,themax)

if i % self.sub_factor == self.sub_factor - 1:

sub_arr.append(themax)

themax = float('-inf')

return sub_arr

def query(self,l,r,level=None):

if level is None:

level = len(self.arrs)-1

if r <= l:

return None

int_size = self.sub_factor ** level

lhs,mid,rhs = (float('-inf'),float('-inf'),float('-inf'))

#Check if there's an imperfect match on the left hand side

if l % int_size != 0:

lnew = int(ceil(l/float(int_size)))*int_size

lhs = self.query(l,min(lnew,r),level-1)

l = lnew

#Check if there's an imperfect match on the right hand side

if r % int_size != 0:

rnew = int(floor(r/float(int_size)))*int_size

rhs = self.query(max(rnew,l),r,level-1)

r = rnew

if r > l:

#Handle the middle elements

mid = max(self.arrs[level][l/int_size:r/int_size])

return max(max(lhs,mid),rhs)

</code>

<code>class RangeQuerierMultiLevel(object): def __init__(self): self.arrs = [] self.sub_factor = 3 self.len_ = 0 def set_array(self,arr): #Set, and preprocess tgt = arr self.len_ = len(tgt) self.arrs.append(arr) while len(tgt) > 1: tgt = self.maxify_one_array(tgt) self.arrs.append(tgt) def maxify_one_array(self,arr): sub_arr = [] themax = float('-inf') for i,el in enumerate(arr): themax = max(el,themax) if i % self.sub_factor == self.sub_factor - 1: sub_arr.append(themax) themax = float('-inf') return sub_arr def query(self,l,r,level=None): if level is None: level = len(self.arrs)-1 if r <= l: return None int_size = self.sub_factor ** level lhs,mid,rhs = (float('-inf'),float('-inf'),float('-inf')) #Check if there's an imperfect match on the left hand side if l % int_size != 0: lnew = int(ceil(l/float(int_size)))*int_size lhs = self.query(l,min(lnew,r),level-1) l = lnew #Check if there's an imperfect match on the right hand side if r % int_size != 0: rnew = int(floor(r/float(int_size)))*int_size rhs = self.query(max(rnew,l),r,level-1) r = rnew if r > l: #Handle the middle elements mid = max(self.arrs[level][l/int_size:r/int_size]) return max(max(lhs,mid),rhs) </code>

class RangeQuerierMultiLevel(object):
    def __init__(self):
        self.arrs = []
        self.sub_factor = 3
        self.len_ = 0

    def set_array(self,arr):
        #Set, and preprocess
        tgt = arr
        self.len_ = len(tgt)
        self.arrs.append(arr)
        while len(tgt) > 1:
            tgt = self.maxify_one_array(tgt)
            self.arrs.append(tgt)

    def maxify_one_array(self,arr):
        sub_arr = []
        themax = float('-inf')
        for i,el in enumerate(arr):
            themax = max(el,themax)
            if i % self.sub_factor == self.sub_factor - 1:
                sub_arr.append(themax)
                themax = float('-inf')
        return sub_arr

    def query(self,l,r,level=None):
        if level is None:
            level = len(self.arrs)-1

        if r <= l:
            return None

        int_size = self.sub_factor ** level 

        lhs,mid,rhs = (float('-inf'),float('-inf'),float('-inf'))

        #Check if there's an imperfect match on the left hand side
        if l % int_size != 0:
            lnew = int(ceil(l/float(int_size)))*int_size
            lhs = self.query(l,min(lnew,r),level-1)
            l = lnew
        #Check if there's an imperfect match on the right hand side
        if r % int_size != 0:
            rnew = int(floor(r/float(int_size)))*int_size
            rhs = self.query(max(rnew,l),r,level-1)
            r = rnew

        if r > l:
            #Handle the middle elements
            mid = max(self.arrs[level][l/int_size:r/int_size])
        return max(max(lhs,mid),rhs)

try “segment tree” data structure
there are 2 step
build_tree() O(n)
query(int min, int max) O(nlogn)

http://en.wikipedia.org/wiki/Segment_tree

edit:

you guys just don’t read the wiki i sent!

this algorithm is:
– you traverse the array 1 time to build tree. O(n)
– next 100000000+ times you want to know max of any part of array, just call the query function. O(logn) for every query
– c++ implement here geeksforgeeks.org/segment-tree-set-1-range-minimum-query/
old algorithm is:
every query, just traverse the selected area and find.

so, if you gonna use this algorithm to process once, OK, it slower than old way.
but if you gonna process huge number of query(billion), it’s very efficient
you can generate text file like this, for test

line 1: 50000 random number from 0-1000000, split by ‘(space)'(it’s the array)
line 2: 2 random number from 1 to 50000, split by ‘(space)'(it’s the query)
…
line 200000: likes line 2, it’s random query too

this is the example problem, sorry but this is in vietnamese
http://vn.spoj.com/problems/NKLINEUP/
if you solve it by old way, you never pass.

You can achieve O(1) per query (with O(n log n) construction) using data structure called sparse table. For each power of 2 let’s save maximum for each segment of this length. Now given segment [l, r) you get maximum of maximums on [l+2^k) and [r-2^k,r) for appropriate k. They overlap but it’s OK

Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa Dịch vụ tổ chức sự kiện 5 sao Thông tin về chúng tôi Dịch vụ sinh nhật bé trai Dịch vụ sinh nhật bé gái Sự kiện trọn gói Các tiết mục giải trí Dịch vụ bổ trợ Tiệc cưới sang trọng Dịch vụ khai trương Tư vấn tổ chức sự kiện Hình ảnh sự kiện Cập nhật tin tức Liên hệ ngay Thuê chú hề chuyên nghiệp Tiệc tất niên cho công ty Trang trí tiệc cuối năm Tiệc tất niên độc đáo Sinh nhật bé Hải Đăng Sinh nhật đáng yêu bé Khánh Vân Sinh nhật sang trọng Bích Ngân Tiệc sinh nhật bé Thanh Trang Dịch vụ ông già Noel Xiếc thú vui nhộn Biểu diễn xiếc quay đĩa Dịch vụ tổ chức tiệc uy tín Khám phá dịch vụ của chúng tôi Tiệc sinh nhật cho bé trai Trang trí tiệc cho bé gái Gói sự kiện chuyên nghiệp Chương trình giải trí hấp dẫn Dịch vụ hỗ trợ sự kiện Trang trí tiệc cưới đẹp Khởi đầu thành công với khai trương Chuyên gia tư vấn sự kiện Xem ảnh các sự kiện đẹp Tin mới về sự kiện Kết nối với đội ngũ chuyên gia Chú hề vui nhộn cho tiệc sinh nhật Ý tưởng tiệc cuối năm Tất niên độc đáo Trang trí tiệc hiện đại Tổ chức sinh nhật cho Hải Đăng Sinh nhật độc quyền Khánh Vân Phong cách tiệc Bích Ngân Trang trí tiệc bé Thanh Trang Thuê dịch vụ ông già Noel chuyên nghiệp Xem xiếc khỉ đặc sắc Xiếc quay đĩa thú vị

Filed under: softwareengineering - @ 09:55

Thẻ: algorithms, array

Thiết kế website giá rẻ

Danh mục

Retrieving maximum value from a range in unsorted array