Algorithms for non-uniform size data placement on parallel disks

Title: Algorithms for non-uniform size data placement on parallel disks
Publication Type: Journal Articles
Year of Publication: 2006
Authors: Kashyap S, Khuller S
Journal: Journal of Algorithms
Pagination: 144-167
Date Published: 2006/08
ISBN Number: 0196-6774

We study an optimization problem that arises in the context of data placement in a multimedia storage system. We are given a collection of $M$ multimedia objects (data items) that need to be assigned to a storage system consisting of $N$ disks $d_1, d_2, \ldots, d_N$. We are also given sets $U_1, U_2, \ldots, U_M$ such that $U_i$ is the set of clients seeking the $i$th data item. Data item $i$ has size $s_i$. Each disk $d_j$ is characterized by two parameters: its storage capacity $C_j$, which is the maximum total size of data items that may be assigned to it, and its load capacity $L_j$, which is the maximum number of clients that it can serve. The goal is to find a placement of data items on disks and an assignment of clients to disks that maximizes the total number of clients served, subject to the capacity constraints of the storage system.

We study this data placement problem for homogeneous storage systems, where all the disks are identical. We assume that all disks have a storage capacity of $k$ and a load capacity of $L$. Previous work on this problem has assumed that all data items have unit size, in other words $s_i = 1$ for all $i$. Even for this case, the problem is NP-hard. For the case where $s_i \in \{1, \ldots, \Delta\}$ for some constant $\Delta$, we develop a polynomial time approximation scheme (PTAS). This result is obtained by developing two algorithms, one that works for constant $k$ and one that works for arbitrary $k$. The algorithm for arbitrary $k$ guarantees a solution in which at least a $\frac{k-\Delta}{k+\Delta}\left(1 - \frac{1}{\left(1+\sqrt{k/(2\Delta)}\right)^2}\right)$-fraction of all clients are assigned to a disk (under certain assumptions). In addition, we develop an algorithm for which we can prove tight bounds when $s_i \in \{1, 2\}$. In fact, we can show that a $\left(1 - \frac{1}{\left(1+\sqrt{\lfloor k/2 \rfloor}\right)^2}\right)$-fraction of all clients can be assigned (under certain natural assumptions), regardless of the input distribution.
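
To make the problem statement concrete, the following Python sketch checks a candidate solution for the homogeneous case and evaluates the two fractions quoted in the abstract. It is an illustration only, not the authors' algorithm: the function names and the data layout (lists of sets and dictionaries) are hypothetical, client sets are represented only by their sizes |U_i|, and the bound formulas assume the abstract's expressions read ((k − Δ)/(k + Δ))(1 − 1/(1 + √(k/(2Δ)))²) and 1 − 1/(1 + √⌊k/2⌋)².

```python
import math


def served_clients(sizes, demands, k, L, placement, assignment):
    """Count clients served by a candidate solution, or raise ValueError.

    Hypothetical data layout (not from the paper):
      sizes[i]      -- size s_i of data item i
      demands[i]    -- |U_i|, the number of clients seeking item i
      k, L          -- storage and load capacity shared by every disk
      placement[j]  -- set of items stored on disk j
      assignment[j] -- dict mapping item -> number of its clients served by disk j
    """
    served_per_item = [0] * len(sizes)
    for j, (items, loads) in enumerate(zip(placement, assignment)):
        # Storage constraint: total size of items on the disk is at most k.
        if sum(sizes[i] for i in items) > k:
            raise ValueError(f"disk {j} exceeds storage capacity")
        # Load constraint: the disk serves at most L clients.
        if sum(loads.values()) > L:
            raise ValueError(f"disk {j} exceeds load capacity")
        for i, count in loads.items():
            # A client can only be served by a disk that stores its item.
            if i not in items:
                raise ValueError(f"disk {j} serves item {i} without storing it")
            served_per_item[i] += count
    for i, count in enumerate(served_per_item):
        # Cannot serve more clients of an item than actually request it.
        if count > demands[i]:
            raise ValueError(f"item {i} serves more clients than requested")
    return sum(served_per_item)


def bound_arbitrary_k(k, delta):
    """Fraction quoted for sizes in {1, ..., delta} and arbitrary k (assumed form)."""
    return (k - delta) / (k + delta) * (1 - 1 / (1 + math.sqrt(k / (2 * delta))) ** 2)


def bound_sizes_1_2(k):
    """Fraction quoted for sizes in {1, 2} (assumed form)."""
    return 1 - 1 / (1 + math.sqrt(k // 2)) ** 2


# Small example: three items, two disks with k = 3 and L = 5.
sizes = [1, 2, 1]
demands = [3, 4, 2]
placement = [{0, 1}, {1, 2}]
assignment = [{0: 3, 1: 2}, {1: 2, 2: 2}]
total = served_clients(sizes, demands, 3, 5, placement, assignment)
```

In this example item 1 is stored on both disks so that its four clients can be split between them, which is the kind of replication decision the placement problem has to make. The checker only verifies feasibility; finding a placement that attains the quoted fractions is the subject of the paper itself.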