Capital Area Theory Seminar: Spring 2000

The Capital Area Theory Symposia is an NSF sponsored series of symposia in theoretical computer science bringing computer scientists from around the world to the Capital area. The Symposia are given at the University of Maryland in cooperation with the Computer Science Department and UMIACS. NSF support under grant CCR-9732907 is gratefully acknowledged.

Talks

If you'd like to meet the speaker, send e-mail to gasarch@cs.umd.edu.

Abstracts are available.

Contact: Send email to gasarch@umiacs.umd.edu for additional information. To join the theory-local mailing list, send email to smith@cs.umd.edu.

Talks from previous semesters

Other cats

Abstracts

The Non-Recursive Power of Erroneous Computation

Andreas Jakoby

We present two new complexity classes which are based on a complexity class C and an error probability function F. The first, F-Err-C, reflects the (weak) feasibility of problems that can be computed within the error bound F. As a more adequate measure to investigate lower bounds we introduce F-Err_io-C where the error is infinitely often bounded by the function F. These definitions generalize existing models of feasible erroneous computations and cryptographic intractability.

We identify meaningful bounds for the error function and derive new diagonalizing techniques. These techniques are applied to known time hierarchies to investigate the influence of error bound. It turns out that in the limit a machine with slower running time cannot predict the diagonal language within a significantly smaller error probability than 1/2.

Further, we investigate two classical non-recursive problems: the halting problem and the Kolmogorov complexity problem. We present strict lower bounds proving that any heuristic algorithm claiming to solve one of these problems makes unrecoverable errors with constant probability. Up to now it was only known that infinitely many errors will occur.


Caching for Broadcast Disk Systems

Vincenzo Liberatore

UMIACS

Broadcast disks are used to disseminate data from a server to clients. Broadcast disk dissemination is initiated by the server and follows a periodic schedule. We investigate caching strategies for broadcast disk clients. We describe a new replacement strategy, prove that is optimally competitive,and show that it outperforms previous online caching strategies on both synthetic and Web workloads.


Statistical Zero-Knowledge: A Survey of Recent Developments

Salil Vadhan

Zero-knowledge proofs, introduced by Goldwasser, Micali, and Rackoff in 1985, are fascinating constructs which enable one party to convince another of an assertion without revealing anything other than the validity of the assertion. *Statistical* zero-knowledge proofs are a particular type of such proofs in which the condition that the verifier learns nothing is interpreted in a strong statistical sense. In this talk, we survey a number of recent results which have given us a much more refined understanding of statistical zero-knowledge proofs and the class SZK of languages ("assertions") which possess such proofs.

Particular items of focus in this survey are

- The role of Okamoto's theorem (1996) that any SZK proof can be converted into a "public coin" one in facilitating these recent improvements in our understanding of SZK.

- The use of "complete problems" to obtain new characterizations of SZK and to reduce the study of the class to a single problem (as first seen in [Sahai, Vadhan, 1997]).

We illustrate the benefits of these two tools, by surveying some of the results that have been obtained:

- Constructing statistical zero-knowledge proofs for complex assertions built out of simpler assertions already shown to be in SZK [Sahai, Vadhan, 1997].

- Converting honest verifier SZK proofs to any-verifier SZK proofs [Goldreich, Sahai, Vadhan, 1998].

- Extending the theory to "noninteractive" SZK proofs [De Santis, Di Crescenzo, Persiano, Yung, 1998] and using this to relate SZK to "noninteractive" SZK [Goldreich, Sahai, Vadhan, 1999].


Private Information Retrieval: Protocols and Complexity

Tal Malkin

Private information retrieval (PIR) is a relatively new line of research in cryptography, focusing on methods for extracting information from a database while preventing the owner of the database from learning which record is being retrieved (that is, guaranteeing user privacy). In this talk we survey several results in this area, addressing issues of efficiency, security, and complexity. More specifically, after introducing the basic PIR model, we explore the following aspects:

- We study the cryptographic assumptions necessary to implement single server PIR with low communication. One of the implications of our result is that PIR can be used to implement secure protocols for any multi-party computation.

- We introduce the model of symmetric PIR (SPIR) where in addition to the user privacy, data privacy is also guaranteed. We show how to efficiently transform PIR protocols into SPIR protocols, for both the information-theoretic and computational settings.

- We introduce the random server model for PIR, and show how to use it to solve security and efficiency problems inherent in all previous PIR solutions.

The talk is based on several joint works, all done while the speaker was at the Massachusetts Institute of Technology.


Approximation algorithms in computational geometry.

Sariel Har-Peled

Duke University

Geometric approximation algorithms had emerged in recent years as a practical approach in solving problems where the exact algorithms proved to be infeasible. The combinatorial and numerical intractability in implementing exact geometric algorithms had further increased their attractiveness. As such, geometric approximation algorithms are being used in several areas, including graphics, geometric information systems, databases, robotics and simulations.

In this talk, we will present several geometric approximation algorithms. Algorithms for shortest path on polyhedral surfaces, minimum volume bounding boxes in 3d, and multiresolution


Two design changes for genetic mapping projects: theoretical justification and experimental results

Daniel Brown

Cornell

Genetic mapping experiments discover the location of many sites, or "markers," on the genome of a species. Some of these experiments are extremely expensive, computationally intense, and time consuming, and involve millions of dollars, tens of thousands of markers, and years of experimental time.

We propose two changes to these experiments: first, to perform much of the experimentation on only a sample of an experimental population, and second, to use a different method when determining the location of new markers. In the first of these changes, we model the problem of selecting a good mapping sample as a discrete optimization problem, based on the k-center problem. We discuss several heuristic methods for choosing good samples, including linear programming with randomized rounding and simpler greedy methods, and show results for several existing experimental populations.

In the second of these changes, we model the problem of placing new markers as the problem of locating them into "bins" of the genome, which are terminated by biological breakpoints. This approach differs markedly from existing methods, which instead try to determine the correct order of all of the new markers from all possible permutations. Our approach has the advantages that it is very accurate and computationally and experimentally feasible. We show preliminary results which demonstrate that our methods are of high quality and highly compatible with the sample selection approaches we present.

Our two changes offer the possibility of more accurate, less expensive, and faster genetic mapping experiments.

Joint work with Todd Vision, David Shmoys, Steve Tanksley and Rick Durrett


Approximation Algorithms for Facility Location Problems.

Sudipto Guha

Stanford University

The Facility Location problem has been one of the most studied problems in the Operations Research literature and recently has received a lot of attention from the Computer Science community. This problem and its variant, the K-Median problem, have been used in a wide range of applications including web-caching, clustering and so forth. However both these problems lie in a class of NP-Hard optimization problems and elude efficient solutions. A recent theme of research for such problems has been to find efficient heuristics which provide solutions that are close to optimal. In this talk we will investigate approximation algorithms for these problems. We will present various approaches and some applications of these approximation algorithms and relevant schemes in clustering large datasets.


On the Relationship between Theoretical and Experimental Approaches to NP-Hard Scheduling Problems

R. N. Uma

Polytechnic

We discuss, from both a theoretical and a practical perspective, the relationship between linear-programming based lower bounds and combinatorial lower bounds for some scheduling problems. We evaluate the performance of heuristics based on these lower bounds. Finally we discuss the implications for real-life scheduling problems based on experiments with actual application data.

(This is joint work with Martin W.P.Savelsbergh (Georgia Institute of Technology), Joel Wein (Akamai Technologies and Polytechnic University) and David P. Williamson (IBM T.J. Watson Research Center).)


Approximating general covering and network design problems

Goran Konjevood

Carnegie Mellon

We study network design problems that generalize standard connectivity requirements by allowing choice among a set of terminals. An example is the group Steiner problem, where subsets of vertices in a weighted graph are given and the goal is to find a minimum-weight connected subgraph which contains at least one vertex of each subset (group). Most of these problems include the classical set covering problem as a (very) special case. For instance, the group Steiner problem on a star is equivalent to set cover.

We provide approximation algorithms and/or hardness of approximation results for the following problems: group Steiner, covering Steiner (generalization of group Steiner where each group also has a coverage requirement that specifies the minimum number of vertices of the group to be chosen; this problem also includes as special cases the k-MST and set multicover problems), red-blue set cover (a finite set system where every element is colored red or blue is given, with the goal of finding a cover for blue elements which covers the least possible number of red elements).


Scheduling Data Delivery in Wireless Servers.

Muthu

AT&T Labs -- Research

BACKGROUND: There is increasing wireless support for data networks with the abundance of mobile clients (palm/hand-held computers, smart devices) as well as the rise in wireless ``data-centric'' applications (traffic information systems, stock/sports tickers and wireless internet access). This requires support infrastructure via data servers such as portable kiosks, infostations, etc. for timely and efficient delivery of information.

QUESTIONS: We focus on efficient scheduling of data delivery to improve responsiveness of wireless data servers. What is the measure of responsiveness for data delivery? Traditional measures are related to response time, but stretch (slowdown) may be more relevant. What is the mechanism for data delivery? Both unicasting and broadcasting are relevant. multiple requests simultaneously, single or multiple channel, etc.? Many different scheduling problems arise, some quite novel.

THIS TALK: We will present new online and offline algorithms for scheduling problems in this context, mainly focused on the stretch measure for scheduling data delivery. This will be the bulk of the talk. We will also present experimental results comparing the different algorithms on delivering web data. Many open issues need to be understood before a ultimate solution emerges for data delivery in wireless servers. We will also present a list of open scheduling problems that are likely to be relevant.


Chernoff Bounds and Equitable Colorings

Sriram Pemmaraju

Department of Mathematics, IIT Bombay

An equitable coloring is a graph coloring in which any two color classes differ in size by at most 1. A deep result of Hajnal and Szemeredi shows that any graph with maximum vertex degree d has a (d+1)-equitable coloring. The Chernoff-Hoeffding bounds are sharp upper bounds on the tails of the distribution of the sum of independent random variables. I will describe how the Hajnal-Szemeredi result can be used to extend Chernoff-Hoeffding bounds to the sum of identical random variables that exhibit a substantial amount of correlatedness. This novel connection implies that sharper Chernoff-Hoeffding bounds can be obtained by showing the existence of smaller equitable or proportional graph colorings. I will end the talk with a conjecture on small proportional colorings whose truth will lead to an extremely sharp Chernoff-Hoeffding type bound.


The Communication Complexity of Elimination.

Bill Gasarch

UMD

Say Alice has x \in {0,1}^n and Bob has y\in {0,1}^n and they want to see if x=y. But the catch is that they want to communicate as few bits as possible. Communication Complexity studies problems like this. This problem REQUIRES n+1 bits to be communicated (Alice sends x to Bob, Bob sends back a 1 if x=y and a 0 otherwise.) What if Alice has k strings x_1,x_2,...,x_k \in {0,1}^n and Bob has k strings y_1,y_2,...,y_k \in {0,1}^n, and they want to know does x_1=y_1?, does x_2=y_2?, ..., does x_k=y_k? (so they to determine k bits). They COULD do this with kn+k bits. Can they do better? It is known that they cannot.

All of the above could have been asked with a function f: {0,1}^n \times {0,n}^n --> {0,1} instead of EQUALITY.

We look at a related problem. Let f be as above. What if Alice has k strings x_1,x_2,...,x_k \in {0,1}^n and Bob has k strings y_1,y_2,...,y_k \in {0,1}^n, and they want to ELIMINATE one possiblity for f(x_1,y_1)f(x_2,y_2)...f(x_k,y_k). That is, they want to produce b\in {0,1}^k such that f(x_1,y_1)f(x_2,y_2)...f(x_k,y_k) NOTEQUAL b.

We investigate how hard this is for several functions f.

NOTE: This talk requirs no prior knowledge. NOTHING will be proven in detail!

(Joint work with: Ambainis, Buhrman, Kalyanasundaram, Torenvliet).


Approximation Schemes for Average Weighted Completion Time Scheduling

Chandra Chekuri

Two well studied objectives in scheduling theory are makespan (the time to finish all jobs in a schedule) and average (weighted) completion time (sum of (weighted) completion times/(number of jobs)). The latter objective is the appropriate one when the throughput of job completions is the the measure of interest. We address the problem of scheduling $n$ independent jobs on a set of $m$ parallel machines when jobs have release dates. We seek to minimize average weighted completion time. We present the first known polynomial time approximation schemes for several variants. Our schemes are efficient: for all variants the running time for a $(1+\eps)$ approximation is of the form $f(1/\eps,m)poly(n)$. In this talk we will present a PTAS for the case of a single machine and briefly the case of parallel identical machines. Time permitting we will sketch the ideas for the other variants. This is joint work with Sanjeev Khanna (U. Pennsylvania).


Time-Space Tradeoffs for Satisfiability

Dieter van Melkebeek

DIMACS

Satisfiability is the problem of deciding whether a given propositional formula has a satisfying assignment. It is the seminal NP-complete problem and is of major practical importance. Although we expect satisfiability to take exponential time in the worst case, we do not know how to prove that no linear-time algorithm exists on a general-purpose random-access machine.

In recent years we have seen a new approach to establish superlinear time lower bounds for machines that only use a small amount of space. We will survey the underlying ideas and present the strongest result of this type so far: Satisfiability cannot be solved on general-purpose random-access machines running in n^{1.618} time and subpolynomial space. In general, for any constant c less than the golden ratio, we prove that satisfiability cannot be solved in n^c time and n^d space for some positive constant d depending on c.

We obtain these results by establishing time-space tradeoffs for nondeterministic linear time. The same technique applies to higher levels of the polynomial-time hierarchy. It also yields new lower bounds on conondeterministic versus nondeterministic computation.

This is joint work with Lance Fortnow.


Adrian Vetta

The task of finding small spanning subgraphs of a prescibed connectivity is a fundamental problem in network optimization. I will present improved approximation algorithms for the following basic problems; the minimum 2-edge connected and the minimum 2-vertex connected subgraphs in undirected graphs and the minimum strongly connected subgraph in directed graphs. The performance guarantees obtained are 4/3, 4/3 and 3/2 respectively. The first result verifies one implication of the well-known 4/3-Conjecture for the Travelling Salesman Problem with respect to the subtour constraints. The basic tools used in the algorithms are techniques from matching theory and decomposition theorems that allow us to concentrate upon relatively simple graphs. Work with Santosh Vempala.


Vincenzo Liberatore / vliberatore@acm.org