A Caltech Library Service

Distributed Optimization and Data Market Design


London, Palma Alise den Nijs (2017) Distributed Optimization and Data Market Design. Master's thesis, California Institute of Technology. doi:10.7907/Z9SX6B76.


We consider algorithms for distributed optimization and their applications. In this thesis, we propose a new approach for distributed optimization based on an emerging area of theoretical computer science – local computation algorithms. The approach is fundamentally different from existing methodologies and provides a number of benefits, such as robustness to link failure and adaptivity to dynamic settings. Specifically, we develop an algorithm, LOCO, that given a convex optimization problem P with n variables and a “sparse” linear constraint matrix with m constraints, provably finds a solution as good as that of the best online algorithm for P using only O(log(n + m)) messages with high probability. The approach is not iterative and communication is restricted to a localized neighborhood. In addition to analytic results, we show numerically that the performance improvements over classical approaches for distributed optimization are significant, e.g., it uses orders of magnitude less communication than ADMM.

We also consider the operations of a geographically distributed cloud data market. We consider design decisions that include which data to purchase (data purchasing) and where to place or replicate the data for delivery (data placement). We show that a joint approach to data purchasing and data placement within a cloud data market improves operating costs. This problem can be viewed as a facility location problem, and is thus NP-hard. However, we give a provably optimal algorithm for the case of a data market consisting of a single data center, and then generalize the result from the single data center setting in order to develop a near-optimal, polynomial-time algorithm for a geo-distributed data market. The resulting design, Datum, decomposes the joint purchasing and placement problem into two subproblems, one for data purchasing and one for data placement, using a transformation of the underlying bandwidth costs. We show, via a case study, that Datum is near-optimal (within 1.6%) in practical settings.

Item Type:Thesis (Master's thesis)
Subject Keywords:distributed optimization, algorithms, data center
Degree Grantor:California Institute of Technology
Division:Engineering and Applied Science
Major Option:Computer Science
Thesis Availability:Public (worldwide access)
Research Advisor(s):
  • Wierman, Adam C.
Thesis Committee:
  • None, None
Defense Date:19 May 2017
Non-Caltech Author Email:londonpalma (AT)
Funding AgencyGrant Number
NSF Graduate Research Fellowship ProgramUNSPECIFIED
Record Number:CaltechTHESIS:05092017-112643697
Persistent URL:
Related URLs:
URLURL TypeDescription adapted and extended for thesis.
Default Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:10158
Deposited By: Palma London
Deposited On:05 Jun 2017 22:05
Last Modified:28 Aug 2020 18:14

Thesis Files

PDF - Final Version
See Usage Policy.


Repository Staff Only: item control page