Citation
London, Palma Alise den Nijs (2017) Distributed Optimization and Data Market Design. Master's thesis, California Institute of Technology. doi:10.7907/Z9SX6B76. https://resolver.caltech.edu/CaltechTHESIS:05092017-112643697
Abstract
We consider algorithms for distributed optimization and their applications. In this thesis, we propose a new approach for distributed optimization based on an emerging area of theoretical computer science – local computation algorithms. The approach is fundamentally different from existing methodologies and provides a number of benefits, such as robustness to link failure and adaptivity to dynamic settings. Specifically, we develop an algorithm, LOCO, that given a convex optimization problem P with n variables and a “sparse” linear constraint matrix with m constraints, provably finds a solution as good as that of the best online algorithm for P using only O(log(n + m)) messages with high probability. The approach is not iterative and communication is restricted to a localized neighborhood. In addition to analytic results, we show numerically that the performance improvements over classical approaches for distributed optimization are significant, e.g., it uses orders of magnitude less communication than ADMM.
We also consider the operations of a geographically distributed cloud data market. We consider design decisions that include which data to purchase (data purchasing) and where to place or replicate the data for delivery (data placement). We show that a joint approach to data purchasing and data placement within a cloud data market improves operating costs. This problem can be viewed as a facility location problem, and is thus NP-hard. However, we give a provably optimal algorithm for the case of a data market consisting of a single data center, and then generalize the result from the single data center setting in order to develop a near-optimal, polynomial-time algorithm for a geo-distributed data market. The resulting design, Datum, decomposes the joint purchasing and placement problem into two subproblems, one for data purchasing and one for data placement, using a transformation of the underlying bandwidth costs. We show, via a case study, that Datum is near-optimal (within 1.6%) in practical settings.
Item Type: | Thesis (Master's thesis) | ||||||
---|---|---|---|---|---|---|---|
Subject Keywords: | distributed optimization, algorithms, data center | ||||||
Degree Grantor: | California Institute of Technology | ||||||
Division: | Engineering and Applied Science | ||||||
Major Option: | Computer Science | ||||||
Thesis Availability: | Public (worldwide access) | ||||||
Research Advisor(s): |
| ||||||
Thesis Committee: |
| ||||||
Defense Date: | 19 May 2017 | ||||||
Non-Caltech Author Email: | londonpalma (AT) gmail.com | ||||||
Funders: |
| ||||||
Record Number: | CaltechTHESIS:05092017-112643697 | ||||||
Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:05092017-112643697 | ||||||
DOI: | 10.7907/Z9SX6B76 | ||||||
Related URLs: |
| ||||||
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. | ||||||
ID Code: | 10158 | ||||||
Collection: | CaltechTHESIS | ||||||
Deposited By: | Palma London | ||||||
Deposited On: | 05 Jun 2017 22:05 | ||||||
Last Modified: | 28 Aug 2020 18:14 |
Thesis Files
|
PDF
- Final Version
See Usage Policy. 914kB |
Repository Staff Only: item control page