Citation
Azizan Ruhi, Navid (2021) Large-Scale Intelligent Systems: From Network Dynamics to Optimization Algorithms. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/j494-j572. https://resolver.caltech.edu/CaltechTHESIS:11182020-210226861
Abstract
The expansion of large-scale technological systems such as electrical grids, transportation networks, health care systems, telecommunication networks, the Internet (of things), and other societal networks has created numerous challenges and opportunities at the same time. These systems are often not yet as robust, efficient, sustainable, or smart as we would want them to be. Fueled by the massive amounts of data generated by all these systems, and with the recent advances in making sense out of data, there is a strong desire to make them more intelligent. However, developing large-scale intelligent systems is a multifaceted problem, involving several major challenges. First, large-scale systems typically exhibit complex dynamics due to the large number of entities interacting over a network. Second, because the system is composed of many interacting entities, that make decentralized (and often self-interested) decisions, one has to properly design incentives and markets for such systems. Third, the massive computational needs caused by the scale of the system necessitate performing computations in a distributed fashion, which in turn requires devising new algorithms. Finally, one has to create algorithms that can learn from the copious amounts of data and generalize well. This thesis makes several contributions related to each of these four challenges.
Analyzing and understanding the network dynamics exhibited in societal systems is crucial for developing systems that are robust and efficient. In Part I of this thesis, we study one of the most important families of network dynamics, namely, that of epidemics, or spreading processes. Studying such processes is relevant for understanding and controlling the spread of, e.g., contagious diseases among people, ideas or fake news in online social networks, computer viruses in computer networks, or cascading failures in societal networks. We establish several results on the exact Markov chain model and the nonlinear "mean-field" approximations for various kinds of epidemics (i.e., SIS, SIRS, SEIRS, SIV, SEIV, and their variants).
Designing incentives and markets for large-scale systems is critical for their efficient operation and ensuring an alignment between the agents' decentralized decisions and the global goals of the system. To that end, in Part II of this thesis, we study these issues in markets with non-convex costs as well as networked markets, which are of vital importance for, e.g., the smart grid. We propose novel pricing schemes for such markets, which satisfy all the desired market properties. We also reveal issues in the current incentives for distributed energy resources, such as renewables, and design optimization algorithms for efficient management of aggregators of such resources.
With the growing amounts of data generated by large-scale systems, and the fact that the data may already be dispersed across many units, it is becoming increasingly necessary to run computational tasks in a distributed fashion. Part III concerns developing algorithms for distributed computation. We propose a novel consensus-based algorithm for the task of solving large-scale systems of linear equations, which is one of the most fundamental problems in linear algebra, and a key step at the heart of many algorithms in scientific computing, machine learning, and beyond. In addition, in order to deal with the issue of heterogeneous delays in distributed computation, caused by slow machines, we develop a new coded computation technique. In both cases, the proposed methods offer significant speed-ups relative to the existing approaches.
Over the past decade, deep learning methods have become the most successful learning algorithms in a wide variety of tasks. However, the reasons behind their success (as well as their failures in some respects) are largely unexplained. It is widely believed that the success of deep learning is not just due to the deep architecture of the models, but also due to the behavior of the optimization algorithms, such as stochastic gradient descent (SGD), used for training them. In Part IV of this thesis, we characterize several properties, such as minimax optimality and implicit regularization, of SGD, and more generally, of the family of stochastic mirror descent (SMD). While SGD performs an implicit regularization, this regularization can be effectively controlled using SMD with a proper choice of mirror, which in turn can improve the generalization error.
Item Type: | Thesis (Dissertation (Ph.D.)) | ||||
---|---|---|---|---|---|
Subject Keywords: | Intelligent systems; networks; optimization; epidemics; markets; smart grid; distributed computation; machine learning; artificial intelligence | ||||
Degree Grantor: | California Institute of Technology | ||||
Division: | Engineering and Applied Science | ||||
Major Option: | Computing and Mathematical Sciences | ||||
Thesis Availability: | Public (worldwide access) | ||||
Research Advisor(s): |
| ||||
Thesis Committee: |
| ||||
Defense Date: | 25 August 2020 | ||||
Record Number: | CaltechTHESIS:11182020-210226861 | ||||
Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:11182020-210226861 | ||||
DOI: | 10.7907/j494-j572 | ||||
Related URLs: | |||||
ORCID: |
| ||||
Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. | ||||
ID Code: | 14001 | ||||
Collection: | CaltechTHESIS | ||||
Deposited By: | Navid Azizan Ruhi | ||||
Deposited On: | 03 Dec 2020 20:46 | ||||
Last Modified: | 02 Nov 2021 00:06 |
Thesis Files
PDF
- Final Version
See Usage Policy. 3MB |
Repository Staff Only: item control page