CaltechTHESIS
  A Caltech Library Service

Test and Evaluation of Autonomous Systems: Reactive Test Synthesis and Task-Relevant Evaluation of Perception

Citation

Badithela, Apurva Srinivas (2024) Test and Evaluation of Autonomous Systems: Reactive Test Synthesis and Task-Relevant Evaluation of Perception. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/e8qz-rd26. https://resolver.caltech.edu/CaltechTHESIS:06022024-014038700

Abstract

Autonomous robotic systems have the potential for profound impact on our society -- legged and wheeled robots for search and rescue missions, drones for wildfire management, self-driving cars for improving mobility, and robotic space missions for exploration and repair of spacecraft. The complexity of these systems implies that formal guarantees during the design phase alone are not sufficient; mainstream deployment of these systems requires principled frameworks for test and evaluation, and verification and validation. This thesis studies two such challenges to mainstream deployment of these systems.

First, we consider the problem of evaluating perception models in a manner relevant to the system-level specification and the downstream planner. Perception and planning modules are often designed under different computational and mathematical paradigms. This thesis focuses on evaluating models for classification and detection tasks, and leverages confusion matrices, which are widely used in computer vision to evaluate object detection models, to derive probabilistic guarantees at the system level. However, not all perception errors are equally safety-critical, and traditional confusion matrices account for all objects equally. Thus, task-relevant metrics such as proposition-labeled confusion matrices are introduced. These are constructed by identifying propositional formulas relevant to the downstream planning logic and the system-level specification, and result in less conservative system-level guarantees. Using this analysis, fundamental tradeoffs in perception models are reflected in the tradeoffs of probabilistic guarantees. This framework is illustrated on a car-pedestrian example in simulation, and the confusion matrices are constructed from state-of-the-art detection models evaluated on the nuScenes dataset.
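As a rough illustration of how a confusion matrix can feed a probabilistic statement, the sketch below normalizes raw counts into conditional probabilities P(predicted label | true label). The labels, the counts, and the helper names are hypothetical and are not taken from the thesis or from nuScenes. The final step, which multiplies per-step probabilities, additionally assumes that detections at different time steps are independent; that is a simplifying assumption made here, not a claim about the thesis's analysis.

```python
# Hypothetical class labels and counts, for illustration only.
LABELS = ["ped", "obs", "empty"]

# COUNTS[i][j] = number of times label i was predicted when label j was true.
COUNTS = [
    [90, 5, 2],
    [6, 80, 8],
    [4, 15, 90],
]


def conditional_probabilities(counts):
    """Column-normalize counts so entry [i][j] estimates P(pred = i | true = j)."""
    n = len(counts)
    col_totals = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    return [[counts[i][j] / col_totals[j] for j in range(n)] for i in range(n)]


def prob_correct_every_step(probs, true_index, steps):
    """Probability of a correct prediction at each of `steps` time steps.

    Assumes (for this sketch only) that steps are independent.
    """
    return probs[true_index][true_index] ** steps


PROBS = conditional_probabilities(COUNTS)
ped = LABELS.index("ped")
print(PROBS[ped][ped])                         # P(pred = ped | true = ped)
print(prob_correct_every_step(PROBS, ped, 3))  # correct on three steps in a row
```

A proposition-labeled variant would group these entries by the propositions the planner actually uses, rather than by individual object classes.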

Second, we consider the problem of automatically synthesizing tests for autonomous robotic systems. These systems reason over both discrete variables (e.g., navigate left or right around an obstacle) and continuous variables (e.g., continuous trajectories). This thesis presents a flow-based approach for test environment synthesis which handles discrete variables and is also reactive to the system under test. Reactivity is important to account for uncertainties in system modeling, and to adapt to system behavior without knowledge of the system controller. These tests are synthesized from high-level specifications of desired behavior. Though the problem is shown to be NP-hard, a flow-based mixed-integer linear program formulation is used that scales well to medium-sized examples (e.g., >10,000 integer variables). The test environment can consist of static and reactive obstacles as well as dynamic test agents, whose strategies are synthesized to match the solution of the flow-based optimization. The overview of the approach is as follows. First, principles of automata theory are used to translate the high-level system and test objectives, and the non-deterministic abstraction of the system, into a network flow optimization. The solution of this optimization is then parsed into GR(1) formulas in linear temporal logic. These GR(1) formulas are used to synthesize reactive strategies of a dynamic test agent in a counterexample-guided fashion. We provide guarantees that, under the assumption of a well-designed system, the synthesized test strategy will realize the desired test behavior, and that the test strategy is reactive and least-restrictive. This framework is illustrated on several simulation and hardware experiments with quadrupeds, showing promise towards a layered approach to test and evaluation.
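To give a feel for what restricting a system through its environment can look like, here is a toy sketch on a small directed graph. It is not the mixed-integer linear program described above: it swaps in a brute-force search over edge removals, covers only static restrictions (no reactivity), and the graph, the waypoint interpretation of a test objective, and all function names are assumptions made for illustration. The search looks for a smallest set of edges to block so that every remaining path from the source to the target passes through a chosen waypoint, while at least one such path still exists.

```python
from itertools import combinations

# Hypothetical abstraction of the system: directed edges between states.
EDGES = [
    ("s", "a"), ("s", "b"), ("b", "a"),
    ("a", "t"), ("a", "i"), ("b", "i"), ("i", "t"),
]


def reachable(edges, start, goal, forbidden=None):
    """Depth-first reachability from start to goal, optionally avoiding one node."""
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for u, v in edges:
            if u == node and v not in seen and v != forbidden:
                seen.add(v)
                stack.append(v)
    return False


def smallest_restriction(edges, source, waypoint, target):
    """Brute-force a smallest edge set whose removal forces paths through waypoint."""
    for k in range(len(edges) + 1):
        for cut in combinations(edges, k):
            kept = [e for e in edges if e not in cut]
            # The system must still be able to reach the target via the waypoint.
            feasible = (reachable(kept, source, waypoint)
                        and reachable(kept, waypoint, target))
            # No path may reach the target while avoiding the waypoint.
            bypass = reachable(kept, source, target, forbidden=waypoint)
            if feasible and not bypass:
                return list(cut)
    return None


print(smallest_restriction(EDGES, "s", "i", "t"))
```

On this graph the search returns the single edge `("a", "t")`. Enumerating subsets is exponential in the number of edges, which is one reason an optimization formulation is needed beyond toy examples.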

Item Type:Thesis (Dissertation (Ph.D.))
Subject Keywords:Test and Evaluation, Control Theory, Formal Methods, Robotics
Degree Grantor:California Institute of Technology
Division:Engineering and Applied Science
Major Option:Control and Dynamical Systems
Awards:CMS and IST Gradient for Change Award, 2022.
Thesis Availability:Public (worldwide access)
Research Advisor(s):
  • Murray, Richard M.
Thesis Committee:
  • Ames, Aaron D.
  • Chandy, K. Mani
  • Burdick, Joel Wakeman (chair)
  • Wongpiromsarn, Tichakorn
  • Murray, Richard M.
Defense Date:21 May 2024
Funders:
  • Air Force Office of Scientific Research (AFOSR), grant number FA9550-22-1-0333
  • Air Force Office of Scientific Research (AFOSR), grant number FA9550-19-1-0302
Record Number:CaltechTHESIS:06022024-014038700
Persistent URL:https://resolver.caltech.edu/CaltechTHESIS:06022024-014038700
DOI:10.7907/e8qz-rd26
Related URLs:
  • https://arxiv.org/abs/2404.09888 (arXiv): Article used in Chapters 3 and 4.
  • https://arxiv.org/pdf/2303.1775 (arXiv): Preprint. Part of the article used in Chapter 2.
  • https://ieeexplore.ieee.org/document/10342465 (Publisher): Article used in Chapter 2.
  • https://ieeexplore.ieee.org/document/10160841 (Publisher): Article used in Chapter 4.
  • https://link.springer.com/chapter/10.1007/978-3-031-33170-1_17 (Publisher): Article used in Chapter 5.
  • https://link.springer.com/chapter/10.1007/978-3-031-06773-0_7 (Publisher): Article used in Chapter 5.
  • https://ieeexplore.ieee.org/document/9683611 (Publisher): Article used in Chapter 2.
  • https://arxiv.org/pdf/2108.05911 (arXiv): Article used in Chapter 3.
ORCID:
  • Badithela, Apurva Srinivas: 0000-0002-9788-2702
Default Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:16465
Collection:CaltechTHESIS
Deposited By: Apurva Badithela
Deposited On:04 Jun 2024 20:55
Last Modified:12 Jun 2024 22:48

Thesis Files

  • PDF (Full Thesis) - Final Version, 96MB. See Usage Policy.
  • PDF (Chapter 1) - Final Version, 12MB. See Usage Policy.
  • PDF (Chapter 2) - Final Version, 4MB. See Usage Policy.
  • PDF (Chapter 3) - Final Version, 1MB. See Usage Policy.
  • PDF (Chapter 4) - Final Version, 82MB. See Usage Policy.
  • PDF (Chapter 5) - Final Version, 3MB. See Usage Policy.
  • PDF (Chapter 6) - Final Version, 913kB. See Usage Policy.
