Test and Evaluation of Autonomous Systems: Reactive Test Synthesis and Task-Relevant Evaluation of Perception

Citation

Badithela, Apurva Srinivas (2024) Test and Evaluation of Autonomous Systems: Reactive Test Synthesis and Task-Relevant Evaluation of Perception. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/e8qz-rd26. https://resolver.caltech.edu/CaltechTHESIS:06022024-014038700

Abstract

Autonomous robotic systems have the potential for profound impact on our society: legged and wheeled robots for search and rescue missions, drones for wildfire management, self-driving cars for improving mobility, and robotic space missions for exploration and repair of spacecraft. The complexity of these systems implies that formal guarantees during the design phase alone are not sufficient; mainstream deployment of these systems requires principled frameworks for test and evaluation, and for verification and validation. This thesis studies two such challenges to mainstream deployment of these systems.

First, we consider the problem of evaluating perception models in a manner relevant to the system-level specification and the downstream planner. Perception and planning modules are often designed under different computational and mathematical paradigms. This thesis focuses on evaluating models for classification and detection tasks, and leverages confusion matrices, which are widely used in computer vision to evaluate object detection models, to derive probabilistic guarantees at the system level. However, not all perception errors are equally safety-critical, and traditional confusion matrices account for all objects equally. Thus, task-relevant metrics such as proposition-labeled confusion matrices are introduced. These are constructed by identifying propositional formulas relevant to the downstream planning logic and the system-level specification, and result in less conservative system-level guarantees. Using this analysis, fundamental tradeoffs in perception models are reflected in the tradeoffs of probabilistic guarantees. This framework is illustrated on a car-pedestrian example in simulation, and the confusion matrices are constructed from state-of-the-art detection models evaluated on the nuScenes dataset.
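
As a rough illustration of the idea above, and not the construction used in the thesis, the following sketch normalizes a confusion matrix into conditional probabilities and then merges classes according to a single proposition ("a pedestrian is present"). The counts, the class names, and the helper names row_normalize and collapse are all hypothetical, and the convention that rows hold the true class is an assumption made for this example.

```python
from typing import List


def row_normalize(counts: List[List[int]]) -> List[List[float]]:
    """Convert counts (rows = true class, cols = predicted class)
    into conditional probabilities P(predicted | true)."""
    probs = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("each true class needs at least one sample")
        probs.append([c / total for c in row])
    return probs


def collapse(counts: List[List[int]], groups: List[int]) -> List[List[int]]:
    """Merge classes into proposition-level groups.
    groups[i] is the index of the group that class i belongs to."""
    n = max(groups) + 1
    merged = [[0] * n for _ in range(n)]
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            merged[groups[i]][groups[j]] += c
    return merged


# Hypothetical counts; class order: pedestrian, obstacle, empty.
counts = [
    [80, 15, 5],
    [10, 85, 5],
    [2, 3, 95],
]

# Class-level view: every confusion between any two classes is an error.
class_probs = row_normalize(counts)

# Proposition-level view for "pedestrian present": classes 1 and 2 both
# mean the proposition is false, so obstacle/empty mix-ups are not errors.
prop_counts = collapse(counts, [0, 1, 1])
prop_probs = row_normalize(prop_counts)

print(prop_counts)
print(prop_probs)
```

In this toy setup, confusing an obstacle with an empty cell no longer counts against the model once the matrix is collapsed, which gives a feel for why a proposition-level matrix can yield less conservative estimates than a class-level one.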

Second, we consider the problem of automatically synthesizing tests for autonomous robotic systems. These systems reason over both discrete variables (e.g., navigate left or right around an obstacle) and continuous variables (e.g., continuous trajectories). This thesis presents a flow-based approach for test environment synthesis which handles discrete variables and is also reactive to the system under test. Reactivity is important to account for uncertainties in system modeling, and to adapt to system behavior without knowledge of the system controller. These tests are synthesized from high-level specifications of desired behavior. Though the problem is shown to be NP-hard, a flow-based mixed-integer linear program formulation is used that scales well to medium-sized examples (e.g., >10,000 integer variables). The test environment can consist of static and reactive obstacles as well as dynamic test agents, whose strategies are synthesized to match the solution of the flow-based optimization. The overview of the approach is as follows. First, principles of automata theory are used to translate the high-level system and test objectives, and the non-deterministic abstraction of the system, into a network flow optimization. The solution of this optimization is then parsed into GR(1) formulas in linear temporal logic. These GR(1) formulas are used to synthesize reactive strategies of a dynamic test agent in a counterexample-guided fashion. We provide guarantees that the synthesized test strategy will realize the desired test behavior under the assumption of a well-designed system, and that the test strategy is reactive and least-restrictive. This framework is illustrated on several simulation and hardware experiments with quadrupeds, showing promise towards a layered approach to test and evaluation.
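
To give a feel for the flow-based view, the sketch below swaps the thesis's mixed-integer linear program for a plain max-flow computation (Edmonds-Karp) on a toy graph. By the max-flow min-cut theorem, the maximum flow over unit-capacity edges equals the smallest number of edges that must be blocked to disconnect the source from the target, so running it on the routes that bypass an intermediate node counts how many edges a test environment would need to block to force the system through that node. The graph, the node names, and the function max_flow are hypothetical and chosen only for illustration.

```python
from collections import deque


def max_flow(edges, source, sink):
    """Edmonds-Karp max flow on a directed graph given as a list of
    (u, v) pairs, each contributing one unit of capacity."""
    cap = {}
    adj = {}
    for u, v in edges:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap.setdefault((v, u), 0)  # residual (reverse) edge
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[(parent[v], v)])
            v = parent[v]

        # Push flow along the path and update residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= bottleneck
            cap[(v, u)] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical abstraction: s = start, t = goal, i = intermediate
# node that the test should force the system to visit.
edges = [("s", "a"), ("s", "b"), ("a", "i"), ("i", "t"), ("a", "t"), ("b", "t")]

# Routes that avoid the intermediate node i.
bypass = [e for e in edges if "i" not in e]

# Minimum number of edges to block so that every s-to-t route visits i.
cuts_needed = max_flow(bypass, "s", "t")
print(cuts_needed)
```

This omits everything that makes the actual problem hard, such as reactivity, the automaton product, and the requirement that the restrictions remain least-restrictive; it only shows the cut-versus-flow duality that a flow-based formulation builds on.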

Item Type:

Thesis (Dissertation (Ph.D.))

Subject Keywords:

Test and Evaluation, Control Theory, Formal Methods, Robotics

Degree Grantor:

California Institute of Technology

Division:

Engineering and Applied Science

Major Option:

Control and Dynamical Systems

Awards:

CMS and IST Gradient for Change Award, 2022.

Thesis Availability:

Public (worldwide access)

Research Advisor(s):

Murray, Richard M.

Thesis Committee:

Ames, Aaron D.
Chandy, K. Mani
Burdick, Joel Wakeman (chair)
Wongpiromsarn, Tichakorn
Murray, Richard M.

Defense Date:

21 May 2024

Funders:

Funding Agency	Grant Number
Air Force Office of Scientific Research (AFOSR)	FA9550-22-1-0333
Air Force Office of Scientific Research (AFOSR)	FA9550-19-1-0302

Record Number:

CaltechTHESIS:06022024-014038700

Persistent URL:

https://resolver.caltech.edu/CaltechTHESIS:06022024-014038700

DOI:

10.7907/e8qz-rd26

Related URLs:

URL	URL Type	Description
https://arxiv.org/abs/2404.09888	arXiv	Article used in Chapters 3 and 4.
https://arxiv.org/pdf/2303.1775	arXiv	Preprint. Part of the article used in Chapter 2.
https://ieeexplore.ieee.org/document/10342465	Publisher	Article used in Chapter 2.
https://ieeexplore.ieee.org/document/10160841	Publisher	Article used in Chapter 4.
https://link.springer.com/chapter/10.1007/978-3-031-33170-1_17	Publisher	Article used in Chapter 5.
https://link.springer.com/chapter/10.1007/978-3-031-06773-0_7	Publisher	Article used in Chapter 5.
https://ieeexplore.ieee.org/document/9683611	Publisher	Article used in Chapter 2.
https://arxiv.org/pdf/2108.05911	arXiv	Article used in Chapter 3.

ORCID:

Author	ORCID
Badithela, Apurva Srinivas	0000-0002-9788-2702

Default Usage Policy:

No commercial reproduction, distribution, display or performance rights in this work are provided.

ID Code:

16465

Collection:

CaltechTHESIS

Deposited By:

Apurva Badithela

Deposited On:

04 Jun 2024 20:55

Last Modified:

12 Jun 2024 22:48

Thesis Files

	PDF (Full Thesis) - Final Version. See Usage Policy. 96MB
	PDF (Chapter 1) - Final Version. See Usage Policy. 12MB
	PDF (Chapter 2) - Final Version. See Usage Policy. 4MB
	PDF (Chapter 3) - Final Version. See Usage Policy. 1MB
	PDF (Chapter 4) - Final Version. See Usage Policy. 82MB
	PDF (Chapter 5) - Final Version. See Usage Policy. 3MB
	PDF (Chapter 6) - Final Version. See Usage Policy. 913kB
