Hi
@JamD,
This is something that piqued my interest, and since I work on reinforcement learning/integral reinforcement learning (RL/IRL), I can shed some light on this. What you have written above is partially true; I will elaborate on where it goes wrong. You can have a look at my papers on arXiv for a detailed discussion of the proofs.
I would like to start off by saying that there is indeed room to implement "some" of the notions of AI in lower-level flight control, like, let's say, attitude control or velocity control, "with guarantees". What I am alluding to is the fact that quite recently (in the last 8-10 years), researchers have been able to port some of the algorithms developed by computer scientists into control theory. What this entails is developing the requisite mathematical framework for those algorithms so that they adhere to Lyapunov stability or extended versions of it. The catch is that instead of asymptotic stability, wherein the error (the difference between the actual and desired value) goes to 0, we get what is known as uniform ultimate boundedness (UUB) stability, where the error only converges to a "neighborhood" of 0. So yes, there are "guarantees" that the algorithm will be stable (although not quite the kind you were expecting)!
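To make that distinction concrete, here is the rough form of both notions (generic notation, not tied to any specific paper):

```latex
% Asymptotic stability of the error e(t) = x(t) - x_d(t):
\lim_{t \to \infty} \lVert e(t) \rVert = 0
% Uniform ultimate boundedness (UUB): there exist a bound b > 0 and a time
% T(b, \lVert e(t_0) \rVert) such that
\lVert e(t) \rVert \le b \quad \text{for all } t \ge t_0 + T
% i.e., the error enters a ball of radius b around the origin and stays there.
% The size of b depends on the NN approximation error and the design gains;
% it can be made small by tuning, but it is not zero in general.
```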
Now, AI in control theory is a very vast world and I do not believe I can describe it all here, but I will give it a try. The most essential element of any AI method here is what is known as a "neural network" (NN). These NNs are nothing but glorified approximators: they approximate "smooth" functions, and in control theory the function we are interested in approximating is usually unknown. Generally, NNs are employed in two very distinct operations in control theory:
(A) System Identification:
(1) With structure, i.e., when we know the dynamics or the differential equation but not the parameters, like, let's say, mass, inertia, or aerodynamic coefficients. In this case, the functions appearing in your dynamics form the regressor vector/functions in the hidden layer of the NN.
(2) Without structure, i.e., when we do not even know the differential equation governing the evolution of the states. In this case, generic basis functions such as RBFs, tanh, the logistic function, etc., are utilized.
In both cases, the main challenge is to come up with parameter update laws for the NN so that you are able to reliably approximate the unknown functions while satisfying some version of Lyapunov stability.
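To make (A)(2) concrete, here is a minimal sketch in Python (my own toy example, not code from any paper): an RBF network identifies an unknown scalar drift f(x) online, and the weights are driven by a gradient-type update law with sigma-modification, which is one standard way of obtaining the UUB-type guarantees mentioned above. The plant, gains, and basis choices are all made up for illustration.

```python
import numpy as np

# Toy online system identification "without structure":
# plant x_dot = f(x) + u with f unknown; the identifier runs
# x_hat_dot = W_hat . phi(x) + u + k*(x - x_hat) and adapts W_hat online.

def f_true(x):                              # the "unknown" drift, used only to simulate the plant
    return -x + 0.5 * np.sin(2.0 * x)

centers = np.linspace(-2.0, 2.0, 11)        # RBF centers (design choice)
width = 0.5                                 # RBF width (design choice)

def phi(x):
    """Hidden-layer regressor: Gaussian RBFs evaluated at the current state."""
    return np.exp(-((x - centers) ** 2) / (2.0 * width ** 2))

dt, T = 1e-3, 20.0
gamma, sigma, k = 20.0, 0.01, 5.0           # adaptation gain, sigma-mod, observer gain
x, x_hat = 1.0, 0.0
W_hat = np.zeros_like(centers)              # NN weight estimates

for step in range(int(T / dt)):
    t = step * dt
    u = np.sin(t)                           # persistently exciting input
    e = x - x_hat                           # identification error

    x_dot = f_true(x) + u
    x_hat_dot = W_hat @ phi(x) + u + k * e

    # Parameter update law: gradient term driven by the identification error,
    # plus sigma-modification to keep the weights bounded (UUB-style robustness).
    W_hat_dot = gamma * e * phi(x) - gamma * sigma * W_hat

    # Forward-Euler integration (crude, but fine for a sketch).
    x, x_hat, W_hat = x + dt * x_dot, x_hat + dt * x_hat_dot, W_hat + dt * W_hat_dot

print("final identification error:", abs(x - x_hat))
print("f(0.5) true vs NN estimate :", f_true(0.5), W_hat @ phi(0.5))
```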
(B) Control: Here, there are again two very different paradigms in which NNs are utilized:
(1) Inversion frameworks, like sliding mode control (SMC), feedback linearization, etc. (Disclaimer: the nonlinear dynamics should be in control-affine form!).
In this case, NNs are utilized to approximate either the unknown dynamics on the go or the equivalent control appearing in the SMC framework. The major drawback of this approach is the complexity involved in making the "approximated version of the control coupling dynamics" invertible! People have come up with some bizarre mathematics to achieve that. For instance, generalized inverses (GIs) are used instead of your normal matrix inversion. But when you use a GI you inherently get some error, because it is not the true inverse, and to compensate for that, people came up with the notion of using Nussbaum gains (to handle the unknown sign of the control coupling matrix) together with a robustifying term that compensates for the errors induced by the GI.
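As a toy illustration of the inversion idea (again my own sketch with made-up scalar dynamics, not anyone's published controller): the controller cancels an NN estimate of the unknown drift and adds error feedback. In the scalar case with a known, non-singular input gain the "inversion" is trivial; the generalized-inverse and Nussbaum-gain machinery is what you reach for when the estimated control coupling matrix can become singular or has unknown sign.

```python
import numpy as np

# Toy NN-based "inversion" (feedback-linearization-style) tracking controller for
#   x_dot = f(x) + u,  f unknown, input gain assumed known and equal to 1.
# Control: u = -W_hat . phi(x) - k*e + x_d_dot  (cancel the NN estimate of f,
# feed back the tracking error, feed forward the reference derivative).

def f_true(x):                              # unknown drift, used only to simulate the plant
    return x ** 2 * np.tanh(x)

centers = np.linspace(-2.0, 2.0, 15)
width = 0.4

def phi(x):
    """Gaussian RBF regressor."""
    return np.exp(-((x - centers) ** 2) / (2.0 * width ** 2))

dt, T = 1e-3, 20.0
k, gamma, sigma = 10.0, 50.0, 0.01          # feedback gain, adaptation gain, sigma-mod
x = 0.5
W_hat = np.zeros_like(centers)

for step in range(int(T / dt)):
    t = step * dt
    x_d, x_d_dot = np.sin(t), np.cos(t)     # reference trajectory and its derivative
    e = x - x_d                             # tracking error

    # Certainty-equivalence control: cancel the estimated drift, then stabilize.
    u = -W_hat @ phi(x) - k * e + x_d_dot

    # Lyapunov-motivated weight update (gradient term + sigma-modification).
    W_hat += dt * (gamma * e * phi(x) - gamma * sigma * W_hat)

    # Plant integration (forward Euler).
    x += dt * (f_true(x) + u)

print("final tracking error:", abs(x - np.sin(T)))
```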
(2) Reinforcement learning (also known as adaptive dynamic programming (ADP) among control theorists!):
This application of NNs does not involve taking inverses of the control coupling matrix. So where does the utility of the NN lie in this case? Well JamD, here you use the NN to approximate the solution of the Hamilton-Jacobi-Bellman (HJB) equation, which is a nonlinear PDE! As usual, the challenge lies in finding "how" to update the NN weights so that your NN is able to faithfully approximate the solution of the HJB equation with reasonable accuracy while satisfying some version of Lyapunov stability (the second part is what makes it all interesting!).
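For concreteness, this is the equation in question for the standard infinite-horizon, control-affine setting (generic notation; the specific papers vary in the details):

```latex
% Dynamics \dot{x} = f(x) + g(x)u and cost J = \int_0^\infty (Q(x) + u^\top R u)\,dt.
% The optimal value function V^*(x) satisfies the HJB equation
0 = Q(x) + \nabla V^{*\top}(x) f(x)
  - \tfrac{1}{4}\, \nabla V^{*\top}(x)\, g(x) R^{-1} g^\top(x)\, \nabla V^*(x),
% and the optimal control is
u^*(x) = -\tfrac{1}{2} R^{-1} g^\top(x)\, \nabla V^*(x).
% A critic NN posits V^*(x) \approx \hat{W}^\top \phi(x); the whole game is
% choosing the update law for \hat{W} so that the HJB residual shrinks while
% the closed loop stays (UUB-)stable.
```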
There are two major ways in which RL is employed: on-policy and off-policy methods.
People have come up with actor-critic, critic-only, and actor-critic-disturbance frameworks to solve either the regulation or the tracking problem for nonlinear systems. Once again, all of these have "guarantees" in terms of the size of the residual set (also known as the UUB set), which the state trajectories eventually enter and stay within.
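To show the actor-critic alternation in its most stripped-down form, here is a model-based, scalar LQR version (Kleinman-style policy iteration; my own illustrative code). Online ADP schemes replace both steps with NN approximations driven by measured data, which is exactly where the UUB-type analysis comes in, but the "evaluate the policy (critic), then improve it (actor)" loop is the same.

```python
import numpy as np

# Actor-critic alternation in its simplest form: Kleinman-style policy iteration
# for a scalar LQR problem.  Plant x_dot = a*x + b*u, cost integrand Q*x^2 + R*u^2,
# policy u = -K*x, value of a policy V(x) = p*x^2.

a, b, Q, R = 1.0, 1.0, 1.0, 1.0

K = 2.0                                     # initial stabilizing policy (needs a - b*K < 0)
for i in range(8):
    # Critic step (policy evaluation): p solves the scalar Lyapunov equation
    # 2*(a - b*K)*p + Q + R*K^2 = 0 for the current policy.
    p = (Q + R * K ** 2) / (-2.0 * (a - b * K))

    # Actor step (policy improvement): greedy policy with respect to the critic.
    K = b * p / R
    print(f"iteration {i}: p = {p:.4f}, K = {K:.4f}")

# For these numbers the Riccati solution is p* = K* = 1 + sqrt(2).
print("analytic optimum:", 1.0 + np.sqrt(2.0))
```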
PS: I was specifically talking about online methods, which means the NNs evolve in real time, on the go, as the aircraft or UAV flies! Also note that in these cases the NNs are used in feedback, and hence they don't require huge amounts of data to train; they are supposed to work with the live stream of sensor data and evolve. The dynamics governing the evolution of the weights is what you might call the "parameter update law".