Please use this identifier to cite or link to this item: http://theses.ncl.ac.uk/jspui/handle/10443/6410
Full metadata record
DC Field                   Value                               Language
dc.contributor.author      Crane, Kirsten Nicole               -
dc.date.accessioned        2025-03-21T09:44:19Z                -
dc.date.available          2025-03-21T09:44:19Z                -
dc.date.issued             2024                                -
dc.identifier.uri          http://hdl.handle.net/10443/6410    -
dc.description             PhD Thesis                          en_US
dc.description.abstract    Using an autonomous underwater vehicle to film marine animals such as dolphins in their natural habitat can greatly aid monitoring, health assessment and animal-behaviour research. Having a vehicle autonomously follow and orient toward a species of interest, without the need for tagging, presents a challenging visual active tracking (VAT) problem using image data from the onboard camera. This thesis investigates the model-free deep reinforcement learning (DRL) algorithm Soft Actor-Critic (SAC) as a potential solution. The utility of this approach is demonstrated in simulation; a follow-up robotics project would then need to integrate the simulation-trained control policy with the real vehicle. DRL was selected because it can support accurate, real-time tracking without needing to model the complexities of a marine environment. In the VAT literature, research divides into end-to-end and task-separated solutions, depending on whether or not the state-estimation and control sub-tasks are jointly optimised. The benefit of joint optimisation is that state estimation can respond to control performance, and control can adapt to imperfect state estimation. The challenge is that this requires a network large enough to learn rich representations, whilst also limiting the number of network parameters because of the difficult credit-assignment problem faced by the DRL agent. This thesis explores an approach to VAT which is end-to-end but alleviates some of this burden by learning the majority of perceptual skills prior to agent training, with a separate model, a variational autoencoder (VAE). Furthermore, the task-relevance of these perceptual skills is ensured through the use of a multi-part loss function fed by three auxiliary tasks of target-state prediction. This approach to a constrained VAE was presented by Bonatti et al. (2020) in the aerial-navigation space, upstream of imitation learning.
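The multi-part loss described above can be sketched as follows. This is a minimal illustration, not the thesis's exact formulation: the mean-squared-error terms, the `beta` KL weight and the `aux_weights` are hypothetical choices standing in for whatever weighting the thesis actually uses.

```python
import numpy as np

def constrained_vae_loss(x, x_recon, mu, logvar, aux_preds, aux_targets,
                         beta=1.0, aux_weights=(1.0, 1.0, 1.0)):
    """Multi-part VAE loss sketch: reconstruction + KL divergence,
    plus three auxiliary target-state prediction terms that constrain
    the latent features to be task-relevant (weights hypothetical)."""
    # Standard VAE terms: pixel reconstruction error and KL to a unit Gaussian.
    recon = np.mean((x - x_recon) ** 2)
    kl = -0.5 * np.mean(1.0 + logvar - mu ** 2 - np.exp(logvar))
    # Auxiliary terms: one per target-state prediction task
    # (e.g. target position, distance, orientation).
    aux = sum(w * np.mean((p - t) ** 2)
              for w, p, t in zip(aux_weights, aux_preds, aux_targets))
    return recon + beta * kl + aux
```

Because the auxiliary terms share the gradient path through the encoder, minimising this joint objective pushes the latent code toward features that are predictive of the target's state, rather than only those useful for pixel reconstruction.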
This thesis extends the approach to DRL and VAT, with a new framework called T2FO (tracking with task-relevant feature observations). T2FO achieves a mean episodic return of 2,057 from a possible 3,000, across 100 inference runs of the trained policy. The framework outperforms three baseline SAC policies trained with raw image observations (1,049), unconstrained VAE features (1,198) and target state predictions from the auxiliary networks (1,987). Neither agent training nor VAE training was possible without first developing a custom environment for the custom problem. This thesis additionally presents three environments developed using the commercial game engine Unity and OpenAI's widely used library Gym: a toy environment CubeTrack, a car environment DonkeyTrack, and an application-focused underwater environment SWiMM DEEPeR. For supplementary videos see https://www.youtube.com/channel/UCA4fgSfe2IctRv5N-Gr0OrQ.en_US
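Environments like those above expose the standard Gym interface (`reset`, `step`, episodic reward). A minimal self-contained sketch of that interface, using a toy 1-D tracking task rather than any of the thesis's actual environments (the class name, dynamics and reward shaping here are all hypothetical):

```python
import numpy as np

class ToyTrackEnv:
    """Gym-style tracking environment sketch (hypothetical, not CubeTrack):
    the agent moves along a line and is rewarded for staying close to a
    randomly drifting target, mirroring the follow-and-orient objective."""

    def __init__(self, episode_len=100, seed=0):
        self.episode_len = episode_len
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.t = 0
        self.agent = 0.0
        self.target = float(self.rng.uniform(-1.0, 1.0))
        return self._obs()

    def _obs(self):
        # A real VAT environment would return an image; here the
        # observation is just the two positions.
        return np.array([self.agent, self.target], dtype=np.float32)

    def step(self, action):
        # Continuous velocity command in [-1, 1], as SAC assumes.
        self.agent += float(np.clip(action, -1.0, 1.0)) * 0.1
        self.target += float(self.rng.normal(0.0, 0.05))  # target drifts
        self.t += 1
        # Dense reward in [0, 1]: higher when the agent is near the target.
        reward = 1.0 - min(abs(self.agent - self.target), 1.0)
        done = self.t >= self.episode_len
        return self._obs(), reward, done, {}
```

With a per-step reward capped at 1.0, a 3,000-step episode (or 1,000 steps at reward up to 3.0) yields the kind of bounded episodic-return scale against which figures like the 2,057-of-3,000 result can be read.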
dc.language.isoenen_US
dc.publisherNewcastle Universityen_US
dc.titleVisual Active Tracking in Simulation with Task-Relevant Features and Deep Reinforcement Learningen_US
dc.typeThesisen_US
Appears in Collections:School of Computing

Files in This Item:
File               Description  Size      Format
CraneKN2024.pdf    Thesis       31.61 MB  Adobe PDF
dspacelicence.pdf  Licence      43.82 kB  Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.