Selected Degree Projects/Dissertations

Below are selected degree projects and dissertations by students who have worked with us in recent years. All rights to the documents, and to the related code and demonstrations linked from this page, are reserved by the original authors.

2023

 

The Development of a 2-Dimensional, Sparse-Sensing, Any-Time and Any-Space SLAM System Using Low-Cost Robotics Platforms
Sergios Gavriilidis
BSc 2023

ABSTRACT CODE

This dissertation presents the development of a sparse-sensing, any-time and any-space Simultaneous Localization and Mapping (SLAM) system for low-cost robotics platforms, using the LEGO Mindstorms EV3 robotics kit. The majority of modern solutions to the SLAM problem utilize dense, accurate laser range-finders, while few works have attempted to address the SLAM problem using the sparse, inaccurate sensors available in low-cost robotics platforms. We further research into the sparse-sensing SLAM problem in the context of low-cost platforms by developing a system that embeds a state-of-the-art sparse-sensing SLAM algorithm on board resource-constrained platforms, in contrast to traditional methods that rely on high-performing modern computers for processing. We are the first to present the successful embedding of the SparseGSLAM algorithm in the context of low-cost robotics, reducing memory requirements by 20% in the process. Following previous work on the calibration of ultrasonic sensors using piecewise linear regression, we demonstrate an 89.86% decrease in the mean square error for the EV3 ultrasonic sensor for horizontally positioned objects. After a qualitative assessment of the produced map and trajectory estimates, we demonstrate the system's ability to produce reliable localization and mapping results despite the use of a low-cost ultrasonic sensor, providing evidence of the algorithm's robustness under different sensor configurations. Finally, being the first work to use SparseGSLAM in an academic setting, we give valuable insight into the successful application of the algorithm in novel sensor and environment configurations, providing extensive documentation of the hyperparameter tuning process, which serves as a guide for future work.
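The piecewise linear regression used for the ultrasonic sensor calibration above can be sketched as follows. This is an illustrative example only, not the dissertation's code; the breakpoint value and the data are hypothetical.

```python
import numpy as np

def fit_piecewise_linear(raw, true, breakpoint):
    """Fit two least-squares line segments (below/above a breakpoint)
    mapping raw ultrasonic readings to ground-truth distances."""
    lo = raw <= breakpoint
    segments = []
    for mask in (lo, ~lo):
        A = np.stack([raw[mask], np.ones(mask.sum())], axis=1)
        coef, *_ = np.linalg.lstsq(A, true[mask], rcond=None)
        segments.append(coef)  # (slope, intercept) for this segment
    return segments

def calibrate(x, segments, breakpoint):
    """Apply the fitted segment for a single raw reading."""
    slope, intercept = segments[0] if x <= breakpoint else segments[1]
    return slope * x + intercept
```

A calibration of this shape reduces the mean square error whenever the sensor's response is approximately linear on each side of the breakpoint.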

 

An Evaluation of Semantic Segmentation Techniques in an Indoor Scene
Luke Cummins
BSc 2023

ABSTRACT

Semantic segmentation is the task of assigning a pixel-level classification to an image, denoting the semantic class each pixel belongs to. While this field has been dominated by Convolutional Neural Networks (CNNs), recent advancements have ushered in a new era of transformer-based approaches that demonstrate state-of-the-art performance. This dissertation presents an evaluation of the current transformer- and CNN-based models SegNeXt, Swin, and the Vision Transformer (ViT) on a subset of the InteriorNet dataset. Drawing on previous research in the field, this study conducts both qualitative and quantitative analyses of the results of the evaluation and provides recommendations outlining the strengths and limitations of these models, critically assessing the suitability of each model for tasks within the indoor scene. Additionally, this study contributes to the academic literature on semantic segmentation by conducting a literature survey that outlines the segmentation problem and its use cases within indoor scenes. The survey then discusses the limitations of both transformer- and CNN-based approaches, and evaluates the metrics used to examine semantic segmentation models. Furthermore, this paper proposes a novel contribution to the semantic segmentation field in the form of SwinBurger, a hybrid of the Swin Transformer and the Hamburger decode head used in the SegNeXt model. An evaluation of this model demonstrates its across-the-board improvement, offering an increase of 1.8 mIoU and a 19% reduction in training duration across a 50,000-iteration fine-tuning period. Both qualitative and quantitative analyses were used to assess the performance of the SwinBurger architecture. Finally, this study presents a set of research questions for future investigation into the SwinBurger model, with the aim of achieving a greater understanding of the currently popular transformer-based methods.
Overall, this dissertation advances the existing knowledge of semantic segmentation models and proposes a novel hybrid approach with promising results for future research.
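For context on the mIoU metric reported above, here is a minimal, self-contained sketch of how mean Intersection-over-Union is computed from predicted and ground-truth label maps. It is illustrative only; the dissertation's evaluation used standard segmentation tooling.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union from flat integer label arrays."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, t in zip(pred.ravel(), target.ravel()):
        conf[t, p] += 1  # rows: ground truth, cols: prediction
    inter = np.diag(conf).astype(float)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    present = union > 0  # ignore classes absent from both maps
    return (inter[present] / union[present]).mean()
```

Per-class IoU divides correctly labelled pixels by all pixels that either map assigns to the class, so mIoU penalises both false positives and false negatives.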

2022

 

Fine-scale Synthetic Shape Generation with Upsampling Network
Jihoon Kim
MSc 2021-2022

ABSTRACT

Thanks to the introduction of deep learning-based shape generation methods, it is now possible to generate completely new, creative shapes based on existing shape models. However, due to limited computational resources and the difficulty of training generative methods at high dimensions and resolutions, synthesising a good-quality fine-scale 3D shape remains challenging. This project aims to enable fine-scale synthetic 3D shape generation by proposing a novel fine-scale shape generation framework consisting of a low-resolution 3D-GAN and a 3D up-convolution-based upsampling method. The main contribution of the project is that the proposed framework allows the generation of 3D shape samples of synthetic quality comparable to a native high-resolution 3D-GAN while being easier to train and requiring less computational power. It was shown that upsampling low-resolution synthetic shapes generated by a 3D-GAN is a viable way to produce fine-scale synthetic shape samples, and that it is especially effective for relatively small objects (e.g., chairs, sofas or lamps) compared with large objects (e.g., cars or airplanes).
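The project's upsampling stage is a learned 3D up-convolution; as a rough stand-in, the sketch below shows the simplest possible voxel upsampling (nearest-neighbour repetition), which illustrates the resolution change but none of the learned detail synthesis.

```python
import numpy as np

def upsample_voxels(grid, factor=2):
    """Nearest-neighbour upsampling of a 3D occupancy grid: each voxel
    becomes a factor**3 block at the higher resolution."""
    return grid.repeat(factor, axis=0).repeat(factor, axis=1).repeat(factor, axis=2)
```

A learned up-convolution replaces this fixed block copy with trainable filters, which is what lets the framework add plausible fine-scale detail rather than merely enlarging the grid.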

 

Detection and Re-identification System for Luggage and Person in the Security Checkpoint Based on YOLOv5+StrongSORT Framework
Junyi Ye
MSc 2021-2022

ABSTRACT

In public safety, luggage recognition has high market value. To address the issue of non-fisheye cameras capturing continuously passing targets, this paper proposes a statistical system for baggage weight recognition that includes baggage-person detection, tracking, and matching, with the You-Only-Look-Once-XLarge (YOLOv5-X) network serving as a benchmark for baggage-person detection. The proposed StrongSORT algorithm with a shunt-sifter module (StrongSORT-LP) tracks baggage and persons. To improve detection efficiency and accuracy even further, Focal-and-Efficient-Intersection-over-Union (EIoU) is used as the loss function of YOLOv5-X. On the COCO dataset, the proposed YOLOv5-H achieves a mAP of 68.9%. The statistical system has a low error rate of 4.9% on user-supplied data while maintaining a frame rate of 0.6 FPS, allowing it to meet the requirements of offline surveillance data analysis, and it has promising application prospects.
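Both the detector's loss and the tracker's association depend on bounding-box overlap. The sketch below computes plain Intersection-over-Union between two boxes; EIoU extends this with centre-distance and width/height penalty terms. It is an illustrative helper, not the paper's implementation.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0
```

In a tracking-by-detection pipeline, a matrix of such scores between detections and track predictions drives the frame-to-frame association step.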

 

The Development & Evaluation of a Reinforcement Learning Agent Capable of Playing Monopoly
Freddie Jonas
BSc 2022

ABSTRACT

Reinforcement learning (RL) is an area of machine learning where agents learn how to optimise action choices in order to maximise a reward function. It is a relatively new area and has shown great success when applied to both simple and complex problem environments, including robot control (Zhang & Mo, 2021) and computer/board games (Kaur & Gourav, 2020). This project aims to create an RL agent and compare its performance against human players. Secondary aims include researching whether it is possible to implement agents using consumer-grade hardware, to evaluate the need for large high-powered computing systems in the creation of RL agents; evaluating the performance of the RL agent against other baseline agents; and investigating the agent's behaviour in the environment to assess whether new strategies and tactics are found. The dissertation successfully implemented a working RL agent utilising the Monte Carlo tree search algorithm in a Monopoly board game environment. The collected results show that the agent outperforms human players in the implemented environment, although due to the environment's simplicity, further research is required in more complex domains. An analysis of the agent's behaviour and performance against baseline agents is provided, with the agent demonstrating intelligent action choice. The dissertation concludes by detailing the achievements and evaluating the impact of the project's limitations. Recommendations on how to address these limitations are given, and topics for future work are described.
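At the core of Monte Carlo tree search is the UCT rule for choosing which child node to explore next. A minimal sketch follows; the exploration constant `c` shown is a common default, not necessarily the dissertation's setting.

```python
import math

def uct_score(total_value, visits, parent_visits, c=1.414):
    """UCT: average value (exploitation) plus an exploration bonus
    that shrinks as a child is visited more often."""
    if visits == 0:
        return float("inf")  # always try unvisited children first
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)
```

Selection descends the tree by repeatedly taking the child with the highest UCT score, which balances exploiting strong moves against exploring rarely tried ones.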

2021

 

Effect of occlusion filling in synthesised training data for stereo matching network
Jie He
MComp 2021

ABSTRACT CODE

In this paper, we propose a lightweight stereo matching network to solve the disparity estimation problem from a pair of stereo images. Our network runs at 45-60 frames per second on HD-resolution stereo images with a low GPU memory consumption of 0.8 GB. We evaluated the network on the KITTI and Middlebury datasets and achieved mean absolute errors of 1.85px and 6.02px per pixel respectively. Moreover, we adopted a stereo image generation pipeline that produces a stereo image given a single image and its relative depth map. Through modifications, we deployed two additional methods of occlusion filling on the synthesised stereo image: the first is a classical interpolation-based inpainting method and the second is a generative inpainting network. From the results of our tests, we discovered that any attempt to recreate the image texture in the occlusion region helps the training process of the stereo network. Our stereo network converged faster, and the error rates over the training stages fluctuated less, when using our proposed occlusion inpainting methods to treat the synthesised training data.
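The per-pixel error metric reported above can be sketched as follows, masking out pixels with no ground-truth disparity. This is an illustrative helper, not the paper's evaluation code.

```python
import numpy as np

def disparity_mae(pred, gt, valid=None):
    """Mean absolute disparity error (pixels) over valid ground truth;
    by default, zero-disparity pixels are treated as missing."""
    if valid is None:
        valid = gt > 0
    return float(np.abs(pred - gt)[valid].mean())
```

Masking matters because datasets such as KITTI have sparse ground truth, so unlabelled pixels must not be averaged into the error.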

 

End-to-end Autonomous Racing using Deep Reinforcement Learning
Marios Pastos
MSc 2020-2021

ABSTRACT

With the recent progress in the domain of Deep Reinforcement Learning (DRL), new opportunities emerge for solving complex sequential decision-making tasks. This work presents an end-to-end DRL-based approach for autonomous driving control that maps observations straight into decisions, applied to a racing car simulation environment (TORCS). A series of experiments is conducted by implementing and applying the Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC) and Proximal Policy Optimisation (PPO) DRL algorithms. Two different autonomous control scenarios are investigated. In the first scenario, agents are evaluated on the task of autonomous steering control. In the second scenario, the agents are given full control of the vehicle (steering, throttle, and brakes). The results demonstrate that all three DRL agents are able to learn effective control policies in both scenarios. To improve performance, reward shaping and hyperparameter tuning experiments are performed. Reward shaping yields improved lap times and driving stability, while hyperparameter tuning produces mixed results in terms of cumulative reward per episode. Finally, the SAC agent exhibited generalisation abilities on 'unseen' race tracks.
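The reward shaping mentioned above can be illustrated with a commonly used TORCS-style shaped reward that rewards longitudinal progress and penalises misalignment and drift from the track centre. The exact terms and weights here are illustrative assumptions, not the dissertation's reward function.

```python
import math

def shaped_reward(speed, angle, track_pos):
    """Shaped reward: forward speed along the track axis, minus the
    lateral speed component and the offset from the centre line."""
    return speed * (math.cos(angle) - abs(math.sin(angle)) - abs(track_pos))
```

Shaping of this kind gives the agent a dense learning signal every step, rather than a sparse reward only at lap completion.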

 

Simulation of UAV guided SLAM and Autonomous Racing System
Yu Zheng
MSc 2020-2021

ABSTRACT

In many cases, autonomous racing cars handle perception and SLAM using a fusion of on-board cameras and LIDAR, but this approach has shortcomings it cannot overcome. This project explores a different idea: the robot runs under the guidance of a drone (UAV) overhead, which provides a different vantage point to increase efficiency. The UAV offers a much wider field of view and a more flexible, simpler way to handle the SLAM and path planning algorithms. Beyond the algorithms, as a complete simulation system, this project provides all of the components, including Unreal Engine 4 environment development, a Windows application, an SQL database, OpenGL, OpenCV, Darknet, and others. The key point of this dissertation is the new idea of using a UAV as a scout, handling the SLAM and path planning algorithms ourselves, and completing the task successfully.

 

From Data to Prediction: a Real-time 3D Traffic Cone Detection System for High-speed Vehicle
Zilin Zhang
MSc 2020-2021

ABSTRACT

For high-speed vehicles, obtaining the position and pose of target objects accurately and quickly is a challenging task, since the vehicle may have travelled several metres forward while environmental data are being obtained and processed. Meanwhile, poor accuracy degrades the whole system's performance and may cause an accident. Therefore, an accurate and responsive perception system is vital for a robust and safe autonomous driving system. Scenes in autonomous driving are variable, with a wide range of objects to identify and localise, such as pedestrians, vehicles, and traffic lights. Among these many target objects, traffic cones are often used in road closures, construction, and redirection, situations in which vehicles are usually moving at high speed. It is therefore essential for vehicles to locate traffic cones on the road accurately and quickly. However, to the best of our knowledge, little research addresses this subtask, and existing work has the following shortcomings: i) a lack of public datasets for cone detection; for example, KITTI, a popular public dataset in autonomous driving, does not have a separate category for traffic cones. ii) Some methods use prior information, such as the shape of a specific traffic cone, making it difficult to apply the same method in different scenes because cones have different shapes. iii) Existing methods recognise all cones as a single category and cannot further distinguish between different shapes and colours, even though different colours may represent different meanings in some scenes, such as red for stop and green for start. To fill these gaps, this dissertation proposes i) a data generator based on AirSim to generate KITTI-like data for cone detection; ii) a pipeline based on CenterPoint to detect traffic cones; and iii) a fusion method using images and pseudo-LiDAR to further distinguish different traffic cones in one frame.

2020

 

Perception Module for Autonomous Formula Student Vehicle
Philip A. Lorimer
MSc 2019-2020

ABSTRACT

A perception module was required for a Formula Student vehicle, so a solution to this problem was explored. For a successful solution, the method aimed to utilise multiple sensing channels to detect and classify objects typically found on the racing field. To address this, a range of sensing channels and previously used approaches for object detection were explored. A method was proposed that utilises two sensing channels, vision and LiDAR. The vision channel samples images from a monocular camera, and experiments were conducted comparing the YOLOv4 detection network against colour-based computer vision image processing techniques. The LiDAR channel utilised a Velodyne VLP-16 LiDAR to detect objects in three-dimensional space; the sampled data was processed to isolate the objects within the point cloud. The experiments compared the resulting point clouds obtained by adjusting the parameters for up-sampling via statistical outlier filtering and voxel grid dilation. The results demonstrate, to a certain degree, the module's ability to fulfil the basic requirements; however, limitations are discussed, suggesting further improvements to the methods. Further work is proposed, suggesting the exploration of more complex methods and adaptive approaches for the parametric methods.
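The voxel-grid processing in the LiDAR channel can be sketched with a minimal centroid-per-voxel downsampling step. This is illustrative only; the module described above additionally used statistical outlier filtering and voxel grid dilation.

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Reduce an (N, 3) point cloud to one centroid per occupied voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # flatten for NumPy version consistency
    counts = np.bincount(inverse).astype(float)
    out = np.zeros((len(counts), 3))
    for d in range(3):
        out[:, d] = np.bincount(inverse, weights=points[:, d]) / counts
    return out
```

Bucketing points into voxels and averaging keeps the cloud's shape while bounding its density, which keeps downstream clustering tractable in real time.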

 

Weakly Supervised Semantic Segmentation based on Principal Prototype Features Discovering
Jiawei Feng
MSc 2019-2020

ABSTRACT

Weakly supervised semantic segmentation methods rely on image-level labels to generate proxy segmentation masks, and further train the segmentation network on these masks with various constraints. However, the performance of current segmentation methods is often limited by the quality of the proxy annotations generated by classification-based localization maps. In order to produce high-quality proxy annotations, in this paper we investigate two novel mechanisms: (1) generating high-quality proxy annotations via an unsupervised principal prototype features discovering (PPFD) strategy, and (2) designing an effective annotation selection and refinement (ASR) strategy for improving annotation quality. PPFD localizes semantic pixels by means of principal feature analysis on support images. The resulting attention maps are used to generate proxy annotations for weakly supervised segmentation. In addition, ASR is employed to recognize and refine low-quality proxy annotations via self-judgement-based mask scoring and discriminative-region mining. The entire framework, called PPFD+ASR, significantly advances the state of the art on the test set of the PASCAL VOC 2012 segmentation benchmark, improving OOA by 3.6% and 3.3% (66.4% vs 62.8% and 69.7% vs 66.4%) with VGG16 and ResNet101 as the respective baselines.

 

A More Robust Loop Closure Algorithm Based on Learning in Direct SLAM
Sa Wu
MSc 2019-2020

ABSTRACT CODE

Closed-loop detection is an essential part of a SLAM system and plays a significant role in eliminating accumulated error. In this paper, to address some closed-loop detection problems in small scenes in the LSD-SLAM system, the most representative direct method, we introduce a closed-loop detection algorithm based on deep learning. This more robust algorithm is an unsupervised neural-network approach whose trained model can accurately extract features of appearance change directly from the raw image, without extracting feature points or training for specific environments. We take advantage of these properties by integrating the algorithm into the LSD-SLAM system. We then compare the similarity between frames, and the results show that the higher the similarity, the more likely it is that a loop exists. Compared with the original system, our method shows higher accuracy on a real-time small-scene test dataset.
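The frame-similarity comparison described above reduces, in its simplest form, to a cosine similarity between whole-image descriptors. A minimal sketch; in the system above, the descriptors come from the trained network, not from raw pixels.

```python
import numpy as np

def loop_closure_score(feat_a, feat_b):
    """Cosine similarity between two image descriptors; a higher score
    means the two frames are more likely to close a loop."""
    a = np.asarray(feat_a, dtype=float)
    b = np.asarray(feat_b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Candidate loop closures are the frame pairs whose score exceeds a threshold; geometric verification then confirms or rejects them before the pose graph is corrected.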

 

Racing Line Optimisation for Autonomous Vehicles
Joe Davison
BSc 2020

ABSTRACT PROJECT PDF CODE

This project develops a method for generating racing lines that aims to minimise both lap time and run time. By reducing run time, the use of such methods in a 'live' path planning environment becomes feasible. On-track performance and time to compute a path are often competing objectives, so there is a particular focus on reducing run time for a small trade-off in lap time. Existing methods in path planning and vehicle dynamics have been adapted for a specific path planning problem. In particular, a compromise between minimising path length and path curvature is used to optimise a racing line. This working model is then used to investigate two key improvements to the method. A sector-based approach was developed, in which paths through individual corner sequences are optimised and later merged. By performing the individual optimisations in parallel, a 50% reduction in run time was achieved. Due to the merging of adjacent sector paths, the full racing line saw lap times increase by up to 3%. The compromise between path length and curvature included a secondary optimisation of the weight between these two metrics. Significant improvements in run time were made by instead modelling track properties to estimate a well-performing compromise weight. This reduced the cost of the optimisation to that of a basic minimum-curvature problem, but with lap times closer to those obtained with an optimal compromise than those of a minimum-curvature path.
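The length-curvature compromise at the heart of the method can be sketched as a weighted cost over a discrete path. The discretisation below (second differences as a curvature proxy) is an illustrative assumption, not the project's exact formulation.

```python
import numpy as np

def path_cost(path, weight):
    """Compromise cost for an (N, 2) path: weight * total length plus
    (1 - weight) * summed squared second differences (curvature proxy)."""
    segments = np.diff(path, axis=0)
    length = np.linalg.norm(segments, axis=1).sum()
    curvature = (np.diff(path, n=2, axis=0) ** 2).sum()
    return weight * length + (1 - weight) * curvature
```

Sweeping `weight` between 0 and 1 trades the shortest path against the smoothest one; the project's contribution is estimating a good weight cheaply from track properties instead of optimising it.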

 

Sensor-based Cone Classification and Position Estimation with Machine Learning
Jie He
BSc 2020

ABSTRACT

The award-winning Team Bath Racing Electric - AI (TBRe-AI) is seeking contributions to the autonomous pipeline of its electric vehicle ahead of the summer FSUK 2020 competition. To that end, this dissertation focuses on exploring state-of-the-art methods to produce the Perception Pipeline, which aims to detect landmark features such as traffic cones and to compute an estimated location for each detected cone. Given the nature of the problem, development of the pipeline centres on the required hardware: a stereo camera and a LiDAR. This dissertation explains the capabilities of each piece of hardware in order to formalise a design for the perception pipeline, followed by the implementation, testing and evaluation of the proposed system.

2019

 

Design of a Testing Platform and Localisation and Mapping System for the Formula Student Artificial Intelligence Competition
Donal McLaughlin
MComp 2019

ABSTRACT

The aims of this project are to design a testing platform for the Formula Student Artificial Intelligence event and use this platform to examine, test and evaluate the potential approaches which could be taken to perform accurate localisation and mapping in the competition. The contributions of this project are two-fold. Firstly, it presents a low-cost, customisable testing platform - based on a Husarion ROSbot and an NVIDIA Jetson Nano - which can be used to test the systems required for the FS-AI event. Secondly, it presents a simultaneous localisation and mapping (SLAM) framework that is suitable for the FS-AI event. The platform was tested in an environment as similar to the FS-AI environment as possible to ensure valid results. This project's work will be used as the building blocks for the final car designed to compete in the FS-AI 2019 event. It is also hoped that it will be used by members of TBRe in the future, as both an introduction to the competition and as a guide for the software and hardware tools they may require.

 

Optimising Facial Information Extraction and Processing using CNNs
Eklavya Sarkar
MSc 2018-2019

ABSTRACT


 

Investigating The Relationship Between Shot Distance & Shot Mechanics In Basketball
Lavy Friedman
MSc 2018-2019

ABSTRACT


 

A Simple Explainable Learning Framework for Indoor Scene Understanding
Ningchao Wang
MSc 2018-2019

ABSTRACT

Recent years have produced many remarkable large deep convolutional neural networks, such as AlexNet, GoogLeNet, and ResNet, which are able to detect objects in natural images. However, we still have limited intuition about how these networks work, even though notable visualization tools, such as deconvolution networks (2013) and saliency maps (2014), have lifted part of the veil on these deep neural networks (DNNs). In this paper, combining these visualization methods, we introduce a novel, simple explainable learning framework that provides straightforward explanations of how our networks arrive at their classification results. We apply our learning framework to the InteriorNet dataset to classify interior decoration styles, which are more abstract than the natural object categories detected in the ImageNet dataset. We also present how our framework has advantages over AlexNet in extracting specific features in this style classification problem.