Below are some representative degree projects/dissertations by students who have worked with us in recent years. All rights to the documents and the related code and demonstrations on this page are reserved by the original authors.
Suthiwat Umpornpaiboon BSc 2024 |
Simultaneous Localisation and Mapping (SLAM) is a fundamental technique in robotics that enables autonomous robots to explore and map unknown environments. While state-of-the-art SLAM methods often rely on advanced sensors like LiDAR and cameras, this project investigates the use of ultrasonic sensors for 2D SLAM in the context of educational robotics using the LEGO Mindstorms EV3 platform. The research addresses the challenges associated with ultrasonic-based SLAM, such as multipath issues, measurements at angled walls, and odometry drift, and proposes a robust and accurate SLAM system suitable for hands-on learning and experimentation.
The developed SLAM system combines frontier-based exploration, scan-matching localisation, and occupancy grid mapping to enable autonomous exploration and mapping of unknown environments. The experimental results demonstrate the effectiveness of the proposed approach, with the robot successfully mapping a kitchen environment and capturing the general layout and shape of the room. The study highlights the potential of ultrasonic sensors as a cost-effective and accessible solution for SLAM in educational and research robotics while also revealing limitations such as the impact of multipath issues on mapping accuracy and the constraints of relying on a single sensor modality.
This research contributes to the field of educational and research robotics by presenting a comprehensive study of ultrasonic-based 2D SLAM using the LEGO Mindstorms EV3 platform. The findings and insights gained from this work can serve as a foundation for further exploration and advancements in the field, ultimately promoting accessible and engaging ways to learn about and implement SLAM technologies.
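As a rough illustration of the occupancy-grid side of such a system, the sketch below shows the standard log-odds cell update used to fuse repeated ultrasonic readings. It is a minimal sketch, not the dissertation's code; the inverse-sensor-model constants are assumed illustrative values.

```python
import math

# Inverse sensor model in log-odds form; these constants are assumed
# illustrative values, not the dissertation's tuned parameters.
L_OCC = math.log(0.7 / 0.3)   # increment when a cell is observed occupied
L_FREE = math.log(0.3 / 0.7)  # decrement when a cell is observed free

def update_cell(log_odds, hit):
    """Fuse one ultrasonic observation into a single grid cell."""
    return log_odds + (L_OCC if hit else L_FREE)

def occupancy_probability(log_odds):
    """Convert accumulated log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))

# An unknown cell (p = 0.5) grows confident after three consistent 'hit' readings
l = 0.0
for _ in range(3):
    l = update_cell(l, hit=True)
p = occupancy_probability(l)
```

Working in log-odds keeps each update a single addition, which suits a resource-constrained platform like the EV3.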
Hannan Shah MSc 2023-2024 |
Sergios Gavriilidis BSc 2023 |
This dissertation presents the development of a sparse-sensing, any-time and any-space Simultaneous Localization and Mapping (SLAM) system for low-cost robotics platforms, using the LEGO Mindstorms EV3 robotics kit. The majority of modern solutions to the SLAM problem utilize dense, accurate laser range-finders, while few works have attempted to address the SLAM problem using sparse, inaccurate sensors that are available in low-cost robotics platforms. We further the research into the sparse-sensing SLAM problem in the context of low-cost platforms, by developing a system that embeds a state-of-the-art sparse-sensing SLAM algorithm on-board resource-constrained platforms, in contrast to traditional methods which rely on high-performing modern computers for processing. We are the first to present the successful embedding of the SparseGSLAM algorithm in the context of low-cost robotics, reducing memory requirements by 20% in the process. Following previous work in the calibration of ultrasonic sensors using piecewise linear regression, we demonstrate an 89.86% decrease in the mean square error for the EV3 ultrasonic sensor for horizontally positioned objects. After a qualitative assessment of the produced map and trajectory estimates, we demonstrate the system’s ability to produce reliable localization and mapping results, despite the use of a low-cost ultrasonic sensor, providing evidence of the algorithm’s robustness under different sensor configurations. Finally, being the first work to use SparseGSLAM in an academic setting, we give valuable insight into the successful application of the algorithm in novel sensor and environment configurations, providing extensive documentation into the hyperparameter tuning process which serves as a guide for future work.
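The piecewise-linear calibration idea used for the ultrasonic sensor can be sketched as follows. The breakpoint and segment coefficients here are invented for illustration and are not the parameters fitted for the EV3 sensor in the dissertation.

```python
# Two-segment piecewise-linear calibration for a range sensor. The
# breakpoint and coefficients are made-up illustrative values, not the
# parameters fitted for the EV3 ultrasonic sensor in the dissertation.
SEGMENTS = [
    # (upper_bound_cm, slope, intercept): corrected = slope * raw + intercept
    (50.0, 1.02, -0.8),          # near range: small offset correction
    (float("inf"), 0.97, 1.5),   # far range: mild compression correction
]

def calibrate(raw_cm):
    """Map a raw distance reading onto a corrected estimate by picking the
    linear segment whose range contains the reading."""
    for upper, slope, intercept in SEGMENTS:
        if raw_cm <= upper:
            return slope * raw_cm + intercept
    raise ValueError("no segment matched")

near = calibrate(30.0)    # 1.02 * 30 - 0.8
far = calibrate(100.0)    # 0.97 * 100 + 1.5
```

In practice each segment's slope and intercept would be fitted by least squares against ground-truth distances, one fit per segment.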
Luke Cummins BSc 2023 |
Semantic segmentation is the task of assigning a pixel-level classification to an image, denoting the semantic class each pixel belongs to. While this field has been dominated by Convolutional Neural Networks (CNNs), recent advancements have ushered in a new era of transformer-based approaches that demonstrate state-of-the-art performance. This dissertation presents an evaluation of current transformer- and CNN-based models (SegNeXt, Swin, and the Vision Transformer, ViT) on a subset of the InteriorNet dataset. Drawing on previous research in the field, this study conducts both qualitative and quantitative analyses of the evaluation results and provides recommendations outlining the strengths and limitations of these models, critically assessing the suitability of each for tasks within indoor scenes. Additionally, this study contributes to the academic literature on semantic segmentation with a literature survey that outlines the segmentation problem and its use cases within indoor scenes. The survey then discusses the limitations of both transformer- and CNN-based approaches and evaluates the metrics used to examine semantic segmentation models. Furthermore, this paper proposes a novel contribution to the semantic segmentation field in the form of SwinBurger, a hybrid of the Swin Transformer and the Hamburger decoder head used in the SegNeXt model. An evaluation of this model demonstrates across-the-board improvement, offering an increase of 1.8 mIoU and a 19% reduction in training duration across a 50,000-iteration fine-tuning period. Both qualitative and quantitative analyses were used to assess the performance of the SwinBurger architecture. Finally, this study presents a set of research questions for future investigation into the SwinBurger model, with the aim of achieving a greater understanding of currently popular transformer-based methods.
Overall, this dissertation advances the existing knowledge of semantic segmentation models and proposes a novel hybrid approach with promising results for future research.
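For readers unfamiliar with the mIoU figures quoted above, the sketch below shows how mean Intersection-over-Union is computed from a per-class confusion matrix; the two-class matrix is a toy example, not drawn from the dissertation's results.

```python
def mean_iou(conf):
    """Mean Intersection-over-Union from a confusion matrix, where
    conf[i][j] counts pixels of ground-truth class i predicted as class j.
    IoU_c = TP_c / (TP_c + FP_c + FN_c); mIoU averages over classes."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # missed pixels of class c
        fp = sum(conf[r][c] for r in range(n)) - tp  # pixels wrongly called c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)

# Two-class toy confusion matrix
toy = [[8, 2],
       [1, 9]]
miou = mean_iou(toy)
```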
Jihoon Kim MSc 2021-2022 |
Thanks to the introduction of deep learning-based shape generation methods, it is now possible to generate completely new, creative shapes based on existing shape models. However, due to limited computational resources and the difficulty of training generative methods at high dimensions and resolutions, synthesising a good-quality fine-scale 3D shape remains challenging. This project aims to enable fine-scale synthetic 3D shape generation by proposing a novel fine-scale shape generation framework consisting of a low-resolution 3D-GAN and a 3D up-convolution-based upsampling method. The main contribution of the project is that the proposed framework can generate 3D shape samples of quality comparable to a native high-resolution 3D-GAN while being easier to train and requiring less computational power. It was shown that upsampling low-resolution synthetic shapes generated by the 3D-GAN is a viable way to produce fine-scale synthetic shape samples, and that it is more effective for relatively small objects (e.g., chairs, sofas, or lamps) than for large objects (e.g., cars or airplanes).
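As a point of reference for the upsampling stage, the sketch below shows the fixed nearest-neighbour voxel upsampling that a learned 3D up-convolution improves upon. The function name and nested-list grid format are illustrative assumptions, not the project's implementation.

```python
def upsample_nn(vox, factor=2):
    """Nearest-neighbour upsampling of a cubic voxel grid stored as nested
    lists. The project's learned 3D up-convolution replaces this fixed rule
    with trained filters; this is only the naive baseline for comparison."""
    n = len(vox) * factor
    return [[[vox[i // factor][j // factor][k // factor]
              for k in range(n)]
             for j in range(n)]
            for i in range(n)]

# A 2x2x2 occupancy grid upsampled to 4x4x4
small = [[[1, 0], [0, 1]],
         [[0, 0], [1, 1]]]
big = upsample_nn(small)
```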
Junyi Ye MSc 2021-2022 |
In public safety, luggage recognition has high market value. To address the issue of non-fisheye cameras capturing continuously passing targets, this paper proposes a statistical system for baggage weight recognition that includes baggage-person detection, tracking, and matching, with the You-Only-Look-Once-XLarge (YOLOv5-X) network serving as the benchmark for baggage-person detection. The proposed StrongSORT algorithm with a shunt-sifter module (StrongSORT-LP) tracks baggage and people. To improve detection efficiency and accuracy even further, Focal-and-Efficient-Intersection-over-Union (EIoU) is used as the loss function of YOLOv5-X. On the COCO dataset, the proposed YOLOv5-X achieves an mAP of 68.9%. The statistical system has a low error rate of 4.9% on user-supplied data while maintaining a frame rate of 0.6 FPS, allowing it to meet the requirements of offline surveillance data analysis, and it has promising application prospects.
Freddie Jonas BSc 2022 |
Reinforcement learning (RL) is an area of machine learning where agents learn to optimise action choices in order to maximise a reward function. It is a relatively new area and has shown great success when applied to both simple and complex problem environments, including robot control (Zhang & Mo, 2021) and computer/board games (Kaur & Gourav, 2020). This project aims to create an RL agent and compare its performance against human players. Secondary aims include researching whether it is possible to implement agents using consumer-grade hardware, to evaluate the need for large high-powered computing systems in the creation of RL agents; evaluating the performance of the RL agent against other baseline agents; and investigating the agent's behaviour in the environment to assess whether new strategies and tactics are found. The dissertation successfully implemented a working RL agent utilising the Monte Carlo tree search algorithm in a Monopoly board game environment. The collected results show that the agent outperforms human players in the implemented environment, although due to the environment's simplicity, further research is required in more complex domains. An analysis of the agent's behaviour and performance against baseline agents is provided, with the agent demonstrating intelligent action choice. The dissertation concludes by detailing the achievements and evaluating the impact of the project's limitations. Recommendations on how to address these limitations are given, and topics for future work are described.
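The selection rule at the heart of Monte Carlo tree search can be sketched as below. This is a generic UCB1-based skeleton, not the dissertation's Monopoly implementation; the class names and exploration constant are assumptions.

```python
import math

class Node:
    """One state in the search tree."""
    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}      # action -> Node
        self.visits = 0
        self.total_reward = 0.0

def ucb1(child, parent_visits, c=1.41):
    """Upper Confidence Bound: average reward plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")     # always try unvisited actions first
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def select(node):
    """Selection phase: descend via the highest-UCB child until a leaf."""
    while node.children:
        node = max(node.children.values(),
                   key=lambda ch: ucb1(ch, node.visits))
    return node

# Tiny example: a root with two explored actions
root = Node()
root.visits = 12
for action, (v, r) in {"buy": (6, 4.0), "pass": (5, 1.0)}.items():
    child = Node(root)
    child.visits, child.total_reward = v, r
    root.children[action] = child
best = select(root)
```

The full algorithm alternates this selection with expansion, random rollout, and backpropagation of the rollout reward up the tree.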
Jie He MComp 2021 |
In this paper, we proposed a lightweight stereo matching network to solve the disparity estimation problem from a pair of stereo images. Our network design was able to run at 45-60 frames per second on average on HD-resolution stereo images with a low GPU memory consumption of 0.8 GB. We evaluated the network performance on the KITTI and Middlebury datasets and achieved a mean absolute error per pixel of 1.85px and 6.02px respectively. Moreover, we adopted a stereo image generation pipeline that produces a stereo image given a single image and its relative depth map. Through modifications, we deployed two additional methods of occlusion filling on the synthesised stereo image: the first is a classical interpolation method, and the second is a generative inpainting network. From the results of our tests, we discovered that any attempt to recreate the image texture in the occlusion region helps the training process of the stereo network. We found that our stereo network converged faster, and the error rates over the training stages fluctuated less, when using our proposed occlusion inpainting methods to treat the synthesised training data.
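A toy stand-in for the classical interpolation-based occlusion filling might look like the following single-scanline routine; the hole-marker convention and function name are assumptions, not the paper's code.

```python
def fill_occlusion_row(row, hole=-1):
    """Fill occluded pixels (marked with `hole`) on one scanline by linear
    interpolation between the nearest valid neighbours; runs that touch the
    image border are clamped to the single valid side. The marker value and
    function name are assumptions for this sketch."""
    out = list(row)
    n = len(out)
    i = 0
    while i < n:
        if out[i] != hole:
            i += 1
            continue
        j = i
        while j < n and out[j] == hole:
            j += 1                       # [i, j) is one occluded run
        left = out[i - 1] if i > 0 else None
        right = out[j] if j < n else None
        for k in range(i, j):
            if left is None:
                out[k] = right
            elif right is None:
                out[k] = left
            else:
                t = (k - i + 1) / (j - i + 1)
                out[k] = left + t * (right - left)
        i = j
    return out

filled = fill_occlusion_row([10, -1, -1, 40])
```

Applying such a routine row by row gives the kind of smooth texture in the occluded region that, per the paper's finding, helps the stereo network train.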
Marios Pastos MSc 2020-2021 |
With the recent progress in the domain of Deep Reinforcement Learning (DRL), new opportunities emerge for solving complex sequential decision-making tasks. This work presents an end-to-end DRL-based approach for autonomous driving control that maps observations straight into decisions, applied to a racing car simulation environment (TORCS). A series of experiments is conducted by implementing and applying the Deep Deterministic Policy Gradient (DDPG), Soft Actor Critic (SAC) and Proximal Policy Optimisation (PPO) DRL algorithms. Two different autonomous control scenarios are investigated. In the first scenario, agents are evaluated on the task of autonomous steering control. In the second scenario, the agents are given full control of the vehicle (steering, throttle, and brakes). The results demonstrate that all three DRL agents are able to learn effective control policies in both scenarios. To improve performance, reward-shaping and hyperparameter-tuning experiments are performed. Reward shaping yields improved lap times and driving stability, while hyperparameter tuning produced mixed results in terms of cumulative reward per episode. Finally, the SAC agent exhibited generalisation abilities on ‘unseen’ race tracks.
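Reward shaping for this kind of track-driving task is commonly built from speed, heading, and lane-position terms; a hedged sketch is below. These are common TORCS-style choices, and the exact terms and weights used in the dissertation's experiments are not reproduced here.

```python
import math

def shaped_reward(speed, angle, track_pos):
    """Illustrative shaped reward for track driving: reward velocity along
    the track axis, penalise lateral velocity and distance from the centre
    line. An assumed formulation, not the dissertation's.

    speed:     car speed (m/s)
    angle:     heading error relative to the track axis (radians)
    track_pos: signed lateral offset from the centre line (normalised)
    """
    progress = speed * math.cos(angle)        # forward progress along track
    drift = speed * abs(math.sin(angle))      # motion across the track
    centring = speed * abs(track_pos)         # penalty for being off-centre
    return progress - drift - centring
```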
Yu Zheng MSc 2020-2021 |
In many cases, autonomous racing cars handle perception and SLAM using a fusion of the on-board cameras and LiDAR, but this approach has shortcomings. This project explores a different idea: the robot runs under the guidance of a drone (UAV) observing from the sky, which provides a different viewpoint to increase efficiency. The UAV offers a much wider field of view and a more flexible, easier way to handle the SLAM and path-planning algorithms. Beyond the algorithms, as a complete simulation system, this project provides all the project components, including Unreal Engine 4 environment development, Windows application development, a SQL database, OpenGL, OpenCV, Darknet, and others. The key point of this dissertation is the new idea of using a UAV as a scout, handling the SLAM and path-planning algorithms ourselves, and completing the task successfully.
Zilin Zhang MSc 2020-2021 |
For high-speed vehicles, obtaining the position and pose of target objects accurately and quickly is challenging, since the vehicle may have travelled several metres forward while environmental data is being obtained and processed. Meanwhile, if accuracy is too poor, it degrades the whole system's performance and can cause accidents. Therefore, an accurate and responsive perception system is vital for a robust and safe autonomous driving system. Scenes in autonomous driving are variable, with a wide range of objects to identify and localise, such as pedestrians, vehicles, and traffic lights. Among these, traffic cones are frequently used in road closures, construction, and redirection, situations in which vehicles are usually moving at high speed. It is therefore essential for vehicles to locate traffic cones on the road accurately and quickly. However, to the best of our knowledge, little research addresses this subtask, and existing work has the following shortcomings: i) a lack of public datasets for cone detection; for example, KITTI, a popular public dataset in autonomous driving, does not have a separate category for traffic cones. ii) Some methods use prior information, such as the shape of a specific traffic cone, which makes it difficult to apply the same method in different scenes because cones have different shapes. iii) Existing methods recognise all cones as one category and cannot further distinguish between different shapes and colours, even though different colours may carry different meanings in some scenes, such as red for stop and green for go. To fill these gaps, this dissertation proposes i) a data generator based on AirSim that generates KITTI-like data for cone detection; ii) a pipeline based on CenterPoint to detect traffic cones; and iii) a fusion method using images and pseudo-LiDAR to further distinguish different traffic cones within one frame.
Philip A. Lorimer MSc 2019-2020 |
A perceptual module was required for a Formula Student vehicle, so a solution to this problem was explored. The method aimed to utilise multiple sensing channels in order to detect and classify objects typically found on the racing field. To address this, a range of sensing channels and previously used approaches for object detection were explored, and a method was proposed that utilises two sensing channels: vision and LiDAR. The vision channel samples images from a monocular camera, and experiments were conducted comparing the YOLOv4 detection network against colour-based computer vision image-processing techniques. The LiDAR channel utilised a Velodyne VLP-16 LiDAR to detect objects within three-dimensional space; the sampled data was processed to isolate the objects within the point cloud. The experiments compared the resulting point clouds when adjusting the parameters for up-sampling via statistical outlier filtering and voxel grid dilation. The results demonstrate, to a certain degree, the module's ability to fulfil the basic requirements; however, limitations are discussed, suggesting further improvements to the methods. Further work is proposed, suggesting the exploration of more complex methods and adaptive approaches for the parameter-based methods.
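The statistical outlier filtering step mentioned above can be sketched in plain Python as follows; the neighbourhood size and standard-deviation ratio are illustrative parameters, not the values tuned in the dissertation.

```python
import math

def statistical_outlier_filter(points, k=3, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbours exceeds
    the global mean by more than std_ratio standard deviations. Parameter
    values are illustrative, not those tuned in the dissertation."""
    mean_knn = []
    for p in points:
        ds = sorted(math.dist(p, q) for q in points if q is not p)
        mean_knn.append(sum(ds[:k]) / k)
    mu = sum(mean_knn) / len(mean_knn)
    var = sum((d - mu) ** 2 for d in mean_knn) / len(mean_knn)
    thresh = mu + std_ratio * math.sqrt(var)
    return [p for p, d in zip(points, mean_knn) if d <= thresh]

# A tight cluster plus one far-away spurious return
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
         (0.1, 0.1, 0.0), (10.0, 10.0, 10.0)]
kept = statistical_outlier_filter(cloud)
```

Production pipelines would use a spatial index (e.g. a k-d tree) rather than this O(n²) neighbour search.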
Jiawei Feng MSc 2019-2020 |
Weakly supervised semantic segmentation methods rely on image-level labels to generate proxy segmentation masks, and then train the segmentation network on these masks under various constraints. However, the performance of current segmentation methods is often limited by the quality of proxy annotations generated from classification-based localization maps. In order to produce high-quality proxy annotations, this paper investigates two novel mechanisms: (1) generating high-quality proxy annotations via an unsupervised principal prototype features discovering (PPFD) strategy, and (2) designing an effective annotation selection and refinement (ASR) strategy for improving annotation quality. PPFD localizes semantic pixels by means of principal feature analysis on support images; the resulting attention maps are used to generate proxy annotations for weakly supervised segmentation. In addition, ASR recognizes and refines low-quality proxy annotations through self-judgement-based mask scoring and discriminative-region mining. The entire framework, called PPFD+ASR, significantly advances the state of the art on the test set of the PASCAL VOC 2012 segmentation benchmark, improving OOA+ by 3.6% and 3.3% (66.4% vs 62.8% and 69.7% vs 66.4%) with VGG16 and ResNet101 as the baselines respectively.
Sa Wu MSc 2019-2020 |
Closed-loop detection is an essential part of a SLAM system and plays a significant role in eliminating accumulated error. In this paper, in order to improve closed-loop detection in small scenes for the LSD-SLAM system, the most representative direct method, we introduce a closed-loop detection algorithm based on deep learning. This more robust algorithm uses unsupervised learning on a neural network architecture, and the trained model can accurately extract features of appearance change directly from the original image, without extracting feature points or training for specific environments. We exploit these advantages by integrating the algorithm into the LSD-SLAM system. We then compare the similarity between frames, and the results show that the higher the similarity, the more likely it is that a loop closure has occurred. Compared with the original system, our method shows higher accuracy on a real-time small-scene test dataset.
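A minimal sketch of similarity-based loop detection on learned frame descriptors is shown below; the cosine metric, descriptor format, and 0.9 threshold are illustrative assumptions rather than the system's actual configuration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def detect_loop(current, keyframes, threshold=0.9):
    """Return (index, similarity) of the most similar past keyframe if its
    similarity clears the threshold, else (None, best similarity)."""
    best_idx, best_sim = None, -1.0
    for idx, kf in enumerate(keyframes):
        sim = cosine_similarity(current, kf)
        if sim > best_sim:
            best_idx, best_sim = idx, sim
    if best_sim >= threshold:
        return best_idx, best_sim
    return None, best_sim

keyframes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
hit, _ = detect_loop((0.9, 0.1, 0.0), keyframes)    # resembles keyframe 0
miss, _ = detect_loop((0.0, 0.0, 1.0), keyframes)   # resembles neither
```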
This project develops a method for generating racing lines that aims to minimise both lap time and run time. By reducing run time, the use of such methods in a ‘live’ path planning environment becomes feasible. On-track performance and time to compute a path are often competing objectives, so there is a particular focus on reducing run time for a small trade-off in lap time. Existing methods in path planning and vehicle dynamics have been adapted for a specific path planning problem. In particular, a compromise between minimising path length and path curvature is used to optimise a racing line. This working model is then used to investigate two key improvements to the method. A sector-based approach was developed, in which paths through individual corner sequences are optimised and later merged. By performing the individual optimisations in parallel, a 50% reduction in run time was achieved; due to the merging of adjacent sector paths, the full racing line saw lap times increase by up to 3%. The compromise between path length and curvature included a secondary optimisation of the weight between these two metrics. Significant improvements in run time were made by instead modelling track properties to estimate a well-performing compromise weight. This reduced the cost of the optimisation to that of a basic minimum-curvature problem, but with lap times closer to those of an optimal compromise than of a minimum-curvature path.
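The length-curvature compromise at the core of the method can be sketched as a weighted cost over discrete waypoints; the discretisation below is a simple illustration, not the dissertation's exact formulation.

```python
import math

def path_cost(points, w):
    """Weighted compromise between total path length and summed squared
    turning angle (a discrete curvature proxy) over (x, y) waypoints.
    w in [0, 1] trades length against curvature, mirroring the compromise
    weight the project estimates from track properties; the discretisation
    is an illustrative assumption."""
    length = sum(math.dist(points[i], points[i + 1])
                 for i in range(len(points) - 1))
    curvature = 0.0
    for i in range(1, len(points) - 1):
        ax, ay = (points[i][0] - points[i - 1][0],
                  points[i][1] - points[i - 1][1])
        bx, by = (points[i + 1][0] - points[i][0],
                  points[i + 1][1] - points[i][1])
        angle = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        curvature += angle ** 2
    return w * length + (1 - w) * curvature

straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # zero curvature
corner = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]    # one right-angle turn
```

An optimiser would then adjust the waypoints (within track bounds) to minimise this cost for a chosen w.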
Jie He BSc 2020 |
The award-winning Team Bath Racing Electric - AI (TBRe-AI) is seeking contributions to the autonomous pipeline of their electric vehicle ahead of the FSUK 2020 summer competition. This dissertation focuses on exploring state-of-the-art methods to produce the perception pipeline, which aims to detect landmark features such as traffic cones and to compute an estimated location for each cone detected. Due to the nature of the problem, the development of the pipeline centres on the required hardware: a stereo camera and a LiDAR. This dissertation explains the capabilities of each sensor to formalise a design for the perception pipeline, followed by the implementation, testing, and evaluation of the proposed system.
Donal McLaughlin MComp 2019 |
The aims of this project are to design a testing platform for the Formula Student Artificial Intelligence event and use this platform to examine, test and evaluate the potential approaches which could be taken to perform accurate localisation and mapping in the competition. The contributions of this project are two-fold. Firstly, it presents a low-cost, customisable testing platform - based on a Husarion ROSbot and NVIDIA Jetson Nano - which can be used to test the systems required for the FS-AI event. Secondly, it presents a simultaneous localisation and mapping (SLAM) framework that is suitable for the FS-AI event. The platform was tested in an environment as similar to the FS-AI environment as possible to ensure valid results. This project's work will be used as the building blocks for the final car designed to compete in the FS-AI 2019 event. It is also hoped that it will be used by members of TBRe in the future, as both an introduction to the competition and as a guide for the software and hardware tools they may require.
Eklavya Sarkar MSc 2018-2019 |
Recent years have produced many remarkable large deep convolutional neural networks, such as AlexNet, GoogLeNet, and ResNet, which are able to detect objects in natural images. However, we still have limited intuition about how these networks work, even though notable visualization tools, such as deconvolution networks (introduced in 2013) and saliency maps (introduced in 2014), have been used to partially lift the veil on these deep neural networks (DNNs). In this paper, combining these visualization methods, we introduce a novel, simple, explainable learning framework that provides straightforward explanations of how our networks produce classification results. We apply our learning framework to the InteriorNet dataset to classify interior decoration styles, which are more abstract than the natural object categories detected in the ImageNet dataset. We also show how our framework has advantages over AlexNet in extracting specific features for this style classification problem.
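As context for the visualization tools mentioned, a saliency map scores how much each input feature affects the network's output. The sketch below approximates this with finite differences on a toy scalar model; it is an illustrative stand-in, not the paper's method, and the toy scorer is invented.

```python
def saliency(model, x, eps=1e-4):
    """Finite-difference approximation of a saliency map: the absolute
    sensitivity of a scalar score to each input feature. A toy stand-in for
    gradient-based saliency on a real network; `model` is any callable
    mapping a feature list to a score."""
    base = model(x)
    sal = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps
        sal.append(abs((model(bumped) - base) / eps))
    return sal

def toy_score(v):
    """Stand-in 'network': a fixed linear scorer over three features."""
    return 2.0 * v[0] - 0.5 * v[1]

sal = saliency(toy_score, [1.0, 1.0, 1.0])
```

On a real network the same quantity is obtained in one backward pass as the gradient of the class score with respect to the input pixels.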
Lavy Friedman MSc 2018-2019 |
Ningchao Wang MSc 2018-2019 |