Crimes carried out using digital equipment are now common. To deal with these crimes, specialised practitioners carry out digital investigations, which require a wide range of methodologies and tools. While most methodologies have been extensively developed, standardised, and validated, the tools used within them have not. Most investigations rely heavily on these tools, so the quality of the results they produce is essential. The purpose of this study was to investigate a solution for evaluating and validating digital forensics tools. To this end, a cloud-based platform has been developed that evaluates digital forensics tools in an automated way. The platform is based on two main components. The first focuses on the creation of digital forensics images against which tools can be evaluated. The prototype platform enables testers to specify a range of requirements for the creation of these images, including operating systems, user activities, and the files which will be present in the image. The image creation process is based on the work of Simson Garfinkel, who developed digital corpora which can be used to create custom digital forensics images. The second component is a system which tests digital forensics tools following a function-based methodology based on work by Guo and Slay. It evaluates the tools against the images created by the first component.

A range of performance and quality metrics drawn from the literature has been used. A new tool evaluation metric, called Digital Forensics True Positive, has been developed, which allows the similarity of extracted evidence to be assessed. Furthermore, an overall quality score has been presented which can be used to define how well a digital forensics tool performs, and so to validate and compare digital forensics tools. The ability of the prototype platform to carry out the evaluation and validation of digital forensics tools has been successfully demonstrated by testing carving tools against a range of forensic images. The results of the evaluation carried out using the platform show that significant differences exist between carving tools. Scalpel achieved 69% true positives, against 99% for Foremost, a significant difference in the number of potential evidence files carved.

The evaluation also highlighted that Foremost produced a large number of false positives, which could affect practitioners using the tool. All carving tools evaluated were successfully graded and given ratings between good and almost perfect, using the overall quality score metric developed for the platform.
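
As an illustration of how true-positive and false-positive figures of this kind might be derived for a carving tool, the following minimal sketch compares the hashes of carved files against the hashes of the files known to be in a test image. It is not the platform's implementation: the function name `score_carving` and the use of exact hash matching are assumptions made for the example, and exact matching is stricter than the similarity-based Digital Forensics True Positive metric described above.

```python
from __future__ import annotations


def score_carving(ground_truth: set[str], carved: list[str]) -> dict[str, float]:
    """Score one carving run (hypothetical helper, not part of the platform).

    ground_truth: hashes of the files known to be present in the test image.
    carved:       hashes of every file the carving tool produced.
    """
    carved_unique = set(carved)

    # A true positive is a known file that the tool recovered exactly.
    true_pos = len(ground_truth & carved_unique)
    # A false positive is any carved output that matches no known file.
    false_pos = sum(1 for h in carved if h not in ground_truth)
    # A false negative is a known file that the tool did not recover.
    false_neg = len(ground_truth - carved_unique)

    tp_rate = true_pos / len(ground_truth) if ground_truth else 0.0
    reported = true_pos + false_pos
    precision = true_pos / reported if reported else 0.0

    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "true_positive_rate": tp_rate,
        "precision": precision,
    }


if __name__ == "__main__":
    # Made-up hashes for illustration only; these are not results from the study.
    known = {"a", "b", "c", "d"}
    output = ["a", "b", "c", "x", "y"]
    print(score_carving(known, output))
```

Reporting precision alongside the true-positive rate reflects the observation that a tool can recover nearly every file while still producing many spurious outputs that a practitioner must sift through.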
