SQL Injection (SQLi) Detection using Machine Learning

Thomson, A. (2016). SQL Injection (SQLi) Detection using Machine Learning (BEng (Hons) CSF Dissertation). Edinburgh Napier University (Macfarlane, R., Arbelet, A.).



Over the past few years the internet has become extensively used, and organisations are taking advantage of this by providing many of their services online. With the growing use of such web applications comes an ever-growing threat landscape. As systems become more complex, more vulnerabilities are being found within them, and attackers have in turn created more sophisticated attacks. Traditional methods of web application defence, such as IDS/IPS and WAF, are therefore becoming less effective.
Machine Learning (ML) is an increasingly widely researched area, and it is providing results which suggest that it may be an adequate replacement for these traditional methods. However, ML is only an effective solution where the algorithm has been trained using an appropriate model containing the correct set of features, and these areas have not been explored properly in current research.
This project aims to investigate these areas, focusing primarily on feature selection, while finding an appropriate dataset on which to carry out the research. To achieve this aim, three experiments were designed and implemented. The first evaluates various algorithms used for SQL injection detection, the second evaluates the differences between automatic and manual feature selection processes, and the third evaluates whether some features have greater significance than others within feature sets.
From these experiments it was observed that decision tree algorithms, namely CART and Random Forest, are far superior to some of the other algorithms widely used in research. The results from the second experiment illustrated that expert knowledge of the domain, applied when extracting features from a dataset, provides significantly more accurate results. The final experiment highlighted that features play a crucial role in ML, since certain features have more impact on performance than others. Feature sets can therefore be broken down into attack classes for greater accuracy and reduced cost in terms of time and resources.
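As a rough illustration of what manual, domain-informed feature extraction can look like for SQL injection detection, the minimal sketch below maps a raw request parameter to a few numeric features. The keyword list, the feature names and the function `extract_features` are hypothetical choices made for this example; they are not the feature set used in the dissertation.

```python
import re

# Hypothetical hand-picked keywords; not the dissertation's actual feature set.
SQL_KEYWORDS = ("select", "union", "insert", "update", "delete", "drop", "or", "and")


def extract_features(payload: str) -> dict:
    """Map a raw request parameter to a small dictionary of numeric features."""
    lowered = payload.lower()
    # Alphabetic tokens only, so "1=1" contributes no keyword hits.
    tokens = re.findall(r"[a-z_]+", lowered)
    return {
        "length": len(payload),
        "single_quotes": payload.count("'"),
        # Count common SQL comment markers: "--", "/*" and "#".
        "comment_markers": lowered.count("--") + lowered.count("/*") + lowered.count("#"),
        "keyword_hits": sum(tokens.count(k) for k in SQL_KEYWORDS),
        # 1 if the payload contains a tautology of the form x = x, e.g. "1=1".
        "has_tautology": int(bool(re.search(r"\b(\w+)\s*=\s*\1\b", lowered))),
    }


benign = extract_features("alice")
attack = extract_features("' OR 1=1 --")
```

Feature dictionaries of this kind could then be passed to a classifier such as a decision tree. Which features are actually worth keeping is the question the second and third experiments address.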


Areas of Expertise

Electronic information now plays a vital role in almost every aspect of our daily lives, so the need for a secure and trustworthy online infrastructure is more important than ever. Without it, not only the growth of the internet but also our personal interactions and the economy itself could be at risk.

Associated Projects

    Keywords: cyber security