Researchers Apply Advanced Automated Capture and Analysis of Open Source Data to School Exam Analysis




Over the past year, researchers at Edinburgh Napier University have been building one of the most extensive infrastructures for fetching, processing, analysing and visualising open source data to be found at any university in the world. It has been integrated with research work on the early detection of illnesses and frailty.

Today, as a test, they set their agents to watch for the GCSE 2016 results, and then analysed those results for various trends.

The system integrates innovative methods that learn which sites are the most trusted sources of open data, and then automatically mine and parse the data containers hosted on those sites, such as spreadsheets, PDFs and Word documents.
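As a minimal sketch of the "data container" step, the Python below picks out links that look like spreadsheets, PDFs or Word documents from a list of URLs already collected from a site. The function name `find_data_containers`, the `CONTAINER_TYPES` mapping and the example URLs are all hypothetical; the article does not describe how the team's agents actually identify or rank containers.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping from file extension to container type (an assumption,
# not the team's actual list).
CONTAINER_TYPES = {
    ".xls": "spreadsheet",
    ".xlsx": "spreadsheet",
    ".csv": "spreadsheet",
    ".pdf": "pdf",
    ".doc": "word",
    ".docx": "word",
}


def find_data_containers(links):
    """Return (url, container_type) pairs for links that look like data files."""
    found = []
    for url in links:
        # Use only the path part, so query strings do not hide the extension.
        suffix = PurePosixPath(urlparse(url).path).suffix.lower()
        kind = CONTAINER_TYPES.get(suffix)
        if kind is not None:
            found.append((url, kind))
    return found


# Example links (placeholders on example.org, not real data sources).
links = [
    "https://example.org/results/2016.xlsx",
    "https://example.org/about.html",
    "https://example.org/report.PDF?download=1",
    "https://example.org/notes.docx",
]
containers = find_data_containers(links)
```

Each matched link could then be handed to a parser suited to its type; that parsing step is left out here.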

The extracted information is then sent back to evaluation agents, which match the gathered data to the required criteria. These agents then use Python (with pandas) to analyse the results, and produce charts using a Cloud service (Plotly).
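To give a feel for the pandas analysis step, here is a small sketch that computes a year-on-year change per subject. The column names (`subject`, `year`, `pass_rate`), the function `year_on_year_change` and the numbers are placeholders invented for illustration; they are not real exam results and not the team's actual code.

```python
import pandas as pd


def year_on_year_change(df):
    """Pivot long-format results and return the change between consecutive years."""
    table = df.pivot(index="subject", columns="year", values="pass_rate")
    table = table.sort_index(axis=1)
    # diff along the columns; the first year has no predecessor, so drop it.
    return table.diff(axis=1).iloc[:, 1:]


# Placeholder numbers for illustration only, NOT real exam results.
sample = pd.DataFrame({
    "subject": ["subject_a", "subject_a", "subject_b", "subject_b"],
    "year": [2015, 2016, 2015, 2016],
    "pass_rate": [70.0, 68.0, 60.0, 63.0],
})
change = year_on_year_change(sample)
```

A table like `change` is the kind of result that could then be passed to a charting library; the charting call itself is omitted here.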

In fact, the most time-consuming element is writing the narrative that goes with the results.

Completely automated open source gatherer

The team have now conducted trials in which they specify an outline data problem, such as one related to health and social care, and the technology creates software agents that search the Cloud for the information and return it for data agents to parse into the right format and analyse.

After this, a range of mathematical methods is applied, and the results are automatically charted. The team think it is one of the fastest capture, analysis and charting systems around, and aim to analyse a whole range of health and social care information in order to better understand health care issues.


To test their novel system, they have been working on a number of test analysis problems, all with a main focus on improvements in health and social care.

Prof Bill Buchanan outlines, "We want to build a fairer world built on evidence, and open source data provides us with the evidence that we need to improve health and social care in Scotland."

Fellow researcher Adrian Smales has been working with CM2000 on preventative health care systems, and says, "Increasingly, health and social care requires a deep analysis of data, especially to look for normality and anomalies, and this automated data gatherer allows us to understand what normality is, and thus detect the onset of illness. Our complete focus is always on improving the lives of citizens, and putting in place care plans at an early stage."
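As a rough illustration of what looking for "normality and anomalies" can mean in practice, the sketch below learns a baseline from past readings and flags new readings that sit far from it, using a simple z-score. This is a generic textbook technique chosen for illustration; the function `flag_anomalies`, the threshold of 3.0 and the sample numbers are assumptions, and the article does not say which methods the team uses.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, new_readings, threshold=3.0):
    """Return the new readings whose z-score against the baseline exceeds threshold."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs at least 2 values
    if sigma == 0:
        # A perfectly flat baseline: anything different counts as anomalous.
        return [x for x in new_readings if x != mu]
    return [x for x in new_readings if abs(x - mu) / sigma > threshold]


# Placeholder readings for illustration only, not real health data.
baseline = [10, 11, 9, 10, 10, 11, 9, 10]
flagged = flag_anomalies(baseline, [10, 12, 15])
```

With a longer history, the baseline itself is what "normality" would look like for that person or service, and the flagged readings are the candidates for early follow-up.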

Their system is one of the first to completely automate the gathering, analysis and charting of open source data. The work has previously been applied to the automated generation of test questions from agents running in the Cloud.


Associated people

William Buchanan
Director of CDCS
+44 131 455 2759
Adrian Smales
Research Fellow
+44 131 455