Identification, Recovery and Verification of MP3 frames from Obfuscated Files

Flack, P. (2017). Identification, Recovery and Verification of MP3 frames from Obfuscated Files (BEng (Hons) CSF Dissertation). Edinburgh Napier University (Leimich, P., McKeown, S.).



Much of the audio piracy industry revolves around the MP3 music format, and these files can often prove to be crucial evidence in investigations. File obfuscation is a common anti-forensic technique that allows criminals to hide potential evidence of piracy or other crimes from investigators by making it appear innocuous. At present, no tools exist to allow investigators to identify these files.

The aim of this project was to develop and implement a methodology that allows obfuscated MP3 files to be identified regardless of the file type they have been disguised as, and the audio data from these files to then be recovered as playable media files. Building on this methodology, the project automated the identification and recovery process by creating a tool that reads the internal file format to find MP3 frames before carving them to a recovery file.
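The frame identification and carving step can be illustrated with a minimal sketch. It is not the dissertation's tool: it assumes MPEG-1 Layer III frames only, ignores ID3 tags and free-format bitrates, and the function names `frame_length` and `carve_frames` and the `min_run` threshold are hypothetical choices made for this example. A frame header is accepted only when it starts a chain of back-to-back valid frames, which reduces false positives from stray sync bytes.

```python
# Bitrate (kbps) and sample-rate (Hz) tables for MPEG-1 Layer III.
# None marks the "free" and reserved index values, which this sketch rejects.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def frame_length(header: bytes):
    """Return the frame length in bytes for an MPEG-1 Layer III header,
    or None if the four bytes are not a header this sketch recognises."""
    if len(header) < 4:
        return None
    b0, b1, b2 = header[0], header[1], header[2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:   # 11-bit frame sync
        return None
    if (b1 >> 3) & 0x03 != 0x03:            # version bits: MPEG-1
        return None
    if (b1 >> 1) & 0x03 != 0x01:            # layer bits: Layer III
        return None
    bitrate = BITRATES_KBPS[b2 >> 4]
    sample_rate = SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
    if bitrate is None or sample_rate is None:
        return None
    padding = (b2 >> 1) & 0x01
    return 144 * bitrate * 1000 // sample_rate + padding


def carve_frames(data: bytes, min_run: int = 3) -> bytes:
    """Copy every run of at least min_run consecutive frames out of data,
    whatever the file extension or leading bytes claim the file to be."""
    out = bytearray()
    pos = 0
    while pos + 4 <= len(data):
        # Follow the chain of back-to-back frames starting at pos.
        end, count = pos, 0
        while True:
            length = frame_length(data[end:end + 4])
            if length is None or end + length > len(data):
                break
            end += length
            count += 1
        if count >= min_run:
            out += data[pos:end]
            pos = end
        else:
            pos += 1                        # not a real frame; keep scanning
    return bytes(out)
```

For a 128 kbps, 44.1 kHz frame without padding, the header bytes `FF FB 90 00` give a length of 417 bytes, which is the calculation an investigator would otherwise repeat by hand for every frame.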

To assist investigators, a process was also needed for verifying which original MP3 the recovered frames had belonged to, in order to show that the file had been copied from elsewhere and that it was a copyrighted file. To do this, Context Triggered Piecewise Hashing was implemented to compare the recovered files against a database of hashes of known MP3 files and report any matches. The automated recovery process was then rigorously tested against individual files of varying size and length, along with 5 USB images containing MP3 files obfuscated in different ways.
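The idea behind Context Triggered Piecewise Hashing can be sketched in a few lines. This is a deliberately simplified illustration, not the algorithm used by any particular tool and not the dissertation's implementation: the rolling value is a plain sum over a 7-byte window, each piece is reduced to one character with an FNV-1a hash, and signatures are compared with `difflib` rather than a weighted edit distance. The names `ctph_signature`, `similarity` and `best_match`, and the `block_size` default, are assumptions made for this example.

```python
import difflib
import string

ALPHABET = string.ascii_letters + string.digits + "+/"   # 64 symbols
WINDOW = 7                                                # rolling window size
FNV_OFFSET, FNV_PRIME = 0x811C9DC5, 0x01000193


def ctph_signature(data: bytes, block_size: int = 64) -> str:
    """Split data at content-defined trigger points and emit one
    character per piece, so a local change alters only nearby characters."""
    signature = []
    piece_hash = FNV_OFFSET
    for i, byte in enumerate(data):
        # Hash of the current piece (FNV-1a, 32-bit).
        piece_hash = ((piece_hash ^ byte) * FNV_PRIME) & 0xFFFFFFFF
        # Rolling value depends only on the last WINDOW bytes (the "context").
        window_sum = sum(data[max(0, i - WINDOW + 1):i + 1])
        if window_sum % block_size == block_size - 1:     # trigger point
            signature.append(ALPHABET[piece_hash % 64])
            piece_hash = FNV_OFFSET                       # start a new piece
    signature.append(ALPHABET[piece_hash % 64])           # final piece
    return "".join(signature)


def similarity(sig_a: str, sig_b: str) -> int:
    """Score two signatures from 0 (unrelated) to 100 (identical)."""
    matcher = difflib.SequenceMatcher(None, sig_a, sig_b, autojunk=False)
    return round(100 * matcher.ratio())


def best_match(sig: str, known: dict) -> tuple:
    """Return (name, score) of the closest signature in a database
    that maps known file names to their signatures."""
    name = max(known, key=lambda n: similarity(sig, known[n]))
    return name, similarity(sig, known[name])
```

Because piece boundaries are chosen by the content itself, a recovered file that is missing its first frames or carries extra leading bytes can still share most of its signature with the original, which a conventional cryptographic hash would not allow.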

In conclusion, both the initial manual methodology and the automated process were able to identify MP3 frames despite obfuscation and recover these frames as a playable media file. However, the time needed to positively identify each frame header and calculate the length of each frame manually would be impossible to sustain in a real investigation, at which point the speed of the automated process becomes a great boon to investigators. Context Triggered Piecewise Hashing was also capable of matching the recovered files against the database of known files with high consistency, although it could ultimately name the matching MP3 but not prove a breach of copyright.


Paul Flack
Part-Time Demonstrator
+44 131 455

Areas of Expertise

Electronic information now plays a vital role in almost every aspect of our daily lives, so the need for a secure and trustworthy online infrastructure is more important than ever. Without it, not only the growth of the internet but also our personal interactions and the economy itself could be at risk.

Associated Projects

    Keywords: cyber crime