Automated Document Tagging | Deeper Insights | Muscular Dystrophy UK

Auto-tagging and classification of research papers and articles, so that patients, researchers and staff can quickly find relevant information.

Suzannah, Head of Digital at Muscular Dystrophy UK, and her team needed help organising their website content in preparation for a makeover. The project covered 7,961 pages from their website, which we calculated would have taken an employee 165 hours (about 20 working days) to label by hand, not to mention how long it took to find a single relevant document.

The goal was to develop a system that automatically tags their website content, which is organised into different sections (e.g. news and blog posts), and classifies it into disease categories, or condition types. This classification would support their content audit and make it easier for their audience to find content that is up to date and relevant to their conditions.

The web pages from the MDUK website were all tagged with high accuracy. Deeper Insights assigned two tags to each web page: a general condition tag (e.g., Spinal Muscular Atrophy) and a specific condition tag (e.g., Duchenne Muscular Dystrophy). The system performed the tagging automatically, combining ontology creation with fuzzy string matching. An evaluation of the results showed that the tags assigned by our system were very accurate (97%+), saving many hours of manual tagging, searching and filtering of documents.
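To illustrate how an ontology can be combined with fuzzy string matching, here is a minimal sketch in Python. It is not the system built for MDUK: the tiny ontology, the alias lists, the function names and the 0.85 threshold are all assumptions made for the example, and the standard library's `difflib.SequenceMatcher` stands in for whatever matching method was actually used.

```python
import re
from difflib import SequenceMatcher

# Hypothetical mini-ontology (an assumption for this sketch):
# general condition -> specific condition -> lowercase aliases.
ONTOLOGY = {
    "Muscular Dystrophy": {
        "Duchenne Muscular Dystrophy": ["duchenne muscular dystrophy", "duchenne"],
        "Becker Muscular Dystrophy": ["becker muscular dystrophy", "becker"],
    },
    "Spinal Muscular Atrophy": {
        "Spinal Muscular Atrophy Type 1": ["spinal muscular atrophy type 1", "sma type 1"],
    },
}


def best_fuzzy_score(alias, tokens):
    """Slide a window the size of the alias over the page tokens and
    return the highest similarity ratio (0.0 to 1.0) found."""
    n = len(alias.split())
    best = 0.0
    for i in range(max(len(tokens) - n + 1, 1)):
        window = " ".join(tokens[i:i + n])
        best = max(best, SequenceMatcher(None, alias, window).ratio())
    return best


def tag_page(text, threshold=0.85):
    """Return (general_tag, specific_tag) for the best-matching condition,
    or None when no alias reaches the (assumed) similarity threshold."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    best = (0.0, None, None)
    for general, specifics in ONTOLOGY.items():
        for specific, aliases in specifics.items():
            score = max(best_fuzzy_score(alias, tokens) for alias in aliases)
            if score > best[0]:
                best = (score, general, specific)
    score, general, specific = best
    if score < threshold:
        return None
    return general, specific
```

With this sketch, `tag_page("New research into Duchene muscular dystrophy treatments")` returns `("Muscular Dystrophy", "Duchenne Muscular Dystrophy")` despite the misspelling of "Duchenne", which is the kind of tolerance fuzzy matching provides. A page that mentions no known condition returns `None`, so it can be set aside for manual review.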

Muscular Dystrophy UK were very happy working with Deeper Insights on organising and classifying the content on the MDUK site, which lays the foundation for their website redesign later in the year. They know that when people are diagnosed with a muscle-wasting condition, the website is the first place they turn to for support, and so it's very important to provide all the content they need to manage their condition.

Suzannah, Head of Digital, Muscular Dystrophy UK