
Technology Transfer from the University of Oxford

AUTOMATED ANALYSIS OF TEXT - Isis Project No 2804

A sophisticated linguistic tool enables accurate and efficient analysis of text by understanding the grammatical structure of sentences.

MARKETING OPPORTUNITY

Growth in personal computing, affordable storage capacity and the Internet boom have fuelled an information explosion. With large and growing collections of digital media at our fingertips, the question arises of how best to explore, learn from, and find information in these vast data repositories. Keyword querying is the most widely used method for searching the web and other texts for information. However, the quality of keyword search is limited by the simplicity of the query itself, pervasive ambiguity in language, and the computer’s lack of knowledge of language and the world.

Development of more sophisticated linguistic tools is at the forefront of research in artificial intelligence and language processing. The Oxford technology represents a significant advance towards intelligent computational analysis of text, which could revolutionise search and lead to more natural and efficient methods for human-computer interaction.

THE OXFORD INVENTION

Parsing is the process of inferring the grammatical relations between words in a sentence. The Oxford researchers have developed a novel parser of naturally occurring text, which uses a sophisticated linguistic theory and exploits recent advances in Machine Learning.
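
To make "grammatical relations between words" concrete, the sketch below writes out one hand-made analysis as a list of head-relation-dependent triples. The example sentence, the `Dependency` type and the relation labels are assumptions chosen for illustration; they are not the Oxford parser's output format or label set.

```python
from typing import NamedTuple


class Dependency(NamedTuple):
    head: str       # the governing word
    relation: str   # illustrative label, not the Oxford parser's label set
    dependent: str  # the word that depends on the head


# Hand-written analysis of the question "What did the company buy?"
sentence = ["What", "did", "the", "company", "buy"]
dependencies = [
    Dependency("buy", "subject", "company"),
    Dependency("buy", "object", "What"),  # the object sits far from its verb
    Dependency("company", "determiner", "the"),
    Dependency("buy", "auxiliary", "did"),
]


def distance(dep: Dependency, words: list[str]) -> int:
    """Number of word positions separating a head from its dependent."""
    return abs(words.index(dep.head) - words.index(dep.dependent))


# The relation spanning the most words is the kind of long-range
# dependency that a keyword-based view of the sentence cannot see.
longest = max(dependencies, key=lambda d: distance(d, sentence))
print(longest, distance(longest, sentence))
```

In this toy analysis the widest relation links "buy" to "What", even though the two words are at opposite ends of the sentence.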

The Oxford parser is:

  • Robust – does not break down when faced with novel sentence structures;
  • More linguistically advanced than existing robust parsers – uses a sophisticated grammar of English allowing recovery of long-range dependencies between words;
  • Efficient – 10x faster than comparable linguistically motivated parsers;
  • Accurate – recovers dependencies between words at over 85% accuracy on newspaper text.
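
The brief does not say exactly how the accuracy figure is measured. One common way to score a parser, shown here purely as an assumed illustration, is to compare the dependencies it proposes against a hand-annotated gold standard and report precision, recall and their harmonic mean (F-score). The function name and the toy data are hypothetical.

```python
def dependency_scores(gold, predicted):
    """Return (precision, recall, f_score) for two collections of
    (head, relation, dependent) triples."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy gold-standard analysis of "What did the company buy?"
gold = [
    ("buy", "subject", "company"),
    ("buy", "object", "What"),
    ("company", "determiner", "the"),
    ("buy", "auxiliary", "did"),
]

# A hypothetical parser output that misses the long-range object relation
# and attaches "What" as a subject instead.
predicted = [
    ("buy", "subject", "company"),
    ("buy", "subject", "What"),
    ("company", "determiner", "the"),
    ("buy", "auxiliary", "did"),
]

precision, recall, f_score = dependency_scores(gold, predicted)
print(precision, recall, f_score)
```

Here three of the four proposed dependencies match the gold standard, so precision, recall and F-score all come out at 0.75.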

The parser has been used to analyse 1 billion words of text in less than 5 days using only 18 processors. The Oxford invention opens the door for sophisticated text processing on an unprecedented scale, for use in information retrieval and extraction, text mining, and any application that requires automated understanding of large volumes of text.
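
As a rough back-of-the-envelope reading of these figures, the sketch below converts them into words per second. It assumes the full 5 days and an even split of work across the 18 processors, neither of which the brief states; because the run took less than 5 days, the results are lower bounds.

```python
WORDS = 1_000_000_000   # 1 billion words, as quoted
DAYS = 5                # upper bound on the run time, as quoted
PROCESSORS = 18         # as quoted

seconds = DAYS * 24 * 60 * 60           # 432,000 seconds in 5 days
aggregate = WORDS / seconds             # words per second across all processors
per_processor = aggregate / PROCESSORS  # assumes an even split of the work

print(f"aggregate:     at least {aggregate:,.0f} words/second")
print(f"per processor: at least {per_processor:,.0f} words/second")
```

Under these assumptions the quoted figures correspond to at least roughly 2,300 words per second overall, or about 129 words per second on each processor.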

COMMERCIAL OPPORTUNITY

Isis would like to talk to companies interested in developing the commercial opportunity that this technology represents. Please contact the Isis Project Manager to discuss this further.

KEYWORDS

natural language processing, parsing, next-generation search engine, information retrieval, information extraction, semantic web

Request Further Information: Project Number 2804 Automated Analysis of Text.