Apr 27, 2013

Relying on Keyword-based searches for patent retrieval - Think again

Keyword-based searches are relied upon heavily for patent identification and retrieval.

While usually considered the easiest and most accurate way to find patents, these searches have some major flaws.

A word can appear in multiple variants, some of them illogical. It can be left untranslated, machine translated incorrectly, or misread during the digitization process. These flaws are the reason to use additional strategies alongside keyword-based searches instead of relying on keywords alone.

Here's an example demonstrating one such limitation of keyword-based search:

EA980444 - Method for adaptive Kalman filtering in dynamic systems

It is a patent related to Kalman filtering, of course. Navigate to the patent on Espacenet, and you will be shocked. Its abstract refers to 'Batman' filtering. Is this because the patent covers Batman filtering? Of course not. It's because of an incorrect translation.

It's commonly assumed that such problems occur only when the text is machine translated, but that's not entirely correct.
Take a look at the Espacenet page of US5394529 and notice that 'algorithm' is spelled 'aglorithm'. What's even more shocking is that the same spelling error appears in the original patent PDF.
Along similar lines, the fairly recent US publication US2012130777 has 'vehicle' spelled as 'vehical'.
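
To see how a search could tolerate such misspellings, here is a minimal Python sketch of approximate keyword matching using the standard library's difflib. It is only an illustration for text you already have in hand, not how Espacenet or any patent database searches. The function name fuzzy_hits and the 0.8 threshold are assumptions chosen for this example.

```python
import re
from difflib import SequenceMatcher


def fuzzy_hits(keyword, text, threshold=0.8):
    """Return the words in `text` that closely resemble `keyword`.

    `fuzzy_hits` and the default threshold of 0.8 are hypothetical
    choices for this sketch, not part of any patent search tool.
    """
    keyword = keyword.lower()
    hits = []
    # Split the text into lowercase alphabetic words.
    for word in re.findall(r"[a-z]+", text.lower()):
        # ratio() is a similarity score between 0.0 and 1.0.
        if SequenceMatcher(None, keyword, word).ratio() >= threshold:
            hits.append(word)
    return hits


print(fuzzy_hits("algorithm", "a scheduling aglorithm"))
print(fuzzy_hits("vehicle", "A vehical tracking system"))
print(fuzzy_hits("kalman", "adaptive batman filtering"))
```

With this threshold the first two calls pick up 'aglorithm' and 'vehical', while the third finds nothing: 'batman' is too far from 'kalman' to count as a near match. Approximate matching helps with typos, but a wrong word that slipped in through translation still needs other strategies.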

Many more examples could be added to this list, but they would only repeat the point.

The point is: next time you search, keep these patents in mind and beware of the 'dependency on the not-so-perfect keyword search'.