Novel mixed-encoding for forecasting patent grant duration.

Abstract

As data comes to be regarded as an essential commodity, organizations are racing to use it for better decision making. The quality of a patent portfolio is an important indicator of an organization's technological innovation, and analysing it can reveal several indicators linked to a company's growth. Advances in machine learning, together with access to large amounts of patent data, have shifted patent analysis from traditional methodologies towards novel approaches. Much of this research analyses patent citations, patent text, IPC classes and similar data. Far less has been explored about forecasting patent grant duration and its significance for decision making, and even less attention has been paid to data collected from developing countries. This work builds upon our existing study on patent grant duration prediction by devising a novel methodology that encodes the data using a combination of augmented one-hot encoding and label encoding. Methods such as outlier detection are then applied to the encoded data, yielding improved results over our baseline. In addition, using raw data from the Indian Patent Office, we identify some of the important factors that influence the grant duration of patent applications.
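
As a rough illustration of what mixing one-hot and label encoding can look like, here is a minimal sketch. It is not the paper's method: the "augmented" one-hot variant is not described in this abstract, so plain one-hot encoding stands in for it, and the cardinality threshold, the function name `mixed_encode`, and the column names are all hypothetical.

```python
def mixed_encode(rows, max_onehot_cardinality=3):
    """One-hot encode low-cardinality columns; label encode the rest.

    Assumes `rows` is a non-empty list of dicts sharing the same keys.
    The threshold rule is an illustrative assumption, not the paper's rule.
    """
    columns = list(rows[0].keys())
    # Sorted distinct values give each column a deterministic category order.
    categories = {c: sorted({r[c] for r in rows}) for c in columns}

    encoded = []
    for r in rows:
        out = {}
        for c in columns:
            cats = categories[c]
            if len(cats) <= max_onehot_cardinality:
                # One-hot: one 0/1 indicator per category.
                for v in cats:
                    out[f"{c}={v}"] = 1 if r[c] == v else 0
            else:
                # Label encoding: index of the value in the sorted categories.
                out[c] = cats.index(r[c])
        encoded.append(out)
    return encoded


# Hypothetical patent-application records (not taken from the paper's data).
rows = [
    {"applicant_type": "company", "field": "chemistry"},
    {"applicant_type": "individual", "field": "electronics"},
    {"applicant_type": "company", "field": "mechanical"},
    {"applicant_type": "company", "field": "biotech"},
]
encoded = mixed_encode(rows, max_onehot_cardinality=2)
```

Here `applicant_type` has two distinct values and is expanded into indicator columns, while `field` has four and is mapped to a single integer column. Label encoding keeps the feature count small for high-cardinality columns, at the cost of imposing an arbitrary order on the categories.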

Publication
In World Patent Information, Elsevier
Prakhar Rathi
Data Scientist

I am a data scientist with a passion for creating innovative solutions to complex problems.