Study confirms strong performance of new patent analysis model

October 22, 2024

Overview of the different models. Credit: arXiv (2024). DOI: 10.48550/arxiv.2402.19411

A study published by the National Bureau of Economic Research (NBER) has confirmed the strong performance of PaECTER, a patent analysis model developed by a team of researchers at the Max Planck Institute for Innovation and Competition. The model came out on top in a comparison with other models in tasks critical to patent examination and innovation research.

Developed by Mainak Ghosh, Sebastian Erhardt, Michael E. Rose, Erik Buunk, and Dietmar Harhoff, PaECTER (Patent-Level Representation Learning Using Citation-Informed Transformers) uses advanced transformer-based machine learning techniques fine-tuned with patent citation data.

The model is specifically designed to address the complex challenges of patent text analysis and provides significant improvements in the identification and categorization of similar patents, making it highly valuable for both patent examiners and innovation researchers.

The new NBER working paper "Patent Text and Long-Run Innovation Dynamics: The Critical Role of Model Selection" rigorously compares PaECTER with other natural language processing (NLP) models.

The authors, Ina Ganguli (University of Massachusetts Amherst), Jeffrey Lin (Federal Reserve Bank of Philadelphia), Vitaly Meursault (Federal Reserve Bank of Philadelphia), and Nicholas Reynolds (University of Essex), assessed the models' performance on patent interference tasks, which involve cases where multiple inventors claim similar inventions.

The study concluded that PaECTER significantly reduces false positives and improves efficiency compared to traditional models like TF-IDF (Term Frequency-Inverse Document Frequency). The study also highlighted PaECTER's capabilities when compared with other modern models such as GTE (General Text Embeddings) and S-BERT (Sentence-BERT), which represent texts as numerical vectors that capture semantic information about words or entire sentences.
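To make the comparison concrete, the following is a minimal sketch of how a TF-IDF baseline scores the similarity of two texts. It is not the implementation used in the NBER study: the three example abstracts are invented, the whitespace tokenizer is a simplification, and the smoothed inverse document frequency formula is an assumption chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace (a deliberate simplification).
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency
            # (the smoothing is an assumption of this sketch).
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical abstracts, invented for this example.
abstracts = [
    "a rechargeable battery with a lithium anode",
    "a lithium battery cell with improved anode coating",
    "a method for brewing coffee with a paper filter",
]
vecs = tfidf_vectors(abstracts)
sim_related = cosine(vecs[0], vecs[1])    # two battery abstracts
sim_unrelated = cosine(vecs[0], vecs[2])  # battery vs. coffee abstract
print(sim_related > sim_unrelated)
```

Because TF-IDF only rewards shared words, the unrelated pair still receives a nonzero score from common words such as "a" and "with", and two texts that describe the same invention in different vocabulary would score low. Embedding models such as PaECTER, GTE, and S-BERT replace these word-count vectors with dense vectors intended to capture meaning.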

While PaECTER performed exceptionally well in expert-driven tasks like interference identification, it also held its own in broader patent classification tasks, further reinforcing its versatility.

"We are pleased that PaECTER's performance has been validated by the NBER study, which shows its strengths in patent similarity analysis and confirms its role as a reliable tool for those working in the field of innovation and intellectual property," says Mainak Ghosh, one of PaECTER's developers. "This independent validation further strengthens its relevance in the field of patent examination."

The PaECTER model is available for use on the Hugging Face platform, making it accessible to researchers, policymakers, and patent professionals worldwide. Its robust performance, as demonstrated by the NBER study, underscores its value in improving the way patent data is processed, contributing to more accurate and efficient analysis of patent innovations over time. As of today, PaECTER has been downloaded more than 1.4 million times.

More information: Ina Ganguli et al, Patent Text and Long-Run Innovation Dynamics: The Critical Role of Model Selection (2024). DOI: 10.3386/w32934

Mainak Ghosh et al, PaECTER: Patent-level Representation Learning using Citation-informed Transformers, arXiv (2024). DOI: 10.48550/arxiv.2402.19411

Journal information: arXiv

Provided by Max-Planck-Institut für Innovation und Wettbewerb

Citation: Study confirms strong performance of new patent analysis model (2024, October 22) retrieved 22 October 2024 from https://techxplore.com/news/2024-10-strong-patent-analysis.html
