
Document Classification & Automated Legal Patent Rejection

Business Challenge

A company that provides a contract analysis service approached SFL Scientific about automating the process of analyzing patent infringement claims and classifying each one as “accepted” or “rejected”.

As the number of patent applications continues to rise over the years, so does the number of patent claims, which dive deeper into technical details to further describe novel or adjacent processes. The patent holder gains a significant competitive advantage with a successful patent registration. Law firms representing businesses receive several million claims from patent holders asserting that their patents have been infringed. Reading and processing patent infringement claims takes tens of thousands of potential billable hours, at significant cost for both enforcement and evaluation. It is in the best interest of firms to pursue the claims that are likely to result in a profit and set aside those that will not. An algorithm that accurately predicts the outcome of legal patent infringement claims would therefore save law firms millions of dollars in opportunity cost.

SFL Scientific Solution

The claims were delivered to SFL as scanned hard copies, so they first had to be converted to digital format. Optical character recognition (OCR) technology converted the scans to digital text that could be passed into machine learning models and analyzed. Inconsistencies and erroneous conversions were corrected manually before the text was analyzed.

Once the text was in digital format, methods such as the bag-of-words model, term frequency-inverse document frequency (TF-IDF), and n-grams were used to create textual features for the documents. Each document (a patent infringement claim) has a known label of either “accepted” (passed) or “rejected” (failed), so the textual features were passed into a machine learning algorithm trained on the labeled documents. Certain types of intellectual property patents experience a disproportionate amount of litigation, and these algorithms can identify them.
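To make the feature step concrete, the following is a minimal sketch of bag-of-words counts, n-grams, and TF-IDF weighting in plain Python. It is not SFL Scientific's implementation: the function names `ngrams` and `tfidf_features`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the three sample sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return all word n-grams of size 1..n_max."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_features(documents, n_max=2):
    """Return one {ngram: TF-IDF weight} dict per document (hypothetical helper)."""
    # Bag-of-words step: count how often each n-gram occurs in each document.
    counts = [Counter(ngrams(doc, n_max)) for doc in documents]

    # Document frequency: in how many documents does each n-gram appear?
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())

    n_docs = len(documents)
    features = []
    for c in counts:
        total = sum(c.values())
        features.append({
            # Term frequency times inverse document frequency (no smoothing).
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in c.items()
        })
    return features


# Made-up sentences standing in for OCR'd claim text.
claims = [
    "device infringes the claimed wireless charging method",
    "device does not use the claimed method",
    "product copies the wireless charging coil layout",
]
features = tfidf_features(claims)
```

In this sketch, an n-gram that occurs in every document (such as "the") gets a weight of zero, while one that occurs in a single document is weighted most heavily. The resulting per-document dictionaries are the kind of textual features that a supervised classifier could then be trained on together with the accepted/rejected labels.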


The resulting model provides a highly accurate prediction of the probability of success for any particular patent infringement claim. The claims were also ranked by likelihood of acceptance so that SFL’s client could prioritize the claims most likely to succeed. The algorithm also outperformed humans in predicting the success of a claim.
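The ranking step described above can be sketched as follows. This is an illustrative example only: the helper `rank_claims`, the 0.5 decision threshold, the claim identifiers, and the probabilities are all hypothetical, and a real model's scores would replace the hard-coded values.

```python
def rank_claims(scored_claims, threshold=0.5):
    """Sort (claim_id, probability) pairs from most to least likely to be accepted.

    Each returned tuple also carries the label implied by the threshold,
    which is an assumed cutoff rather than a value from the case study.
    """
    ranked = sorted(scored_claims, key=lambda pair: pair[1], reverse=True)
    return [
        (claim_id, prob, "accepted" if prob >= threshold else "rejected")
        for claim_id, prob in ranked
    ]


# Hypothetical model outputs: probability that each claim is accepted.
scores = [("claim-A", 0.12), ("claim-B", 0.91), ("claim-C", 0.55)]
ranked = rank_claims(scores)

for claim_id, prob, label in ranked:
    print(f"{claim_id}: {prob:.2f} ({label})")
```

Working from the top of such a list lets reviewers spend their time on the claims with the highest predicted likelihood of acceptance first.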

The business value for the firm is substantial. The need for lawyers to read and process the intellectual property claims is reduced, if not entirely eliminated. The firm gains tens of thousands of potential billable hours for its lawyers.


Contract Analysis Firm


Legal, Tech


Automate patent claim processing with NLP


Used NLP techniques in conjunction with supervised learning algorithms to extract key terms from patent claims and classify the claims as accepted or rejected

Tools & Technologies:

Python, NLP, LSTMs, AWS

SFL Scientific is an AWS consulting partner for data science, big data, and artificial intelligence development.

About Us

SFL Scientific is a data science consulting firm offering custom development and solutions, helping companies enable, operate, and innovate using machine learning and predictive analytics. We accelerate the adoption of AI and deep learning and apply domain knowledge and industry expertise to solve complex R&D and novel business problems with data-driven systems.