Patent Classification and Analysis

Author/Creator

Author/Creator ORCID

Date

2019-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Patents play an important role in helping inventors and organizations protect their intellectual property. With the rapid increase in the number of patents granted over the last 25 years, it has become important to create tools and methodologies that facilitate a better understanding of this large corpus. This thesis aims to classify patents by assignee, the assignee being the company that owns the patent, using a text classification approach. Six companies/organizations are chosen as assignees: Amazon Technologies, Apple Inc., Google LLC, International Business Machines Corporation (IBM), Intel Corporation, and Microsoft Corporation. Two machine learning models are trained for classification: a Naive Bayes model and a Neural Network model. Two experiments are performed: the first uses only the abstract of each patent, and the second uses both the abstract and the claims. Python scripts are used to download the patent documents, extract the data items of interest, pre-process the dataset, and train and test the machine learning models. The results are analysed and the performances of the classifiers are compared. The best-performing model was the Neural Network implementation using Keras, with an accuracy of 79.02%.
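
To make the text classification approach concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. It is not the thesis code: the class name `NaiveBayesTextClassifier`, the `tokenize` helper, the placeholder labels `assignee_a` and `assignee_b`, and the four toy abstracts are all invented for illustration, and the thesis may well have used a library implementation and different pre-processing.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration only; not the implementation used in the thesis.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy "abstracts" with placeholder assignee labels (not real patent data).
train_texts = [
    "A method for routing packages in a fulfillment warehouse using robots",
    "System for recommending items to shoppers in an online store",
    "A processor with a cache memory and instruction pipeline",
    "Semiconductor transistor fabrication with reduced gate leakage",
]
train_labels = ["assignee_a", "assignee_a", "assignee_b", "assignee_b"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(model.predict("A cache memory controller for a multicore processor"))
```

In this sketch each assignee is scored by its log prior plus the smoothed log likelihood of every known token, and the highest-scoring assignee is returned. Extending the input from the abstract alone to the abstract plus claims, as in the second experiment, would only change the text passed to `fit` and `predict`.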