Using KMX to Classify the "Cloud Computing" Patent Landscape


What is “cloud computing”?

What is “cloud computing”?  Is it a technology, a business model, or industry hype?  In the popular press, one can easily find many different definitions and opinions. 

We can get a useful answer from Wikipedia.  It divides cloud computing into three categories:

- Infrastructure as a Service (IaaS).  This is renting the use of computers and storage on a pay-as-you-go basis.
- Platform as a Service (PaaS).  This is IaaS but with system software included.
- Software as a Service (SaaS).  This is renting application software on a pay-as-you-go or subscription basis. 

These categories are helpful, but when we look at all the businesses claiming to provide “cloud computing” solutions, they fall short.  Can we look at the intellectual property landscape for cloud computing and come up with a more useful answer?

Defining “cloud computing” by looking at 3,961 Patents and Applications

I have used Treparel’s KMX patent analytics tool to search the IFI CLAIMS Patent Services Claims Direct patent database to try and answer that question.  Searching for the term “cloud computing” in all US applications and grants yields 3,961 documents. Reading all of these documents is impractical.  Can we identify the key concepts without spending many hours reading?

Classifying “Cloud Computing” Patents Using IPC Codes

Patent analysts commonly subdivide large collections using the US or International (IPC) patent classification coding schemes.  These codes are assigned to patents and applications by the patent offices and are useful for precision searching.  Looking only at the first IPC code that appears in each document, the collection of 3,961 documents contains 78 distinct four-character subclass codes.  Here are the top 10 IPC subclasses, listed in code order. 
 
IPC Subclass | Subclass Definition | # Documents
A63F | Sports; Games; Amusements – Card, Board or Roulette Games | 34
G01C | Measuring; Testing – Measuring Distances | 54
G06F | Computing; Calculating; Counting – Electric Digital Data Processing | 2542
G06K | Computing; Calculating; Counting – Recognition & Presentation of Data | 74
G06Q | Computing; Calculating; Counting – Data Processing Systems or Methods, Specially Adapted for Administrative, Commercial, Financial, Managerial, Supervisory or Forecasting Purposes | 467
G06T | Computing; Calculating; Counting – Image Data Processing | 30
H04L | Electric Communication Technique – Transmission of Digital Information | 203
H04M | Electric Communication Technique – Telephonic Communication | 30
H04N | Electric Communication Technique – Pictorial Communication | 84
H04W | Electric Communication Technique – Wireless Communication Networks | 74
 
The graph below shows that the “G06F - Digital Data Processing” subclass dominates, with the software-application-focused “G06Q - Data Processing Systems” subclass a distant second. 

IPC Subclass for "cloud computing" patents and applications.


I can drill deeper into the IPC codes, but the data quickly fragments and becomes very difficult to summarize.  It is interesting that patents around gaming (A63F) and navigation (G01C) appear in the top 10 list – these are not what I would normally associate with “cloud computing”. 
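The first-IPC-code tally used in this section can be sketched in a few lines of Python.  This is only an illustration of the counting step, not how KMX or the patent database does it: the record layout (a dictionary with an `ipc` list) and the sample documents are hypothetical, and real IPC lists would come from a database export.

```python
from collections import Counter

def first_ipc_subclass(ipc_codes):
    """Return the four-character subclass of the first IPC code, or None."""
    if not ipc_codes:
        return None
    # A code such as "G06F 9/455" starts with section (G), class (06), subclass (F).
    return ipc_codes[0].replace(" ", "")[:4].upper()

def subclass_counts(documents):
    """Count documents per first-listed IPC subclass."""
    counts = Counter()
    for doc in documents:
        subclass = first_ipc_subclass(doc.get("ipc", []))
        if subclass:
            counts[subclass] += 1
    return counts

# Hypothetical records (assumed layout, not real export data).
sample = [
    {"id": "doc-1", "ipc": ["G06F 9/455", "H04L 29/08"]},
    {"id": "doc-2", "ipc": ["G06Q 30/02"]},
    {"id": "doc-3", "ipc": ["G06F 17/30"]},
    {"id": "doc-4", "ipc": []},
]

top = subclass_counts(sample).most_common(10)
print(top)
```

Drilling deeper would mean keeping more than four characters of each code, which is exactly where the counts start to fragment.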

While this is helpful, IPC codes do not always match the technology or business terminology that business or legal analysts commonly use.  It would be great if I could classify a large collection of patents using a custom classification scheme designed specifically for a particular engagement or client. 

Using the KMX Classifier

With KMX, you can do just that!  KMX can take the collection of 3,961 documents, cluster them based on similar content, present a landscape visualization, and allow me to create my own custom classification system. 

Here is what one section of the “cloud computing” patent landscape looks like, as rendered by KMX. 

KMX Landscape Visualization of "cloud computing" patents and applications.


Areas of the landscape are color coded as shown below.  The descriptions are taken from the labels generated automatically when I color code an area of the map. 

KMX landscape color coding.


With KMX, I can easily search the landscape and examine individual documents.  I can quickly identify the following types of documents in the landscape:

Virtualization (Green) – Patents around using software to create virtual servers from a group of real, physical servers.  For example, VMware has patents in this area.  While virtualization is not “cloud computing”, it is an important enabling technology for building multi-tenant, scalable cloud infrastructure.  These would fall into the Infrastructure as a Service (“IaaS”) category discussed above.

Cloud Services (Red) – Patents around cloud based infrastructure, but not specifically about virtualization.  These would also fall into the general “IaaS” category. 

Application Platforms (Blue) – Patents around supporting applications in a secure, multi-tenant environment.  These could be classified as “Platform as a Service” (PaaS). 

Operations (Yellow) – Patents around operating a cloud data center.  There are a lot of these and they don’t fall neatly into any of the “IaaS”, “PaaS” or “SaaS” categories. 

Applications & Content Storage (Green) – Patents around applications running in a cloud data center.  While storage is a topic that appears in other places in the landscape (for example, the lower right), storing and managing content is an important part of delivering a cloud-based application.  These are the “Software as a Service” (SaaS) documents. 

By looking at some individual documents, I can refine the categories.  Specifically, I create seven categories that I will use to classify the 3,961 documents.  The seven categories are:  “Authentication”, “Business Apps”, “Cloud Storage”, “Location”, “Metering”, “Operations” and “Virtualization”.  “Location” and “Metering” were chosen because there seem to be a lot of documents around those topics. 

The classification process is simple once I have defined my classification scheme.  Using the landscape map, the KMX search function, and a review of individual documents, I select a training set of documents.  I identify about six documents in each area and label them using the interactive tool shown below.  Alternatively, I can import a list of document IDs and labels in CSV format. 

KMX training sets can be created through the user interface.



The table below shows two training document examples from each category.  It is not difficult to find examples once you have reviewed the landscape map. 
 
User Defined Classification | Sample document used for training | Document Title | Assignee
Authentication | US-20110302412-A1 | PSEUDONYMOUS PUBLIC KEYS BASED AUTHENTICATION | NORTHWESTERN UNIVERSITY
Authentication | US-20110087888-A1 | AUTHENTICATION USING A WEAK HASH OF USER CREDENTIALS | Google
Business app | US-20100223113-A1 | SYSTEMS FOR EMBEDDING ADVERTISEMENTS OFFERING AVAILABLE, DYNAMIC-CONTENT-RELEVANT DOMAIN NAMES IN ONLINE VIDEO | Go Daddy
Business app | US-20110040805-A1 | TECHNIQUES FOR PARALLEL BUSINESS INTELLIGENCE EVALUATION AND MANAGEMENT | CREDIT SUISSE AG
Cloud Storage | US-20110055474-A1 | DISPERSED STORAGE PROCESSING UNIT AND METHODS WITH GEOGRAPHICAL DIVERSITY FOR USE IN A DISPERSED STORAGE SYSTEM | CLEVERSAFE INC
Cloud Storage | US-20120079221-A1 | System And Method For Providing Flexible Storage And Retrieval Of Snapshot Archives | AMAZON TECHNOLOGIES, INC.
Location | US-20110125395-A1 | NAVIGATION SYSTEM WITH VEHICLE RETRIEVAL RESERVATION MECHANISM AND METHOD OF OPERATION THEREOF | Telenav Inc.
Location | US-20110213515-A1 | PREDICTIVE MAPPING SYSTEM FOR ANGLERS | STRATEGIC FISHING SYSTEMS
Metering | US-20120089494-A1 | Privacy-Preserving Metering | Microsoft
Metering | US-20120116937-A1 | Billing Usage in a Virtual Computing Infrastructure | NIMBULA, INC.
Operations | US-20120035773-A1 | EFFICIENT COMPUTER COOLING METHODS AND APPARATUS | Powerquest, LLC
Operations | US-20110066796-A1 | AUTONOMOUS SUBSYSTEM ARCHITECTURE | MICRON TECHNOLOGY, INC.
Virtualization | US-20110208908-A1 | METHOD AND APPARATUS FOR HIGH AVAILABILITY (HA) PROTECTION OF A RUNNING VIRTUAL MACHINE (VM) | AVAYA, INC.
Virtualization | US-20090300149-A1 | SYSTEMS AND METHODS FOR MANAGEMENT OF VIRTUAL APPLIANCES IN CLOUD-BASED NETWORK | RED HAT INC.
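For the CSV import route, a training set like the one in the table above could be written out as a two-column file.  The sketch below uses document IDs and labels from that table, but the column names (`document_id`, `label`) and the two-column layout are assumptions for illustration; the exact layout KMX expects is not described here.

```python
import csv
import io

# IDs and labels come from the training table; the column layout is assumed.
TRAINING_ROWS = [
    ("US-20110302412-A1", "Authentication"),
    ("US-20110087888-A1", "Authentication"),
    ("US-20120089494-A1", "Metering"),
    ("US-20120116937-A1", "Metering"),
    ("US-20110208908-A1", "Virtualization"),
    ("US-20090300149-A1", "Virtualization"),
]

def training_set_csv(rows):
    """Serialize (document_id, label) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["document_id", "label"])
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = training_set_csv(TRAINING_ROWS)
print(csv_text)
```

Keeping the training set in a file like this also makes it easy to review, extend, and re-use the same labels on a later project.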
 
With the training set in place, I can create a KMX free classifier.  The classifier compares each of the 3,961 documents with the training set, using the state-of-the-art KMX Support Vector Machine algorithm to judge how consistent the document is with the training examples.  Once training is complete, each document is assigned a label and is given a score for each category.  The score indicates how well the document matches the training examples in that category.  The final result looks like this in KMX:

KMX automatically classifies all documents based on the examples contained in the training set.
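To give a feel for how a Support Vector Machine turns a handful of labeled examples into per-category scores, here is a minimal sketch.  It is not the KMX algorithm, whose internals are not described in this post: it is a plain linear SVM trained with Pegasos-style subgradient steps, one classifier per category (one-vs-rest), on bag-of-words counts.  The training texts are toy stand-ins for patent abstracts, and all function names are hypothetical.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())

def dot(weights, features):
    return sum(weights.get(term, 0.0) * n for term, n in features.items())

def train_binary_svm(examples, lam=0.1, epochs=50):
    """Pegasos-style training of a linear SVM (hinge loss, L2 penalty).

    examples: list of (features, y) pairs with y in {+1, -1}.
    """
    weights = {}
    t = 0
    for _ in range(epochs):
        for features, y in examples:  # fixed order keeps the sketch deterministic
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * dot(weights, features)
            decay = 1.0 - 1.0 / t  # same as 1 - eta * lam
            for term in weights:
                weights[term] *= decay
            if margin < 1.0:  # example is inside the margin: move toward it
                for term, n in features.items():
                    weights[term] = weights.get(term, 0.0) + eta * y * n
    return weights

def train_one_vs_rest(labeled_texts):
    """Train one binary SVM per category against all other categories."""
    categories = sorted({label for _, label in labeled_texts})
    featurized = [(featurize(text), label) for text, label in labeled_texts]
    return {
        category: train_binary_svm(
            [(f, 1 if label == category else -1) for f, label in featurized]
        )
        for category in categories
    }

def score(models, text):
    """Return one score per category; higher means a closer match."""
    features = featurize(text)
    return {category: dot(w, features) for category, w in models.items()}

# Toy training texts (hypothetical), two per category.
TRAINING = [
    ("virtual machine hypervisor migration", "Virtualization"),
    ("virtual appliance hypervisor host", "Virtualization"),
    ("billing usage metering charge", "Metering"),
    ("metering usage invoice tariff", "Metering"),
    ("password credential authentication token", "Authentication"),
    ("authentication key credential login", "Authentication"),
]

models = train_one_vs_rest(TRAINING)
scores = score(models, "hypervisor for virtual machine migration")
best = max(scores, key=scores.get)
print(best, scores)
```

A production classifier would add term weighting, a bias term, and far more careful tuning, but the shape of the output is the same: one score per category for every document.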


These results can easily be imported into Excel.  Using Excel, I can look at the automatically generated classification.  Here are some of the results.  The document count in each category is shown below. 
 
User Defined Categories | Document Count
authentication | 310
business app | 1070
cloud storage | 570
location | 283
metering | 92
operations | 536
Other | 713
virtualization | 387
Grand Total | 3961

Summary of KMX Classification Results.


The “Other” category is automatically generated and contains patents that do not confidently fit into one of the user defined categories. 
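One simple way to picture how an “Other” bucket can arise from per-category scores is a confidence threshold: a document takes its best-scoring category only if that score clears a cutoff.  The rule, the threshold value, and the scores below are all assumptions made for illustration; the rule KMX actually applies is not documented in this post.

```python
def assign_label(scores, threshold=0.0, fallback="Other"):
    """Pick the best-scoring category, or the fallback if no score clears the threshold."""
    if not scores:
        return fallback
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else fallback

# Hypothetical per-category scores for two documents.
confident = {"Cloud Storage": 1.4, "Operations": -0.3, "Metering": -1.1}
ambiguous = {"Cloud Storage": -0.2, "Operations": -0.4, "Metering": -0.9}

print(assign_label(confident))
print(assign_label(ambiguous))
```

Raising the threshold moves more documents into “Other”; lowering it forces more documents into a named category at the cost of weaker matches.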

I can also compare assignees (companies) across the categories I created.  Here is a comparison of some top IT companies.  The top assignee in each category is highlighted in Red.  Other strong showings are highlighted in Yellow.

Summary of KMX Classification Results for Major Software Companies.



Now, this does not represent all of the patents that these assignees own – only those containing the term “cloud computing”.  VMware, for example, has 635 patents and applications, but only 17 are included here.  The KMX classifier we created can be saved and applied to other data sets.  For example, we could create a data set of VMware’s 635 patents and applications and apply this classifier to it. 

We see that IBM (391) and Microsoft (349) dominate the landscape.  That might be a surprise, as these companies are not always mentioned as the top players in the “cloud computing” market.  But they are clearly seeking to protect their markets.  Chicago-based Cleversafe leads in the “Cloud Storage” category.  Sunnyvale, California-based Telenav leads in the “Location” category.  Red Hat has a strong portfolio and leads in “Metering”.  Again, these names are not the first to come to mind when thinking about “cloud computing”.
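The assignee comparison above is a cross-tabulation of assignee against assigned category, which can be built from exported (assignee, category) pairs.  The sketch below shows the idea with placeholder rows; the assignee names and counts are hypothetical and are not figures from this study.

```python
from collections import Counter, defaultdict

def assignee_by_category(rows):
    """Build {assignee: Counter({category: document_count})} from classified rows."""
    table = defaultdict(Counter)
    for assignee, category in rows:
        table[assignee][category] += 1
    return table

def leaders(table):
    """Return {category: assignee with the most documents in that category}."""
    best = {}
    for assignee, counts in table.items():
        for category, n in counts.items():
            if category not in best or n > best[category][1]:
                best[category] = (assignee, n)
    return {category: assignee for category, (assignee, _) in best.items()}

# Placeholder rows (hypothetical assignees, not data from the study).
ROWS = [
    ("Assignee A", "cloud storage"),
    ("Assignee A", "cloud storage"),
    ("Assignee B", "cloud storage"),
    ("Assignee B", "location"),
    ("Assignee B", "location"),
    ("Assignee A", "location"),
]

table = assignee_by_category(ROWS)
print(leaders(table))
```

The same table, restricted to a single assignee's full portfolio, is what applying the saved classifier to another data set would produce.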

Conclusion

Using KMX, analysts can get unique insights into large patent collections by creating and applying their own classification system.  The process is fast and easy – producing better insight than relying on US or IPC classifications alone.  If you would like more information, please contact us at info@ificlaims.com.