What is “cloud computing”?
Is it a technology, a business model, industry hype? In the popular press, one can easily find many different definitions and opinions.
We can get a useful answer from Wikipedia. It divides cloud computing into three categories:
- Infrastructure as a Service (IaaS). This is renting the use of computers and storage on a pay-as-you-go basis.
- Platform as a Service (PaaS). This is IaaS but with system software included.
- Software as a Service (SaaS). This is renting application software on a pay-as-you-go or subscription basis.
These categories are helpful, but when we look at all the businesses claiming to provide “cloud computing” solutions, they fall short. Can we look at the intellectual property landscape for cloud computing and come up with a more useful answer?
Defining “cloud computing” by looking at 3,961 Patents and Applications
I have used Treparel’s KMX patent analytics tool to search the IFI CLAIMS Patent Services Claims Direct patent database to try to answer that question. Searching for the term “cloud computing” in all US applications and grants yields 3,961 documents. Reading all of these documents is impractical. Can we identify the key concepts without spending many hours reading?
Classifying “Cloud Computing” Patents Using IPC Codes
Patent analysts commonly subdivide large collections using the US or International (IPC) patent classification coding schemes. These codes are assigned to patents and applications by the patent offices and are useful for precision searching. When looking at the first IPC code that appears in each document, there are 78 four-character subclass codes in the collection of 3,961 documents. Here are the top 10 IPC subclasses.
|IPC Subclass||Subclass Definition||# Documents|
|A63F||Sports; Games; Amusements – Card, Board or Roulette Games||34|
|G01C||Measuring; Testing – Measuring Distances||54|
|G06F||Computing; Calculating; Counting – Electric Digital Data Processing||2542|
|G06K||Computing; Calculating; Counting – Recognition & Presentation of Data||74|
|G06Q||Computing; Calculating; Counting – Data Processing Systems or Methods, Specially Adapted for Administrative, Commercial, Financial, Managerial, Supervisory or Forecasting Purposes||467|
|G06T||Computing; Calculating; Counting – Image Data Processing||30|
|H04L||Electric Communication Technique – Transmission of Digital Information||203|
|H04M||Electric Communication Technique – Telephonic Communication||30|
|H04N||Electric Communication Technique – Pictorial Communication||84|
|H04W||Electric Communication Technique – Wireless Communication Networks||74|
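
A tally like the one in this table could be produced outside any particular tool with a few lines of Python. The sketch below is only an illustration: the record layout (a dict with an `ipc` list) and the sample documents are assumptions, not the Claims Direct export format or real data from the 3,961-document collection.

```python
from collections import Counter


def first_subclass(ipc_codes):
    """Return the four-character subclass of the first IPC code, or None."""
    if not ipc_codes:
        return None
    # An IPC code such as "G06F 9/455" starts with section (G), class (06)
    # and subclass letter (F); the first four characters are the subclass.
    code = ipc_codes[0].replace(" ", "").upper()
    return code[:4] if len(code) >= 4 else None


def tally_subclasses(documents):
    """Count documents by the subclass of their first-listed IPC code."""
    counts = Counter()
    for doc in documents:
        subclass = first_subclass(doc.get("ipc", []))
        if subclass is not None:
            counts[subclass] += 1
    return counts


# Hypothetical records for illustration only; not taken from the data set.
sample_docs = [
    {"id": "doc-1", "ipc": ["G06F 9/455", "H04L 29/08"]},
    {"id": "doc-2", "ipc": ["G06Q 30/02"]},
    {"id": "doc-3", "ipc": ["G06F 17/30"]},
    {"id": "doc-4", "ipc": []},  # documents without a code are skipped
]

# Print the ten most frequent subclasses, largest first.
for subclass, count in tally_subclasses(sample_docs).most_common(10):
    print(subclass, count)
```

Only the first-listed code is counted, which matches the approach described above; counting every code on a document would give different totals.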
The graph below shows that the “G06F - Digital Data Processing” subclass dominates, with the “G06Q - Data Processing Systems” subclass, which focuses on software applications, a distant second.
I can drill deeper into the IPC codes, but the data quickly fragments and becomes very difficult to summarize. It is interesting that patents around gaming (A63F) and navigation (G01C) appear in the top 10 list – these are not what I would normally associate with “cloud computing”.
While this is helpful, IPC codes do not always match the technology or business terminology that business or legal analysts commonly use. It would be great if I could classify a large collection of patents using a custom classification scheme designed specifically for a particular engagement or client.
Using the KMX Classifier
With KMX, you can do just that! KMX can take the collection of 3,961 documents, cluster them based on similar content, present a landscape visualization, and allow me to create my own custom classification system.
Here is what one section of the “cloud computing” patent landscape looks like, as rendered by KMX.
Areas of the landscape are color coded as shown below. The descriptions are taken from the labels generated automatically when I color code an area of the map.
With KMX, I can easily search the landscape and examine individual documents. I can quickly identify the following types of documents in the landscape:
Virtualization (Green) – Patents around using software to create virtual servers from a group of real physical servers. For example, VMware has patents in this area. While virtualization is not “cloud computing”, it is an important enabling technology for building multi-tenant, scalable cloud infrastructure. These would fall into the Infrastructure as a Service (IaaS) category discussed above.
Cloud Services (Red) – Patents around cloud based infrastructure, but not specifically about virtualization. These would also fall into the general “IaaS” category.
Application Platforms (Blue) – Patents around supporting applications in a secure, multi-tenant environment. These could be classified as “Platform as a Service” (PaaS).
Operations (Yellow) – Patents around operating a cloud data center. There are a lot of these and they don’t fall neatly into any of the “IaaS”, “PaaS” or “SaaS” categories.
Applications & Content Storage (Green) – Patents around applications running in a cloud data center. While storage is a topic that appears in other places in the landscape (for example, the lower right), storing and managing content is an important part of delivering a cloud-based application. These are the “Software as a Service” (SaaS) documents.
By looking at some individual documents, I can refine the categories. Specifically, I create seven categories that I will use to classify the 3,961 documents. The seven categories are: “Authentication”, “Business Apps”, “Cloud Storage”, “Location”, “Metering”, “Operations” and “Virtualization”. “Location” and “Metering” were chosen because there seem to be a lot of documents around those topics.
The classification process is simple once I have defined my classification scheme. Using the landscape map, the KMX search function, and a review of individual documents, I select a training set of documents. I identify about six documents in each area and label them using the interactive tool shown below. Alternatively, I can import a list of document IDs and labels in CSV format.
The table below shows two training document examples from each category. It is not difficult to find examples once you have reviewed the landscape map.
|User Defined Classification||Sample document used for training||Document Title||Assignee|
|Authentication||US-20110302412-A1||PSEUDONYMOUS PUBLIC KEYS BASED AUTHENTICATION||NORTHWESTERN UNIVERSITY|
|Authentication||US-20110087888-A1||AUTHENTICATION USING A WEAK HASH OF USER CREDENTIALS|
|Business app||US-20100223113-A1||SYSTEMS FOR EMBEDDING ADVERTISEMENTS OFFERING AVAILABLE, DYNAMIC-CONTENT-RELEVANT DOMAIN NAMES IN ONLINE VIDEO||Go Daddy|
|Business app||US-20110040805-A1||TECHNIQUES FOR PARALLEL BUSINESS INTELLIGENCE EVALUATION AND MANAGEMENT||CREDIT SUISSE AG|
|Cloud Storage||US-20110055474-A1||DISPERSED STORAGE PROCESSING UNIT AND METHODS WITH GEOGRAPHICAL DIVERSITY FOR USE IN A DISPERSED STORAGE SYSTEM||CLEVERSAFE INC|
|Cloud Storage||US-20120079221-A1||System And Method For Providing Flexible Storage And Retrieval Of Snapshot Archives||AMAZON TECHNOLOGIES, INC.|
|Location||US-20110125395-A1||NAVIGATION SYSTEM WITH VEHICLE RETRIEVAL RESERVATION MECHANISM AND METHOD OF OPERATION THEREOF||Telenav Inc.|
|Location||US-20110213515-A1||PREDICTIVE MAPPING SYSTEM FOR ANGLERS||STRATEGIC FISHING SYSTEMS|
|Metering||US-20120116937-A1||Billing Usage in a Virtual Computing Infrastructure||NIMBULA, INC.|
|Operations||US-20120035773-A1||EFFICIENT COMPUTER COOLING METHODS AND APPARATUS||Powerquest, LLC|
|Operations||US-20110066796-A1||AUTONOMOUS SUBSYSTEM ARCHITECTURE||MICRON TECHNOLOGY, INC.|
|Virtualization||US-20110208908-A1||METHOD AND APPARATUS FOR HIGH AVAILABILITY (HA) PROTECTION OF A RUNNING VIRTUAL MACHINE (VM)||AVAYA, INC.|
|Virtualization||US-20090300149-A1||SYSTEMS AND METHODS FOR MANAGEMENT OF VIRTUAL APPLIANCES IN CLOUD-BASED NETWORK||RED HAT INC.|
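
If the training set is kept as a CSV of document IDs and labels, it is easy to sanity-check before importing it. The sketch below assumes a two-column layout with a `doc_id,label` header; that layout is a guess for illustration, since the exact file format KMX expects is not described here. The document IDs are copied from the table above.

```python
import csv
import io
from collections import defaultdict

# Assumed layout: a header row, then one document ID and label per line.
TRAINING_CSV = """doc_id,label
US-20110302412-A1,Authentication
US-20110087888-A1,Authentication
US-20110055474-A1,Cloud Storage
US-20120079221-A1,Cloud Storage
US-20120116937-A1,Metering
"""


def load_training_set(csv_text):
    """Group training document IDs by their label."""
    by_label = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_label[row["label"].strip()].append(row["doc_id"].strip())
    return dict(by_label)


def underrepresented(training, minimum):
    """Return the labels that have fewer than `minimum` examples."""
    return sorted(label for label, ids in training.items() if len(ids) < minimum)


training = load_training_set(TRAINING_CSV)
for label, ids in training.items():
    print(label, len(ids))
print("Below minimum:", underrepresented(training, minimum=2))
```

A check like this flags categories with too few examples before any classifier is trained on them.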
With the training set in place, I can create a KMX free classifier. The result is a label attached to each of the 3,961 documents, based on each document's consistency with the training set. Consistency is determined by the state-of-the-art KMX Support Vector Machine algorithm. Once training is complete, each document is assigned a label and is given a score for each category. The score indicates how well the document matches the training examples in that category. The final result looks like this in KMX:
These results can easily be imported into Excel. Using Excel, I can look at the automatically generated classification. Here are some of the results. The document count in each category is shown below.
|User Defined Categories||Document Count|
The “Other” category is automatically generated and contains patents that do not confidently fit into one of the user defined categories.
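
One way to picture how per-category scores could turn into a single label, including the “Other” fallback, is the sketch below. It is not KMX's actual rule: the `assign_label` function, the threshold of 0.0, and the sample scores are all assumptions made for illustration. With an SVM, a signed score above zero conventionally means the document falls on the positive side of that category's decision boundary, which is why zero is used as the default cut-off here.

```python
def assign_label(scores, threshold=0.0):
    """Pick the best-scoring category, or "Other" if no score clears the threshold."""
    if not scores:
        return "Other"
    # Category with the highest score for this document.
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else "Other"


# Hypothetical per-category scores for two documents (illustration only).
confident = {"Virtualization": 1.3, "Operations": -0.2, "Metering": -0.9}
uncertain = {"Virtualization": -0.4, "Operations": -0.1, "Metering": -0.7}

print(assign_label(confident))  # Virtualization
print(assign_label(uncertain))  # Other
```

Raising the threshold moves more borderline documents into “Other”; lowering it forces more documents into a user-defined category.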
I can also compare assignees (companies) across the categories I created. Here is a comparison of some top IT companies. The top assignee in each category is highlighted in Red. Other strong showings are highlighted in Yellow.
Now, this does not represent all of the patents that these assignees own – only those containing the term “cloud computing”. VMware, for example, has 635 patents and applications, but only 17 are included here. The KMX classifier we created can be saved and applied to other data sets. For example, we could create a data set of VMware’s 635 patents and apply this classifier to it.
We see that IBM (391) and Microsoft (349) dominate the landscape. That might be a surprise, as these companies are not always mentioned as the top players in the “cloud computing” market. But they are clearly seeking to protect their markets. Chicago-based Cleversafe leads in the “Cloud Storage” category. Sunnyvale, California-based Telenav leads in the “Location” category. Red Hat has a strong portfolio and leads in “Metering”. Again, these names are not the first to come to mind when thinking about “cloud computing”.
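
The assignee-by-category comparison can also be reproduced from any export of labeled documents. The sketch below assumes the export has been reduced to (assignee, category) pairs. The handful of pairs used here are copied from the training table earlier in this article purely to show the mechanics; they do not reproduce the counts quoted above, which come from the full 3,961-document result.

```python
from collections import Counter, defaultdict

# (assignee, category) pairs copied from the training table, for illustration.
labeled = [
    ("CLEVERSAFE INC", "Cloud Storage"),
    ("AMAZON TECHNOLOGIES, INC.", "Cloud Storage"),
    ("Telenav Inc.", "Location"),
    ("NIMBULA, INC.", "Metering"),
    ("RED HAT INC.", "Virtualization"),
    ("AVAYA, INC.", "Virtualization"),
]


def cross_tab(pairs):
    """Count documents per assignee within each category."""
    table = defaultdict(Counter)
    for assignee, category in pairs:
        table[category][assignee] += 1
    return table


def leaders(table):
    """Return the (assignee, count) with the most documents in each category."""
    return {category: counts.most_common(1)[0] for category, counts in table.items()}


table = cross_tab(labeled)
for category, (assignee, count) in leaders(table).items():
    print(category, assignee, count)
```

Run against the full labeled export, the same cross-tabulation gives the per-category leaders highlighted in the comparison.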