Tony Trippe of Patinformatics LLC is writing an excellent series of technical blog posts discussing patent clustering, visualization and especially classification.
Part 1 of the projected three-part series describes the difference between clustering and classification and discusses the most popular technical approaches to each.
Part 2 describes binary classification and provides a detailed example based on fitness monitors. In his example, Tony uses CLAIMS Direct to access clean patent data and KMX to illustrate how to create a binary classifier.
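For readers new to the idea, a binary classifier can be sketched in a few lines of Python. The sketch below is only an illustration of the general technique and is not Tony's workflow: it does not use CLAIMS Direct or KMX, it swaps in a simple naive Bayes model with add-one smoothing, and the training snippets, labels, and function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy training set (not real patent data):
# label 1 = relevant to fitness monitors, label 0 = not relevant.
TRAIN = [
    ("wearable heart rate sensor for tracking exercise", 1),
    ("wrist worn activity tracker counting steps", 1),
    ("pedometer device monitoring user activity and heart rate", 1),
    ("engine fuel injection valve assembly", 0),
    ("semiconductor memory cell array", 0),
    ("fuel pump for combustion engine", 0),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per class and documents per class."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return 1 if the text scores higher for the relevant class, else 0."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        # Start from the log prior of the class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed log likelihood of the word.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return 1 if scores[1] > scores[0] else 0


model = train(TRAIN)
print(predict(model, "activity tracker with heart rate sensor"))
print(predict(model, "fuel injection engine"))
```

The point of the sketch is the shape of the task: a set of documents labeled as in or out of a category is used to train a model, which then assigns one of those two labels to new documents. A real patent classifier would be trained on far more data and cleaner text than this toy set.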
Tony Trippe of Patinformatics Describes Patent Classification