By Kevin Gray, Marketing Science and Analytics
Machine learning gets a lot of buzz these days, usually in connection with big data and artificial intelligence (AI). But what exactly is it? Broadly speaking, machine learners are computer algorithms designed for pattern recognition, curve fitting, classification and clustering. The "learning" in the term refers to their ability to learn from data. Machine learning is widely used in data mining and predictive analytics, which some commentators loosely call big data. It is also used for consumer survey analytics; it is not restricted to high-volume, high-velocity or unstructured data and need not have any connection with AI.
In fact, many methods marketing researchers are well acquainted with, such as regression and k-means clustering, are also frequently called machine learners. For examples, see Apache Spark's machine learning library or the books I cite in the last section of this article. To keep things simple, I will refer to well-known statistical techniques like regression and factor analysis as older machine learners and methods such as artificial neural networks as newer machine learners since they are generally less familiar to marketing researchers.
Machine learning is used in many fields, including seismology, medical research, computer network security and human resource management. The following are some of the more common ways machine learners of any vintage are used in marketing:
- predicting how likely a customer is to buy a certain product;
- estimating how much a customer will spend in a product category;
- identifying relatively homogeneous consumer groups (consumer segmentation);
- finding key drivers (e.g., which service elements best predict customer satisfaction);
- modeling the marketing mix (identifying the marketing activities with the biggest payoff);
- building recommender systems (e.g., people who bought John Grisham also bought Scott Turow);
- targeting ads to individuals; and
- analyzing social media.
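To make the recommender idea concrete, here is a minimal sketch in Python that simply counts how often two items appear in the same purchase basket. The baskets, the function names (`co_purchase_counts`, `also_bought`) and the ranking rule are all assumptions made for illustration; production recommender systems are considerably more sophisticated.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of authors per customer (made-up data).
baskets = [
    {"John Grisham", "Scott Turow"},
    {"John Grisham", "Scott Turow", "Michael Connelly"},
    {"John Grisham", "Michael Connelly"},
    {"Scott Turow", "John Grisham"},
    {"Jane Austen"},
]

def co_purchase_counts(baskets):
    """Count how many baskets contain each ordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts

def also_bought(item, baskets, top_n=2):
    """Return the items most often bought together with `item`."""
    counts = co_purchase_counts(baskets)
    partners = {b: n for (a, b), n in counts.items() if a == item}
    # Sort by count (descending), then by name, so ties resolve deterministically.
    ranked = sorted(partners.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

recommendations = also_bought("John Grisham", baskets)
print(recommendations)
```

In this toy data, Scott Turow is the author most often bought alongside John Grisham, so he would be the first recommendation for a Grisham buyer.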
Types of machine learners
There are hundreds of machine learners, and many serve multiple purposes. Some are extremely complex, others are ingeniously simple, and they can be categorized in numerous ways. Here are a few examples:
- Supervised methods are used when there is a dependent variable. Regression and discriminant analysis are supervised methods. The dependent variable is often called a label by data scientists.
- Supervised methods are further subdivided by whether the label is a category, such as purchaser/non-purchaser, or a quantity, such as amount spent. Discriminant analysis is appropriate in the first case, which statisticians call a classification problem, and regression analysis in the second, known as a regression problem.
- Unsupervised methods are used when there are no dependent variables, as in clustering and factor analysis.
- Time-series methods such as ARMAX and GARCH are needed when the data have been collected at many points in time, for instance weekly or daily sales figures. Marketing researchers are generally better acquainted with cross-sectional research, such as one-time consumer surveys. Regression, discriminant analysis and factor analysis are techniques commonly used to analyze cross-sectional data.
- Association pattern mining is used to rationalize shelf placement and to build recommender systems.
- There are also many specialized methods for text analytics, social network analysis, Web analytics, mining streaming data and anomaly detection (e.g., for detecting credit card fraud).
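The supervised/unsupervised distinction can be illustrated with a small Python sketch. The data are invented, and the two functions (`fit_line`, `two_means`) are hypothetical helpers written for this example: the first fits a simple regression of amount spent (the label) on store visits, while the second runs a bare-bones k-means with two clusters on the visits alone, with no label at all.

```python
from statistics import mean

# Made-up data: store visits per month and amount spent for eight customers.
visits = [1, 2, 2, 3, 8, 9, 10, 11]
spend = [12.0, 19.0, 22.0, 31.0, 79.0, 92.0, 101.0, 108.0]

def fit_line(x, y):
    """Supervised: ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def two_means(values, iterations=20):
    """Unsupervised: 1-D k-means with k=2, started from the min and max."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1 for v in values]
    return centers, labels

# Supervised: spend is the label, visits is the predictor.
intercept, slope = fit_line(visits, spend)

# Unsupervised: no label; the algorithm only groups similar customers.
centers, labels = two_means(visits)

print(f"spend = {intercept:.1f} + {slope:.1f} * visits")
print("cluster centers:", centers, "assignments:", labels)
```

The regression needs the `spend` column to learn from, whereas the clustering step sees only `visits` and separates light from heavy visitors on its own. Swapping the numeric label for a purchaser/non-purchaser flag would turn the first task into a classification problem.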