It is widely anticipated that in the future, managers will increasingly use analytics to make business decisions. Concurrently, the number of tools and applications of analytics to turn data into insights is growing day by day. This course introduces students to this toolset in a gradual manner and with business applications in mind.
Students will gain an understanding of the basic methods of business analytics by working with different tools and data sets. The course will emphasize applications over the mathematics of the methods. Students will apply the methods using Rattle, a menu-driven software tool.
The course will begin with what is familiar to many business managers and those who have taken the first course in this specialization. The first set of tools will explore data description, statistical inference, and regression. We will briefly extend these concepts to other statistical methods used for predicting consumer behavior and forecasting. In the next segment, students will learn about tools for identifying the important features in a dataset, which can reduce its complexity and help explain behavior. The instructors will then explain data mining concepts that are used to make predictions and classifications. The final segment will be devoted to understanding why, and how, we can learn from data.
Successful use of business analytics and data mining requires both an understanding of the business context where value is to be captured and an understanding of exactly what the methods can do. Some examples include:
Statistical tools – Multiple Linear Regression and Logistic Regression are used to construct models for estimation and prediction.
Business Example-1: A credit score should not be an arbitrary judgement of creditworthiness; a predictive statistical model built on prior data can help predict repayment behavior. Similar applications abound in every area of business, such as predicting which orders are most likely to be delayed based on recent deliveries from suppliers, predicting which assets are most likely to increase in value, and predicting the choice behavior of airline customers.
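As a minimal sketch of the idea in Business Example-1, the snippet below fits a one-variable logistic regression by plain gradient descent. The data, the `debt_ratio` predictor, and all function names are made up for illustration; they are assumptions, not course material, and a real credit model would use many predictors and a statistical package.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.5, epochs=5000):
    """Fit p(y=1) = sigmoid(w*x + b) by gradient descent on the average log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every record under the current weights.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b

# Hypothetical history: debt-to-income ratio, and whether the loan was repaid (1) or not (0).
debt_ratio = [0.10, 0.20, 0.25, 0.30, 0.45, 0.50, 0.60, 0.70, 0.80, 0.90]
repaid = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

w, b = fit_logistic(debt_ratio, repaid)

def repayment_probability(ratio):
    """Estimated probability that an applicant with this debt ratio repays."""
    return sigmoid(w * ratio + b)

print(round(repayment_probability(0.2), 2), round(repayment_probability(0.8), 2))
```

In this toy data the fitted weight `w` comes out negative, so a higher debt ratio lowers the estimated probability of repayment.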
Forecasting Time Series – This segment is intended to help managers anticipate future events more accurately, and thereby manage uncertainty better, by using effective forecasting techniques.
Business Example-2: In making decisions under uncertainty, where time and resources are directly related, forecasting capability becomes critical. Predicting aggregate demand, responses to promotional offers, property values, wages, cost of inputs, and other variables that affect the business all fall within the scope of forecasting.
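One simple forecasting technique is simple exponential smoothing, sketched below. It is only one possible method and is not necessarily the one taught in the course. The demand figures, the function name, and the smoothing weight `alpha` are assumed values chosen for illustration.

```python
def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead forecast: a weighted average that favors recent observations."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # start from the first observation
    for value in series[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical monthly demand, in units.
monthly_demand = [100, 110, 105, 120]
next_month = exponential_smoothing_forecast(monthly_demand, alpha=0.5)
print(next_month)  # 112.5
```

A larger `alpha` reacts faster to recent changes in demand, while a smaller `alpha` gives a smoother, slower-moving forecast.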
Data Exploration and Dimension Reduction – Understanding and conquering the curse of dimensionality is important for making sense of data. This segment helps managers appreciate that the richness of data brings with it the challenge of making sense of relationships between data points and exploiting these relationships to construct simpler yet powerful models.
Business Example-3: There has been a tremendous increase in the ways data is generated in industry, via sensors, digital platforms, user-generated content, and so on. For example, sensors continuously record data and store it for later analysis. The way data gets captured can introduce a lot of redundancy. With more variables comes more trouble! Additional sources may contribute very little (or no) incremental information. This is the problem of unwanted high dimensionality. Data exploration and dimension reduction address this problem by examining the data and extracting fewer dimensions that convey similar information concisely.
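The redundancy described in Business Example-3 can be illustrated with a simple correlation filter, which keeps a variable only if it is not almost perfectly correlated with one already kept. This is a basic form of dimension reduction, not a full method such as principal component analysis. The sensor readings, column names, and the 0.95 threshold are assumptions made for this sketch.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def drop_redundant(columns, threshold=0.95):
    """Keep a column only if it is not highly correlated with a column already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical sensor log: the Fahrenheit column repeats the Celsius information.
readings = {
    "temp_c": [20.0, 22.0, 25.0, 23.0, 21.0],
    "temp_f": [68.0, 71.6, 77.0, 73.4, 69.8],
    "humidity": [50.0, 40.0, 55.0, 52.0, 38.0],
}
kept_columns = drop_redundant(readings)
print(kept_columns)  # ['temp_c', 'humidity']
```

Here `temp_f` is dropped because it carries the same information as `temp_c`, while `humidity` is kept because it adds something new.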
Clustering Analysis – Data becomes more manageable when interrelationships can be easily understood. The goal of clustering is to segment data into a set of homogeneous clusters by mining relationships among records to identify similar groups. Use cases covered in this part of the course include market segmentation analysis and extracting insights for customer intelligence.
Business Example-4: Understanding the boundaries among clusters or groups for the purpose of generating insight is useful in a wide variety of business applications, from customized marketing to industry analysis. For instance, in retail businesses, data clustering helps with understanding customer shopping behavior, developing sales campaigns, and improving customer retention. Other business use cases include image segmentation, web page grouping, market segmentation, and information retrieval.
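As a rough illustration of segmenting customers into homogeneous groups, the sketch below runs a bare-bones k-means procedure. The customer data (store visits per month, average basket value), the starting centroids, and the function names are all hypothetical. K-means is one common clustering method among several.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and updating centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical customers: (store visits per month, average basket value).
customers = [(1, 10), (2, 12), (1, 11), (8, 50), (9, 55), (10, 52)]
centroids, clusters = kmeans(customers, centroids=[(1, 10), (8, 50)])
print(clusters)
```

With this toy data the procedure separates occasional small-basket shoppers from frequent large-basket shoppers. In practice, variables are usually rescaled first so that no single variable dominates the distance.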
Classification Algorithms and Prediction – Data mining techniques can be used to augment classification and prediction toolkits. Prior observations whose classification is known are used to develop rules, which are then applied to new records with unknown classification to predict their class (e.g., whether a loan applicant will repay on time (class-I), repay late (class-II), or declare bankruptcy (class-III)).
Business Example-5: A financial service provider is interested in knowing which customers are likely to default on loan payments. The provider is also interested in understanding which customer characteristics may explain loan payment behavior. A marketing director is interested in choosing the set of customers or prospects who are most likely to respond to a direct mail campaign. The same director is also interested in knowing which consumer characteristics are most likely to explain responsiveness to the campaign. Classification techniques help answer such questions.
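The idea of learning from records with known classes and applying that to a new record can be sketched with a k-nearest-neighbors classifier, one simple classification technique among many. The applicant features (debt ratio, number of past late payments), the labels, and the training records below are invented for illustration.

```python
from collections import Counter

def squared_distance(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def knn_predict(training, query, k=3):
    """Predict the majority class among the k training records closest to the query."""
    neighbors = sorted(training, key=lambda row: squared_distance(row[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical past applicants: (debt ratio, past late payments) and known outcome.
past_loans = [
    ((0.10, 0), "on_time"), ((0.20, 0), "on_time"), ((0.15, 1), "on_time"),
    ((0.50, 3), "late"), ((0.55, 4), "late"), ((0.60, 3), "late"),
    ((0.90, 8), "bankrupt"), ((0.95, 9), "bankrupt"), ((0.85, 7), "bankrupt"),
]

low_risk = knn_predict(past_loans, (0.12, 0))
high_risk = knn_predict(past_loans, (0.88, 8))
print(low_risk, high_risk)  # on_time bankrupt
```

Because the features here are not rescaled, the late-payment count dominates the distance. A practical model would standardize the features and validate the choice of `k` on held-out data.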
Association rules, recommendation systems and market basket analysis – We study tools for recognizing which products to recommend and for identifying cross-sell or upsell opportunities.
Business Example-6: Businesses need to identify shopping patterns in order to increase the size of a sale. The goal is to make the consumer experience more intuitive and less overwhelming by predicting what individual users would enjoy and targeting the right customers with the right products. More specifically, rather than “what is the relationship between advertising and sales”, we are interested in knowing “what specific advertisement, or recommended product, should be shown to a given online/offline customer at this moment?”
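Market basket analysis usually scores a candidate rule such as "customers who buy bread also buy butter" with three standard measures: support, confidence, and lift. The sketch below computes them for made-up baskets. The product names and the helper function are assumptions for illustration only.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent.

    Assumes both item sets appear in at least one transaction.
    """
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_c = sum(1 for t in transactions if consequent <= t)
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = count_both / n              # share of baskets containing both item sets
    confidence = count_both / count_a     # share of antecedent baskets that also hold the consequent
    lift = (count_both * n) / (count_a * count_c)  # confidence relative to the consequent's base rate
    return support, confidence, lift

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"milk"},
    {"milk", "butter"},
    {"bread", "butter", "milk"},
]
support, confidence, lift = rule_metrics(baskets, {"bread"}, {"butter"})
print(support, confidence, lift)  # 0.6 1.0 1.25
```

A lift above 1 suggests that the two products are bought together more often than chance alone would predict, which makes the pair a candidate for a cross-sell recommendation.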