BUS 41201 is a course about data mining: the analysis, exploration, and simplification of large high-dimensional datasets. Students will learn how to model and interpret complicated data and become adept at building powerful models for prediction and classification.
Techniques covered include an advanced overview of linear regression, model choice and false discovery rates, binary and multinomial regression, classification, decision trees, partial least squares and principal components, factor analysis, and clustering, including K-means. We learn both the underlying concepts and practical computational skills.
Heavy emphasis is placed on the analysis of actual datasets and on the development of application-specific methodology. Among other examples, we will consider consumer database analysis, online behavior tracking, network analysis, and text mining.