Semester 2
Subject Description
Computing techniques and data mining methods are indispensable in modern statistical research and data science applications, which often involve "Big Data" problems. This subject introduces a number of recently developed methods and applications in computational statistics and data science that scale to large datasets and high-performance computing environments. The data mining methods covered include general model diagnostic and assessment techniques, kernel and local polynomial nonparametric regression, basis expansion and nonparametric spline regression, and generalised additive models. Important statistical computing algorithms and techniques used in data science are explained in detail. These include unsupervised learning of meaningful components, bootstrap resampling and inference, cross-validation, the EM algorithm and variational approximation, and Markov chain Monte Carlo methods, including adaptive rejection and squeeze sampling, sequential importance sampling, slice sampling, Gibbs samplers and the Metropolis-Hastings algorithm.