Data Mining

Subject MAST90028 (2012)

Note: This is an archived Handbook entry from 2012.

Credit Points: 12.50
Level: 9 (Graduate/Postgraduate)
Dates & Locations:

This subject is not offered in 2012.

Time Commitment: Contact Hours: 36 hours comprising one 2-hour lecture per week and one 1-hour practical class per week.
Total Time Commitment: Not available
Prerequisites:

None

Corequisites:

None

Recommended Background Knowledge:

It is recommended that students complete a second-year statistics subject equivalent to the following, and have had some exposure to computer packages:

Subject
Study Period Commencement: Semester 2
Credit Points: 12.50
Non Allowed Subjects:

None

Core Participation Requirements:

For the purposes of considering requests for Reasonable Adjustments under the Disability Standards for Education (Cwth 2005), and Students Experiencing Academic Disadvantage Policy, academic requirements for this subject are articulated in the Subject Description, Subject Objectives, Generic Skills and Assessment Requirements for this entry.

The University is dedicated to providing support to those with special requirements. Further details on the disability support scheme can be found at the Disability Liaison Unit website: http://www.services.unimelb.edu.au/disability/

Contact

Melbourne Graduate School of Science
Faculty of Science
The University of Melbourne
Victoria 3010

Tel: +61 3 8344 6128
Fax: +61 3 8344 3351

Web: http://graduate.science.unimelb.edu.au/

Subject Overview:

Data Mining refers to the management and analysis of large data sets.

Data Mining became possible with the advent of large-scale data collection and the computing power necessary to process it. It involves all of the following steps:

  1. Data Warehousing.
  2. Data Cleaning.
  3. Data Description and Visualisation.
  4. Data Analysis and Interpretation.

This course deals only with step 4 of the Data Mining process: data analysis and interpretation. It considers techniques for Rule Finding, Classification, Regression and Clustering. The themes that run through the course are:

  1. Model fitting and selection, and how to avoid overfitting.
  2. Scalable algorithms that can be used with very large data sets.
  3. How to accommodate high-dimensional data.
  4. Actionability and interpretability of models.

Objectives:

After completing this subject, students should:

  • understand many of the techniques used to analyse large data sets;
  • have acquired skills and techniques widely used in modern data mining; and
  • have gained the ability to pursue further studies in this and related areas.
Assessment:

Up to 40 pages of written assignments (20%: two assignments worth 10% each, due mid and late in semester), and a 3-hour written examination (80%, in the examination period).

Prescribed Texts:

None

Recommended Texts:

TBA.

Breadth Options:

This subject is not available as a breadth subject.

Fees Information: Subject EFTSL, Level, Discipline & Census Date
Generic Skills:

Upon completion of this subject, students should have developed:

  • problem-solving skills (especially through tutorial exercises and assignments) including engaging with unfamiliar problems and identifying relevant strategies;
  • analytical skills including the ability to construct and express logical arguments and to work in abstract or general terms to increase the clarity and efficiency of the analysis; and
  • the ability to work in a team, through interactions with other students.
