Overview

This unit will give students a solid foundation in contemporary data science best practices using Python. It offers a hands-on introduction to programming paradigms and fundamental data analysis techniques. Through examples involving real-world data, students will learn data cleaning and validation, data transformation, algorithm design, text analytics, and data visualisation. Students will become familiar with important Python software modules such as Pandas, Matplotlib, and the Natural Language Toolkit (NLTK).

Requisites

Teaching Periods

HE Block 5

  • Location: Hawthorn
  • Start and end dates: 08-July-2024 to 18-August-2024
  • Last self-enrolment date: 08-July-2024
  • Census date: 19-July-2024
  • Last withdraw without fail date: 02-August-2024
  • Results released date: 24-September-2024

Learning outcomes

Students who successfully complete this unit will be able to:

  • Apply coherent and advanced knowledge of how to read, clean, and manipulate data sets.
  • Critically evaluate existing toolkits and construct custom algorithms when necessary.
  • Identify research questions and create project outlines.
  • Analyse data sets using basic statistics, visualisations, regression, and topic modelling.

Teaching methods

Hawthorn

Type                              Hours per week   Number of weeks   Total (number of hours)
On-campus: Class                  6.40             5 weeks           32
Unspecified Activities: Various   9.83             12 weeks          118
TOTAL                                                                150

Assessment

Type                            Task               Weighting   ULOs
Assignment and Presentation 1   Individual/Group   30 - 50%    1,2,3,4
Final Workbook                  Individual         10 - 20%
Final Workbook                  Individual         10 - 20%
Final Workbook                  Individual         10 - 20%    1,2,4
Final Workbook                  Individual         10 - 20%    1,3,4
Final Workbook                  Individual         10 - 20%

Content

  • Basic programming theory
  • Data science best practices
  • Data structures, access and usage
  • Data cleaning and validation
  • Data visualisation
  • How to validate results
  • Working with text data (text analysis)
  • Data science tools

Study resources

Reading materials

A list of reading materials and/or required textbooks will be available in the Unit Outline on Canvas.