ML Python session 3 - Unsupervised clustering and data cleaning

In-Person / Online

This workshop will focus on unsupervised machine learning and data cleaning. In unsupervised machine learning, an algorithm looks for structure in unlabeled data, for example by grouping similar data points into clusters. The workshop will scratch the surface of this side of machine learning, introducing unsupervised learning through the k-means and DBSCAN algorithms. The session will also explore the data cleaning step of the machine learning pipeline in more detail.

By the end of the workshop, participants will be able to:
- Differentiate between supervised and unsupervised learning;
- Given a scaffolded environment and curated dataset, train a DBSCAN model and describe how this algorithm works at a high level;
- Articulate the steps in data cleaning, along with common issues in incomplete or faulty datasets and their solutions.
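
As a rough illustration of the second and third outcomes, here is a minimal sketch in plain Python of one data cleaning step (dropping rows with missing values) followed by DBSCAN. It is not the workshop's own material: the toy data, the function names `clean`, `neighbours` and `dbscan`, and the values chosen for `eps` and `min_samples` are all assumptions made for this example. In practice a library implementation such as scikit-learn's `DBSCAN` would likely be used instead.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def clean(rows):
    """Drop rows that contain a missing (None) or non-finite value."""
    return [
        row for row in rows
        if all(v is not None and math.isfinite(v) for v in row)
    ]


def neighbours(points, i, eps):
    """Indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(points, i, eps)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(points, j, eps)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # j is also a core point, keep expanding
    return labels


# Hypothetical toy data: two tight groups, one isolated point, two bad rows.
raw = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (8.0, 8.0), (8.1, 7.9), (7.9, 8.2),
    (4.5, None),           # missing value
    (float("nan"), 2.0),   # not a number
    (4.5, 4.5),            # far from everything else
]

points = clean(raw)
labels = dbscan(points, eps=0.5, min_samples=3)
print(labels)  # → [0, 0, 0, 1, 1, 1, -1]
```

Unlike k-means, this approach does not need the number of clusters in advance: clusters grow outward from dense regions, and the isolated point is reported as noise rather than forced into a group.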

Prerequisites: Participants should already have some familiarity with Python programming fundamentals, e.g. loops, conditional execution, importing modules, and calling functions. Participants should also ideally have attended the first lesson in the “Fundamentals of Machine Learning in Python” series, or already have some background in the general machine learning pipeline.

- You need to bring your own laptop for this workshop. Contact us if you would like to attend but cannot bring a laptop.
- Install Anaconda on your computer. You can find installation instructions here. Please contact us (cdsi.science@mcgill.ca) if you have trouble with the installation.

Supporting resources: Some of the materials that will be used are available on the instructors' website, Computing Workshop.

Location: Hybrid. Online via Zoom, or in-person at Burnside Hall room 1104 (11th floor).
Instructors: Jacob Errington, Faculty Lecturer, and Eric Mayhew, graduate student, School of Computer Science, McGill University.

Date: Friday, March 10, 2023
Time: 10:00am - 12:00pm
Location: Burnside 1104 (11th floor)
Categories: ML&NLP, Python
Registration has closed.

Event Organizer

CDSI Staff