Parallel Computing and Working with Big Data in R In-Person / Online

Overview: 

This workshop introduces key strategies and tools for scaling your R workflows to larger data sets and faster computation. You will learn how to execute code in parallel using packages such as mirai, parallel, and foreach, and understand when and why parallelisation can accelerate your analyses. The workshop also explores approaches to handling and analysing large data sets efficiently. You will work with the high-performance data.table package, learn how to access and query external databases using the DBI interface, and discover modern, out-of-memory data handling with arrow and duckdb.

Prerequisites:

Intermediate R knowledge; ideally, completion of CDSI workshops on data wrangling or equivalent experience.

Learning outcomes:

At the end of this workshop, you will be able to

  • understand core concepts and typical use cases of parallel computing in R
  • implement parallel workflows using mirai, parallel, and foreach
  • manipulate and analyse large data sets efficiently with data.table
  • connect to and query databases using the DBI package
  • work with out-of-memory data and tables using arrow and duckdb

Location: HYBRID. Online via Zoom, or in-person at Burnside Hall room 1104 (11th floor).
Instructor: Tim Elrick, Faculty Lecturer in the Faculty of Science.

IMPORTANT NOTE: If you are attending the workshop online, you must log in to your McGill Zoom account; otherwise, the link will not work. You can log in from your browser at https://mcgill.zoom.us/. For more information on McGill Zoom, please visit https://www.mcgill.ca/tls/students/learning-resources/use-technology/learning-zoom

Date:
Friday, February 27, 2026
Time:
10:30am - 12:30pm
Location:
Burnside 1104 (11th floor)
Categories:
  Data Science, R

Registration is required. There are 60 in-person seats and 100 online seats available.

Event Organizer

CDSI Staff