Parallel Computing and Working with Big Data in R (In-Person / Online)
Overview:
This workshop introduces key strategies and tools for scaling your R workflows to larger data sets and faster computation. You will learn how to execute code in parallel using packages such as mirai, parallel, and foreach, and understand when and why parallelisation can accelerate your analyses. The workshop also explores approaches to handling and analysing large data sets efficiently. You will work with the high-performance data.table package, learn how to access and query external databases using the DBI interface, and discover modern, out-of-memory data handling with arrow and duckdb.
Prerequisites:
Intermediate R knowledge; ideally, completion of CDSI workshops on data wrangling or equivalent experience.
Learning outcomes:
At the end of this workshop, you will be able to:
- understand core concepts and typical use cases of parallel computing in R
- implement parallel workflows using mirai, parallel, and foreach
- manipulate and analyse large data sets efficiently with data.table
- connect to and query databases using the DBI package
- work with out-of-memory data and tables using arrow and duckdb
Location: HYBRID. Online via Zoom, or in-person at Burnside Hall room 1104 (11th floor).
Instructor: Tim Elrick, Faculty Lecturer in the Faculty of Science.
IMPORTANT NOTE: If you are attending the workshop online, you must log into your McGill Zoom account; otherwise, the link will not work. You can log in through your browser at https://mcgill.zoom.us/. For more information on McGill Zoom, please visit https://www.mcgill.ca/tls/students/learning-resources/use-technology/learning-zoom
- Date: Friday, February 27, 2026
- Time: 10:30am - 12:30pm
- Location: Burnside 1104 (11th floor)
- Categories: Data Science, R
