Data Science Tools in the Classroom
Invited
Abstract
Teaching methods for programming and data science topics are evolving rapidly, driven largely by the growing popularity of cloud-based tools. In this talk, I will give an overview of tools and techniques that can improve both the students’ learning experience and the instructor’s ability to manage the class and its materials. I will discuss best practices for managing and distributing code and data, as well as the platforms used in a data science project. Among the many competing solutions, I have chosen Google products as the primary platform. I will introduce Google Colaboratory (Colab) as a way to run and share code. Beyond Colab, I will present an end-to-end data science project in a cloud-based ecosystem using Google Cloud Platform (GCP). In addition to the essential elements of GCP, I will cover ways to tackle big data problems with Hadoop and Spark and to use containerized applications for large-scale parallel processing. I will illustrate how I have used GCP in my classes at Boston University and share feedback from students. Finally, I will touch on open-source auto-grading tools.
Presenters
Mohammad Soltanieh-Ha
Boston University
Authors
Mohammad Soltanieh-Ha
Boston University