Educational Data Mining with Python and Apache Spark: A Hands-on Tutorial
Title | Educational Data Mining with Python and Apache Spark: A Hands-on Tutorial |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Agnihotri, Lalitha; Mojarad, Shirin; Lewkow, Nicholas; Essa, Alfred |
Conference Name | Proceedings of the Sixth International Conference on Learning Analytics & Knowledge |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-4503-4190-5 |
Keywords | Big Data, data mining, educational data mining, exploratory data analysis, learning analytics, machine learning, parallel computing, predictive analytics, pubcrawl170201, python, simulation, Spark, visualization |
Abstract | An enormous amount of educational data has been accumulated through Massive Open Online Courses (MOOCs), as well as through commercial and non-commercial learning platforms. This is in addition to the educational data released by the US government since 2012 to facilitate disruption in education by making data freely available. The high volume, variety, and velocity of the collected data necessitate the use of big data tools and storage systems, such as distributed databases for storage and Apache Spark for analysis. This tutorial will introduce researchers and faculty to real-world applications involving data mining and predictive analytics in the learning sciences. In addition, the tutorial will introduce the statistics required to validate and accurately report results. Topics will cover how big data is being used to transform education. Specifically, we will demonstrate how exploratory data analysis, data mining, predictive analytics, machine learning, and visualization techniques are being applied to educational big data to improve learning and scale insights derived from millions of students' records. The tutorial will be held over a half day and will be hands-on, with pre-posted material. Due to the interdisciplinary nature of the work, the tutorial appeals to researchers from a wide range of backgrounds, including big data, predictive analytics, learning sciences, educational data mining, and, in general, those interested in how big data analytics can transform learning. As a prerequisite, attendees are required to have familiarity with at least one programming language. |
URL | http://doi.acm.org/10.1145/2883851.2883857 |
DOI | 10.1145/2883851.2883857 |
Citation Key | agnihotri_educational_2016 |