RES.LL-005 | January IAP 2020 | Undergraduate

Mathematics of Big Data and Machine Learning

Course Description

This course introduces the Dynamic Distributed Dimensional Data Model (D4M), a breakthrough in computer programming that combines graph theory, linear algebra, and databases to address problems associated with Big Data. Search, social media, ad placement, mapping, tracking, spam filtering, fraud detection, wireless communication, drug discovery, and bioinformatics all attempt to find items of interest in vast quantities of data. This course teaches a signal processing approach to these problems by combining linear algebraic graph algorithms, group theory, and database design. This approach has been implemented in software. The class begins with a number of practical problems, introduces the appropriate theory, and then applies the theory to these problems. Students apply these ideas in a final project of their choosing. A number of smaller assignments equip students with the software infrastructure they need to complete their final projects.
Learning Resource Types
Lecture Videos
Lecture Notes
Instructor Insights
Many arrays of the numbers 0 and 1 on a dark blue background
“Big Data” refers to a technological phenomenon that has emerged since the mid-1980s. As computers have improved in capacity and speed, the greater storage and processing possibilities have also generated new challenges. New analytical tools, including those introduced in this course, have since been developed to meet the challenges of managing these extremely large data sets. (Image source: DARPA/public domain.)