The amount of data in our world has been exploding, resulting in what is popularly known as Big Data. At least three major forces are driving the interest and growth in Big Data: (1) a rapid increase in the amount of data being generated on the internet, (2) the evolving strategy of firms to collect data from both internal and external sources along the entire product and process lifecycle, and (3) the phenomenal growth of social media, mobile applications, and sensor-based technologies, as well as the Internet of Things. Together, these forces are generating a flood of data that is increasing in volume, variety, and velocity.
The objective of this course is to introduce students to Data Science techniques for collecting, processing, visualizing, and analyzing all kinds of “Big Data”. It provides training for those interested in becoming Data Scientists. The course delves into Web analytics: students will be exposed to tools such as Google Analytics and will participate in a Google Online Challenge to compete for awards. Network analysis techniques will be covered in detail, and students will learn how to construct, mathematically analyze, and visualize different types of networks. Students will also learn to use MongoDB and Hadoop and to execute MapReduce jobs that process and analyze large datasets collected from social media sites such as Twitter, YouTube, and Facebook.