Subject code: MA5831:03
This subject will provide students with advanced data processing and analysis skills using the commercial SAS software and its advanced packages. It will introduce students to recognizing and overcoming the challenges associated with big data and analysis-driven data, including data preparation, storage, processing, and the combination of SAS and Hadoop. It will also give students skills in using SAS parallel processing, in both multi-threaded single-machine mode and distributed multiple-machine mode, to solve big data problems.
Software platform: various
- Compare and evaluate different systems and approaches for high-performance and large-scale computing for analytics for standard data and big data
- Manage and prepare data using standard management frameworks, transforming and cleaning it to ensure classical characteristic outcomes are achieved
- Perform data management tasks to improve data quality, entity resolution and data monitoring
- Examine and deploy data processing tasks in the Hadoop ecosystem for big data and critically evaluate the combination of Hadoop and SAS to overcome big data challenges
- Choose and apply different techniques and software for distributed and cloud computing of big data, such as Pig, Hive, and MapReduce
- Conduct a review of a current data processing technology and communicate the insights to relevant stakeholders
Assessment for this course will occur at various times across the seven-week study period. Tasks may include online quizzes, discussion board activity, portfolio development, case studies, reflection, literature reviews, presentations and reports. Feedback will be provided to you throughout the study period, and a final grade will be provided at its conclusion.
This is one of the interdisciplinary subjects studied in the online Master of Data Science.
Please note that course structure and content are subject to change. For information on all course subjects, download the course guide.