DSCI 6000 - Applied Statistics and Data Science
Admissions Requirement: Introduction to statistics
Course Restrictions: Restricted to Graduate Students.
This course offers an overview of three distinct yet interconnected perspectives: classical statistics, Bayesian statistics, and data science/machine learning (DSML). Classical statistics emphasizes rigorous inference rooted in the frequentist school, whereas the Bayesian school offers a probabilistic framework that enables incorporating prior knowledge, updating beliefs, and modeling uncertainty. DSML aims to extract insights and patterns from data and to build predictive models.
Credit: 3
DSCI 6100 - Programming for Data Scientists (Python)
Admissions Requirement: Introduction to computing/programming. Students can fulfill the requirement by taking any one of the following or an equivalent course (instructor's approval is needed): CSCI 1041 - Digital Literacy in a Global Society, CSCI 1611 - A Gentle Introduction to Programming, CSCI 1911 - Foundations of Programming, CSCI 2651 - Python for the Sciences.
Course Restrictions: Restricted to Graduate Students.
An introduction to programming in the popular Python programming language as it is applied to data science. Topics include data types, simple statements, control structures, strings, functions, recursion, the Python interpreter, system command lines and files, module imports, object types, dynamic typing, scope, classes, operator overloading, exceptions, testing, and debugging. The course will enable students to program fluently in Python for data science and to move on to advanced topics such as programming artificial intelligence and natural language processing.
Credit: 3
DSCI 6200 - Data Science and Machine Learning
Prerequisite: DSCI 6000
Course Restrictions: Restricted to Graduate Students.
This course provides an overview of modern data science and machine learning (DSML) techniques, contrasting them with the traditional statistical approach. Students will learn how analysts can transition from classical statistics to more advanced predictive modeling and algorithmic data analysis. The course will cover both the theoretical and applied aspects of powerful DSML tools, such as neural networks, support vector machines, decision trees, random forests, gradient boosting, XGBoost, model selection, model averaging, cluster analysis, and text mining. Upon completing this course, students will be able to leverage modern modeling techniques to extract insights, predict outcomes, and optimize decisions.
Credit: 3
DSCI 6300 - Data Visualization
Course Restrictions: Restricted to Graduate Students.
This course covers principles and tools for effectively visualizing and communicating data-driven insights. The focus will be on extracting and communicating patterns from data through interactivity and synthesis of complex information. Aligned with the exploratory data analysis paradigm, emphasis will be placed on using visualizations to ask and answer "what-if" questions about data. Topics of this course include, but are not limited to, univariate data visualization, high-dimensional data visualization, visualization for trend-based data, visualization for spatial data, and dashboarding. Through hands-on assignments, students will gain skills in creating insightful, impactful data graphics using leading dynamic visualization tools.
Credit: 3
DSCI 6400 - Ethics in Data Science and Artificial Intelligence
Course Restrictions: Restricted to Graduate Students.
This course provides an overview of ethical issues related to data, with a particular emphasis on artificial intelligence, machine learning, and big data. Students will gain an understanding of current debates, frameworks, and regulations regarding data ethics. Key topics include privacy and confidentiality, transparency and explainability, bias and fairness, copyright and intellectual property, as well as misuse prevention and safety.
Credit: 3
DSCI 6600 - Data Wrangling with Structured Query Language (SQL)
Course Restrictions: Restricted to Graduate Students.
This hands-on course will provide students with the skills to wrangle, clean, transform, and munge data using Structured Query Language (SQL). Students will learn SQL programming techniques to deal with common data issues such as missing values, duplicate records, parsing errors, and inconsistent formats, and to integrate data from different sources.
Credit: 3
DSCI 6700 - Text Mining and Unstructured Data
Prerequisite: DSCI 6600
Course Restrictions: Restricted to Graduate Students.
This course introduces techniques for extracting insights from unstructured textual, visual, audio, and video data. Students will learn text mining tools to analyze patterns in textual corpora, as well as acquire skills for organizing and making sense of other unstructured data types. Topics include, but are not limited to, text mining algorithms such as classification, clustering, and sentiment analysis; web scraping and collection of online text data; audio and video feature extraction techniques; and image classification and object recognition. Through hands-on assignments and projects, students will gain practical experience applying text mining, computer vision, and other unstructured data analysis techniques to real-world datasets.
Credit: 3
DSCI 6800 - AI and Machine Learning
Prerequisite: DSCI 6100 and DSCI 6200
Course Restrictions: Restricted to Graduate Students.
This course provides a broad overview of the fields of artificial intelligence and machine learning. Students will learn fundamental concepts and algorithms that enable computers to mimic human intelligence for tasks such as pattern recognition, prediction, optimization, and decision-making. Topics in this course include, but are not limited to, supervised learning algorithms, unsupervised learning algorithms, reinforcement learning for sequential decision-making, deep learning using multiple hidden layers, natural language processing for text and speech, computer vision for image and video processing, generative AI (ChatGPT, Midjourney, Stable Diffusion, etc.), ethical practice of AI, biases, and social impact. In this course, students will gain hands-on experience applying AI techniques and machine learning algorithms in building intelligent systems. Programming will be done in languages such as Python.
Credit: 3
DSCI 7000 - Data Science Capstone
Prerequisite: Permission of instructor.
Course Restrictions: Restricted to students in the Master of Science in Data Science program. Restricted to Graduate Students.
This capstone course provides the culminating experience for students in the Master's in Data Science program. Soft skills such as effective communication are indispensable, and therefore teamwork is strongly recommended over individual projects. Students will conceptualize, propose, and execute an end-to-end data science project using real-world big data. The project will integrate skills and concepts learned throughout the program, including statistical analysis, machine learning, and communication of results. Under the instructor’s guidance, students will identify a problem amenable to data science techniques, acquire appropriate datasets, perform exploratory data analysis, implement data cleaning and feature engineering pipelines, train machine learning models, and measure model performance. The final project must be approved by a committee consisting of at least two members of the MSDS faculty. Students are encouraged to submit the product to a data science conference or a peer-reviewed journal.
Credit: 3