Data Analytics in R - STQM 4000

Data Analytics - Information Technology and Business, School of Business at NCCU, 2022

The goal of this course is to present the foundations of Data Analytics (data I/O, data wrangling, visualization, databases, and exploratory data analysis) and to help students develop the skills needed to analyze real-life problems successfully. R is used as the statistical programming language in the course, along with several technical libraries.

Offered during the following semester: Fall 2022

List of Tentative Topics:

1. Introduction to Data Analytics
• Administrative matters • Course overview • Downloading and Installing R • RStudio

2. Data Types and Structures
• Vectors and Matrices • Arrays • Lists • Data Frames

3. Scripts and Functions
• How to use R built-in functions • How to create scripts in R • How to use scripts to solve recurrent problems • Data Handling, Built-in Functions and Looping • Importing and Exporting Data • Logic and Flow Control in R • Statistical modeling functions (lm and glm)

4. Statistical Analysis using R
• Data preparation and exploratory data analysis • Probability Distributions and Sampling in R • Using R Markdown and R Shiny in Practice • Visualization in R using ggplot2 and Plotly

5. Working with Large Data Sets
• Handling data through databases • SQL and SQLite and their incorporation into R code • GIS and R

6. Final Project: Data Analysis using R