|
Before we Start
|
|
|
Introduction to R
|
Access individual values by location using [].
Access arbitrary sets of data using [c(...)].
Use logical operations and logical vectors to access subsets of data.
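The three indexing styles above can be sketched in base R as follows. The vector `household_sizes` and its values are hypothetical, made up for illustration rather than taken from the lesson data.

```r
# Hypothetical example vector (not from the lesson data)
household_sizes <- c(3, 7, 12, 5, 9)

# Access a single value by its position with []
second <- household_sizes[2]

# Access an arbitrary set of positions with [c(...)]
picked <- household_sizes[c(1, 3, 5)]

# Access a subset using a logical condition
over_six <- household_sizes[household_sizes > 6]

# Store a logical vector first, then use it as the index
is_mid_sized <- household_sizes > 6 & household_sizes < 10
mid_sized <- household_sizes[is_mid_sized]
```

Storing the logical vector under its own name, as in the last step, can make longer conditions easier to read and reuse.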
|
|
Starting with Data
|
|
|
Data Wrangling with dplyr and tidyr
|
Use the dplyr package to manipulate dataframes.
Use select() to choose variables from a dataframe.
Use filter() to choose data based on values.
Use group_by() and summarize() to work with subsets of data.
Use mutate() to create new variables.
Use the tidyr package to change the layout of dataframes.
Use pivot_wider() to go from long to wide format.
Use pivot_longer() to go from wide to long format.
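As a minimal sketch of how these verbs fit together, the example below chains them on two small data frames. The data frames `interviews` and `long`, and all of their column names and values, are hypothetical stand-ins; the sketch assumes the dplyr and tidyr packages are installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical survey-style data frame (illustrative values only)
interviews <- data.frame(
  village   = c("A", "A", "B", "B"),
  no_membrs = c(3, 7, 12, 5),
  rooms     = c(1, 2, 3, 1)
)

summary_tbl <- interviews %>%
  select(village, no_membrs, rooms) %>%          # choose variables
  filter(no_membrs > 3) %>%                      # keep rows by value
  mutate(people_per_room = no_membrs / rooms) %>% # create a new variable
  group_by(village) %>%                          # work per village
  summarize(mean_ppr = mean(people_per_room))

# Hypothetical long-format data: one row per household and item
long <- data.frame(
  id    = c(1, 1, 2, 2),
  item  = c("radio", "bicycle", "radio", "bicycle"),
  owned = c(TRUE, FALSE, TRUE, TRUE)
)

# Long to wide: one column per item
wide <- pivot_wider(long, names_from = item, values_from = owned)

# Wide back to long: item columns collapse into name/value pairs
back <- pivot_longer(wide, cols = c(radio, bicycle),
                     names_to = "item", values_to = "owned")
```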
|
|
Data Visualisation with ggplot2
|
ggplot2 is a flexible and useful tool for creating plots in R.
The data set and coordinate system can be defined using the ggplot() function.
Additional layers, including geoms, are added using the + operator.
Boxplots are useful for visualizing the distribution of a continuous variable.
Barplots are useful for visualizing categorical data.
Faceting allows you to generate multiple plots based on a categorical variable.
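The layering idea above might look like this in practice. The data frame `plot_data` and its columns are hypothetical, invented for illustration; the sketch assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Hypothetical data (illustrative values only)
plot_data <- data.frame(
  village   = rep(c("A", "B"), each = 4),
  wall_type = rep(c("brick", "mud"), times = 4),
  rooms     = c(1, 2, 3, 1, 2, 4, 1, 3)
)

# ggplot() sets the data and aesthetic mappings; a geom is added with +
box <- ggplot(data = plot_data, aes(x = wall_type, y = rooms)) +
  geom_boxplot()   # distribution of a continuous variable

# A barplot counts rows per category; facet_wrap() draws one panel per village
bars <- ggplot(data = plot_data, aes(x = wall_type)) +
  geom_bar() +
  facet_wrap(~ village)

# Printing a plot object (e.g. print(box)) renders it on the active device
```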
|
|
Processing JSON data
|
JSON is a popular data format for transferring data, used by a great many web-based APIs.
The complex structure of a JSON document means that it cannot easily be ‘flattened’ into tabular data.
We can use R code to extract values of interest and place them in a CSV file.
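A minimal sketch of that extraction step is shown below, assuming the jsonlite package is installed and used as the parser. The JSON text, its field names, and the output columns are all hypothetical.

```r
library(jsonlite)

# Hypothetical JSON document: each record holds a nested array of items
json_text <- '[
  {"id": 1, "village": "A", "items": ["radio", "bicycle"]},
  {"id": 2, "village": "B", "items": ["radio"]}
]'

# Keep the nested structure as R lists rather than simplifying it
records <- fromJSON(json_text, simplifyVector = FALSE)

# Pull out only the values of interest, one row per record
flat <- data.frame(
  id      = sapply(records, function(r) r$id),
  village = sapply(records, function(r) r$village),
  n_items = sapply(records, function(r) length(r$items))
)

# Write the flattened table to a CSV file (a temporary file in this sketch)
out_file <- tempfile(fileext = ".csv")
write.csv(flat, out_file, row.names = FALSE)
```

Reducing the nested `items` array to a count is only one way to flatten it; which values to keep depends on the analysis.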
|