R big picture

Created by Silvie Cinková

R

modern R "dialects"

Powerful data analysis, plus the functionality of a general-purpose programming language such as Python

base R

Iconic data sets

A few data sets are "stored" in R's packages and used over and over when explaining functions or concepts

Some examples:

To see more of them and learn what they describe, type datasets:: into the RStudio console and press Tab. A drop-down menu appears listing the individual data sets. Other packages can contain further data sets.
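The bundled data sets can also be listed and inspected from code. The following is a minimal sketch using only base R; iris and mtcars are chosen here as two widely used examples, which is an assumption, since this map does not preserve its own examples.

```r
# List the names of all data sets shipped with the 'datasets' package
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Inspect two frequently used data sets
str(iris)      # flower measurements for three iris species
head(mtcars)   # fuel consumption and design of 32 car models

# The help page of a data set explains what it describes:
# ?iris
```

Replacing "datasets" with another installed package name lists the data sets of that package.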

Flow control

data types

... and some others, not important here

https://campus.datacamp.com/courses/free-introduction-to-r/chapter-1-intro-to-basics-1?ex=1

Functions

https://campus.datacamp.com/courses/intermediate-r/chapter-3-functions?ex=1

Data structures

Plotting

vector

https://campus.datacamp.com/courses/free-introduction-to-r/chapter-2-vectors-2?ex=1

list

https://campus.datacamp.com/courses/free-introduction-to-r/chapter-6-lists?ex=1

data frame

https://campus.datacamp.com/courses/free-introduction-to-r/chapter-5-data-frames?ex=1

factor

https://campus.datacamp.com/courses/free-introduction-to-r/chapter-4-factors-4?ex=1

matrix

https://campus.datacamp.com/courses/free-introduction-to-r/chapter-3-matrices-3?ex=1

Conditionals

https://campus.datacamp.com/courses/intermediate-r/chapter-1-conditionals-and-control-flow?ex=1

Loops

https://campus.datacamp.com/courses/intermediate-r/chapter-2-loops?ex=1

the apply family

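An R-native, more idiomatic alternative to loops: a function is applied to every element of an object in a single call. The code is usually shorter, though not necessarily much faster than a well-written loop.

Very useful, but they can be somewhat difficult. They also often change the data structure (e.g. apply() turns a data frame into a matrix), so the processed object has to be "repaired" afterwards.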

https://campus.datacamp.com/courses/intermediate-r/chapter-4-the-apply-family?ex=1
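To make the comparison with loops concrete, here is a minimal base R sketch. The data frame df and its values are made up for illustration. It also shows the structure change mentioned above and one possible way to "repair" it.

```r
# A small numeric data frame (made-up values for illustration)
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# Explicit loop: compute the column means one column at a time
means_loop <- numeric(ncol(df))
for (i in seq_along(df)) {
  means_loop[i] <- mean(df[[i]])
}

# sapply(): the same result in one call, returned as a named numeric vector
means_sapply <- sapply(df, mean)

# lapply(): the same computation, but the result is always a list
means_list <- lapply(df, mean)

# apply(): converts the data frame to a matrix, then works on rows (1) or columns (2)
row_sums <- apply(df, 1, sum)

# Structure change: the result is a matrix, not a data frame
doubled <- apply(df, 2, function(x) x * 2)
is.matrix(doubled)                     # TRUE
doubled_df <- as.data.frame(doubled)   # "repair": convert back to a data frame
```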

map() function in purrr library

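Predictable output: map() always returns a list, and typed variants such as map_dbl() or map_chr() always return a vector of the stated type. The structure of the result therefore does not change unexpectedly, as it sometimes does with the apply functions.

Learn it only once you know very well how to use lists!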
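A minimal sketch of how the map functions treat a list, assuming the purrr package is installed. The list scores and its values are made up for illustration.

```r
library(purrr)

# A named list of numeric vectors (made-up values for illustration)
scores <- list(a = c(1, 2, 3), b = c(4, 5, 6))

# map() always returns a list with one element per input element
means_list <- map(scores, mean)

# map_dbl() always returns a numeric vector, or stops with an error
means_dbl <- map_dbl(scores, mean)

# map_chr() always returns a character vector
labels <- map_chr(scores, function(x) paste(x, collapse = "-"))
```

Because the output type is fixed by the function name, a wrong result type raises an error immediately instead of silently producing a different structure.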

messy data to tidy data

tidy data: each variable in one column, each observation in one row
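A minimal sketch of turning a messy table into a tidy one, assuming the tidyr package is installed. The data frame messy, its column names and its values are made up for illustration.

```r
library(tidyr)

# Messy: one row per student, the test scores are spread over two columns
messy <- data.frame(
  student = c("Ann", "Bob"),
  test1   = c(90, 75),
  test2   = c(85, 80)
)

# Tidy: one row per observation (student and test), one column per variable
tidy <- pivot_longer(
  messy,
  cols      = c(test1, test2),
  names_to  = "test",
  values_to = "score"
)
tidy
```

In the tidy version, "test" and "score" are variables in their own columns, so each row describes exactly one observation.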

Manage times and dates (durations, intervals, moments)

Rename, merge, split, remove values of categorical variables

Data Visualization

Summarize many observations into a single statistic
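A minimal sketch of summarizing, assuming the dplyr package is installed and using the built-in iris data set. The names of the result columns are chosen here for illustration.

```r
library(dplyr)

# Group the observations by species, then reduce each group to a few statistics
iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    n           = n(),                 # number of observations in the group
    mean_length = mean(Sepal.Length),  # average sepal length
    sd_length   = sd(Sepal.Length)     # standard deviation of sepal length
  )
iris_summary
```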

Manipulation of tables (R: data frames and tibbles!)

Reporting

Order columns or rows

Merge two tibbles by one or more columns (like SQL)
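A minimal sketch of SQL-like merging, assuming the dplyr and tibble packages are installed. The tibbles authors and books and their contents are made up for illustration.

```r
library(dplyr)
library(tibble)

# Two small tables that share the key column author_id (made-up values)
authors <- tibble(author_id = c(1, 2, 3), name = c("Ann", "Bob", "Eve"))
books   <- tibble(author_id = c(1, 1, 3, 4), title = c("A", "B", "C", "D"))

# inner_join(): keep only the rows that have a match in both tables
both <- inner_join(authors, books, by = "author_id")

# left_join(): keep every row of authors; unmatched authors get NA as title
all_authors <- left_join(authors, books, by = "author_id")
```

To merge by more than one column, pass a character vector of column names to the by argument.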

Select columns or rows

Importing & Cleaning Data

ggplot2

shiny

knitr

dplyr

tidyr

forcats

tibble

lubridate

readr

broom

Probability & Statistics

stringr

Text Mining

Machine Learning

Character Encoding issues

Stylometry

Plagiarism, Text reuse detection

Keyword extraction

Communication with APIs

Topic modeling

Search with Regular Expressions

Network analysis

Programming

Web scraping

httr

rvest

Parsing of standard formats (XML, HTML, JSON)

Parallel computing, Big Data processing

Debugging

Code versioning

Assembling new packages

Flow control

magrittr

purrr

tibble

Many libraries on CRAN, Bioconductor, GitHub...

These libraries usually use functions from other R libraries. 

Some are written in other programming languages such as C++, so you may not be able to read their source code, but they work in R.

Research-specific procedures