R Programming – Quick Starter

R is a programming language and software environment for statistical analysis and graphical reporting. It was created by Ross Ihaka and Robert Gentleman, whose first initials give the language its name.

Get Started with R

Before we start on this journey, ensure that you have access to a local installation of R or to an online workspace. If you prefer to install R on a Windows system, refer to the Installation of R and Installation of R Studio posts. If you prefer an online coding environment, visit https://posit.cloud (formerly https://rstudio.cloud/) to register and start using the online development environment.

R Language Basics

R is a case-sensitive, interpreted language that is mostly used for statistical analysis.

Comments

Comments are useful text that can be written in the program to ensure better readability and are ignored by the interpreter. R supports single line comments where the comment text is preceded by the # symbol.


# Comments start with a hash symbol and continue to the end of the line.
# Multi-line comments are not supported in R.

If there is an absolute need to insert a multi-line comment, the following workaround can be used. The interpreter parses the block, but because the condition is FALSE the body is never evaluated, so it has no effect on the output.

if(FALSE) {
  "Insert the multi-line comment
   in either single or double quotes.
   This block is parsed but never executed, so it produces no output"
}

Variables

Variables store data of any of the data types described below. The unique name given to an R object is called an identifier, and the following rules apply when defining identifiers.

  • Identifiers can be a combination of letters, digits, period (.) and underscore (_).
  • An identifier must start with a letter or a period; it cannot start with a digit or an underscore (_). If it starts with a period (.), the period cannot be followed by a digit.
  • Identifiers cannot contain special symbols such as ^, !, $, @, +, -, / and *.
  • Reserved words in R cannot be used as identifiers.
  • R is case sensitive so identifiers account and Account are treated as two different identifiers.

Valid Identifiers

Total
Sum 
.fine.with.dot 
this_is_acceptable 
Number5 
average32. 
#can start with . as long as it is not followed by numbers
.result.percentage 
#the . and number combination can appear in the middle but not in the beginning
results.2percentage

Invalid Identifiers

tot@l
5um
_fine
TRUE 
.0ne 
#only . and _ can be used as punctuation
result%
#can start with . but the . cannot be followed by a digit
.2nd.iteration
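
The rules above can also be checked from R itself. The sketch below assumes that base R's make.names() leaves a string unchanged exactly when it is already a syntactically valid name. The helper is_valid_identifier is a hypothetical name introduced only for this illustration.

```r
# Hypothetical helper: a string is a valid identifier when make.names()
# finds nothing to repair in it.
is_valid_identifier <- function(x) make.names(x) == x

valid   <- c("Total", ".fine.with.dot", "this_is_acceptable", "average32.")
invalid <- c("tot@l", "5um", "_fine", "TRUE", ".0ne", "result%")

print(is_valid_identifier(valid))    # all TRUE
print(is_valid_identifier(invalid))  # all FALSE
```

Reserved words such as TRUE are rejected as well, because make.names() appends a period to them.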

Variable Assignment

In R, variables can be assigned values in 3 ways.

  • Leftward Assignment (<-)
  • Rightward Assignment (->)
  • Using equal to operator (=)

Leftward Assignment

In this pattern the variable identifier is on the left hand side followed by the leftward assignment operator (<-) followed by the value to be assigned.

variable_name <- value
integer_data <- 38
pi <- 3.14

Rightward Assignment

In this pattern the value to be assigned is on the left hand side followed by rightward assignment operator (->) followed by the variable identifier.

value -> variable_name
38 -> integer_data
3.14 -> pi

Equal to Operator

In this pattern, the variable name is on the left hand side followed by the equal to operator (=) followed by the value to be assigned.

variable_name = value
integer_data = 38
pi = 3.14
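
As a quick check, the three forms can be compared side by side. This is a minimal sketch; the variable names are made up for the example.

```r
# All three assignment forms bind the same value to a name.
left_assigned <- 38
38 -> right_assigned
equal_assigned = 38

print(identical(left_assigned, right_assigned))  # TRUE
print(identical(left_assigned, equal_assigned))  # TRUE
```

All three behave the same at the top level. Inside a function call, however, = names an argument instead of assigning a variable, which is one reason many R style guides prefer <- for assignment.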

Data Types in R

Unlike languages such as C, Java and C#, R does not require variables to be declared with a data type. A variable is assigned an R object directly, and the data type of that object becomes the data type of the variable.

R Objects

The complete list of R objects is given below:

Class(es)          Object Type(s)   Example
logical            logical          TRUE, FALSE
numeric            double           1, 0.33, 1e4
integer            integer          as.integer(1)
character          character        "Hello world"
list               list             list(a = 1, b = "data")
complex            complex          3+.5i
raw                raw              as.raw(c(1,2,3))
expression         expression       expression(x, 1)
function           closure          function(x) x+1
function           builtin          `sum`
function           special          `if`
call, "{", etc.    language         quote(x+1)
(many)             S4               new("track")
name               symbol           quote(x)
environment        environment      .GlobalEnv
Table representing the different types of R objects
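
A few rows of the table can be verified directly with class() and typeof(). This is only a sketch: add_one is a made-up function name, and sum is used here as an example of a builtin function.

```r
# class() and typeof() for some of the objects listed in the table.
add_one <- function(x) x + 1

print(c(class(1), typeof(1)))                        # "numeric"  "double"
print(c(class(1L), typeof(1L)))                      # "integer"  "integer"
print(c(class(add_one), typeof(add_one)))            # "function" "closure"
print(typeof(sum))                                   # "builtin"
print(typeof(`if`))                                  # "special"
print(c(class(quote(x + 1)), typeof(quote(x + 1))))  # "call"     "language"
print(c(class(quote(x)), typeof(quote(x))))          # "name"     "symbol"
```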

The most frequently used R objects are:

  • Vectors
  • Lists
  • Matrices
  • Arrays
  • Factors
  • Data Frames

Atomic Vectors

The simplest of these R objects is the vector. There are six data types of atomic vectors, also termed the six classes of vectors, and the other R objects are built upon them: vectors of any of these data types can be used to build other R objects such as arrays, lists and factors.

The supported data types of atomic vectors are listed below:

  • Logical 
  • Numeric
  • Integer
  • Complex
  • Character
  • Raw

The data types are described in detail below:

Logical

R supports the logical values TRUE and FALSE, which are reserved words. T and F are also available as predefined variables holding TRUE and FALSE, but they are ordinary variables and can be reassigned. Other spellings such as True, False, true, false, t and f are not logical values. Because they are not reserved, they can be used as valid identifiers (t is, in fact, the name of R's built-in transpose function).

#Usage of logical data types.
> isValid <- TRUE
> isSeniorCitizen <- FALSE
#false is not a reserved word, so it is a valid identifier
> false <- T

To identify the type of an object, typeof(), class() and mode() can be used. For a logical value, all of these yield “logical”.

> print(typeof(isValid))
> print(mode(isValid))
> print(class(isValid))
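
The short sketch below shows how logical values can be tested and how text is converted to logical with as.logical(). The variable name is_valid is chosen only for this example.

```r
is_valid <- TRUE
print(is.logical(is_valid))  # TRUE
print(typeof(is_valid))      # "logical"

# as.logical() recognizes a few text spellings; anything else becomes NA.
parsed <- as.logical(c("TRUE", "true", "T", "yes"))
print(parsed)                # TRUE TRUE TRUE NA
```

Note that this text conversion applies to character strings only; writing true or True without quotes refers to an ordinary variable.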

Numeric

Numeric values are stored as 64-bit double-precision floating point numbers and cover whole numbers, negative values, floating point values and so on. Any numerical value written without a suffix is automatically treated as data type “numeric”.

> rateOfInterest <- 8.4
> loanPeriod <- 5
> amount <- 1000
> depreciation <- -5.4

The class(), mode() and typeof() functions can be used to find the data type. For numeric values, class() and mode() yield “numeric” but typeof() yields “double”.

Two functions are worth exploring to understand numeric data types better: is.numeric(), which tests whether a value is numeric, and as.numeric(), which converts a value to numeric.
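
A minimal sketch of the two functions in use follows; the variable names are chosen only for this example.

```r
rate_of_interest <- 8.4
print(is.numeric(rate_of_interest))  # TRUE
print(is.numeric("8.4"))             # FALSE: a quoted value is character

# as.numeric() converts text that looks like a number into a numeric value.
converted <- as.numeric("8.4")
print(is.numeric(converted))         # TRUE
print(typeof(converted))             # "double"
```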

Integer

To specifically indicate that data should be represented as Integer only, a suffix L needs to be appended to the value. 

#Explicitly create integer data
> intValue <- 54L
> period <- 2L

Note: Appending the suffix L to a value with a fractional part, such as 5.4L, does not create an integer. R issues a warning and stores the value as numeric instead.

The functions class() and typeof() yield “integer”, whereas mode() yields “numeric”.
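
The difference made by the L suffix can be seen in a short sketch (int_value is a name chosen for the example):

```r
int_value <- 54L
print(class(int_value))   # "integer"
print(typeof(int_value))  # "integer"
print(mode(int_value))    # "numeric"

# Without the suffix, the same number is stored as a double.
print(is.integer(54))     # FALSE
print(is.integer(54L))    # TRUE
```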

Complex

Complex values, consisting of a real part and an imaginary part, are supported in R.

#Complex numbers in R
> complex_number1 <- 2+3i
> complex_number2 <- -6i
#The value below is taken as numeric because the imaginary part is missing.
> complex_number3 <- -3

For complex values, the functions class(), typeof() and mode() all yield “complex”.
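
Base R also provides functions to take a complex value apart. The sketch below uses Re(), Im() and Mod() on a value picked only for illustration.

```r
z <- 3+4i
print(Re(z))      # real part: 3
print(Im(z))      # imaginary part: 4
print(Mod(z))     # modulus: 5
print(typeof(z))  # "complex"

# A value written without an imaginary part is numeric, not complex.
print(class(-3))  # "numeric"
```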

Character

Character values are any text or numerals enclosed in either single or double quotes.

#Character data type in R
> fruit <- 'apple'
> rateOfInterest <- "7.6"
> isValid <- "False"

The functions class(), mode() and typeof() all yield “character”.

Raw

R supports storing information in raw format. 

#R supports storing data in raw format.
#For example, the string "Hello" is stored as the bytes 48 65 6c 6c 6f
> rawData <- charToRaw("Coders Pad")
> rawData
[1] 43 6f 64 65 72 73 20 50 61 64

The functions class(), mode() and typeof() yield the value “raw”.
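
The raw bytes can be converted back to text or viewed as decimal values. A minimal sketch, using the string "Hello" and a variable name chosen for the example:

```r
raw_hello <- charToRaw("Hello")
print(raw_hello)              # 48 65 6c 6c 6f (hexadecimal bytes)
print(as.integer(raw_hello))  # 72 101 108 108 111 (the same bytes in decimal)
print(rawToChar(raw_hello))   # "Hello"
```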

Constants in R

Constants are entities whose value cannot be changed. Constants in R are of two types:

  1. Numeric (Integer, Double & Complex)
  2. Character 

Integer constants are suffixed with L and complex constants are suffixed with i. Please note that since R is case sensitive, lower case l and upper case I are not valid.

Hexadecimal numbers are represented by prefixing the numeric constant with 0x or 0X.
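
The suffixes and the hexadecimal prefix can be tried out as follows. The values are arbitrary and chosen only for illustration.

```r
print(typeof(5))       # "double"
print(typeof(5L))      # "integer"
print(typeof(5i))      # "complex"
print(typeof("5"))     # "character"

# Hexadecimal constants: 0x1F is 1 * 16 + 15 = 31.
print(0x1F)            # 31
print(0X1F == 0x1F)    # TRUE
print(typeof(0x1FL))   # "integer"
```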

Data conversion from one type to another

Data can be converted from one data type to another in R. Some conversions preserve the data exactly, while others coerce it and may lose information.

Logical to Numeric

The logical types can be converted to numeric types where TRUE is coerced to 1 and FALSE to 0. Converting the logical types to character types is also possible, but does not provide any practical value to the programmers.
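
A minimal sketch of these conversions, using a made-up vector named flags:

```r
flags <- c(TRUE, FALSE, TRUE)
print(as.numeric(flags))    # 1 0 1
print(as.character(flags))  # "TRUE" "FALSE" "TRUE"

# Arithmetic functions coerce logical values automatically,
# so sum() counts the TRUE values.
print(sum(flags))           # 2
```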

Integer Conversions

Integer types can be converted to logical types, with non-zero values coerced to TRUE and zero coerced to FALSE. This is typically used in if expressions.

#In R, non-zero values are automatically coerced to TRUE and zero is coerced to FALSE
#The statements below are valid
> if(25) {
    print("This text is printed as 25 evaluates to TRUE")
}
> if(0) {
    print("This statement is ignored as 0 evaluates to FALSE")
}

Integer types can be converted to character types; however, there is not much value in this conversion.

Integer types can be converted to numeric types as well. 

#Integer values can be converted to numeric type.
> intValue <- 34L
> numValue <- as.numeric(intValue)

The conversion can be checked using typeof(numValue).

Numeric Conversions

Numeric types can be converted to logical types, with non-zero values coerced to TRUE and zero values coerced to FALSE.

Numeric types can be converted to integer types. Whole numbers are unchanged, but numbers with a fractional part are truncated toward zero to make them integer values.
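
Both conversions are sketched below with arbitrary values. Note that as.integer() truncates; it does not round.

```r
print(as.integer(7))                 # 7
print(as.integer(3.9))               # 3
print(as.integer(-3.9))              # -3 (truncated toward zero)

print(as.logical(c(0, 2.5, -1)))     # FALSE TRUE TRUE
```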

When a vector is created with elements of mixed types, the elements are automatically coerced to the most general type among them.

For example, in a vector that mixes logical and numeric elements, the logical elements are coerced to numeric, with TRUE converted to 1 and FALSE converted to 0.

Calling mode(), class() and typeof() on such a vector, here called num, yields “numeric”, “numeric” and “double” respectively.
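
The coercion can be reproduced with a small vector. The element values below are assumptions chosen for illustration; only the name num comes from the text.

```r
# Mixing numeric and logical elements: the logicals become 1 and 0.
num <- c(1.5, TRUE, FALSE, 4)
print(num)          # 1.5 1.0 0.0 4.0
print(mode(num))    # "numeric"
print(class(num))   # "numeric"
print(typeof(num))  # "double"
```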

Including a complex number in the mix coerces the numeric and logical elements to complex as well.

The typeof(), class() and mode() functions on num then all yield “complex”.
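
A sketch of the same idea with a complex element added. As before, the element values are assumptions made for the example.

```r
# One complex element is enough to turn the whole vector complex.
num <- c(1.5, TRUE, 2+3i)
print(num)          # every element now has an imaginary part
print(typeof(num))  # "complex"
print(class(num))   # "complex"
print(mode(num))    # "complex"
```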