R - Basic Syntax


R is an interpreted programming language and software environment for statistical analysis, graphics and reporting. For efficiency, R can also call procedures written in C, C++, .NET, Python or Fortran. The name R comes partly from the first names of its creators, Ross Ihaka and Robert Gentleman of the University of Auckland, New Zealand; the language is currently developed by the R Development Core Team.
RStudio is the most commonly used development environment for R. It can run individual commands as well as whole scripts. You can install R and RStudio from their respective websites.

R Command Prompt

Once you have the R environment set up, start the R command prompt by typing R at your system's command prompt. The interpreter starts and shows a > prompt, where you can type statements such as:
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
Here the first statement defines a string variable myString and assigns it the string "Hello, World!". The next statement calls print() to display the value stored in myString.

R Script File

Usually you will write your programs in script files and execute them at the command prompt with Rscript, the script front end of the R interpreter. Start by writing the following code in a text file called test.R:
# Hello World
myString <- "Hello, World!"

print(myString)
Save the above code in the file test.R and execute it at the command prompt as shown below. You can also run it from within RStudio if you want to avoid the hassle of setting up the path.
$ Rscript test.R 

[1] "Hello, World!"

Comments

Comments help readers follow a program and are ignored by the interpreter. Like most scripting languages, R marks a comment with #; everything from the # to the end of the line is ignored. R does not support multi-line comments, but you can use a trick such as the following:
if(FALSE) {
   "This is a demo for multi-line comments and it should be put inside either a
      single OR double quote else it will cause syntax errors"
}
R parses the string inside the block but never evaluates it, because the condition is FALSE, so it does not interfere with your actual program. Put such comments inside either single or double quotes so that the text still parses as valid R.
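The comment styles described above can be compared side by side in a short sketch. The variable name total is made up for this illustration; the point is only that none of the commented or skipped lines changes its value.

```r
# A full-line comment: everything after # is ignored.
total <- 1 + 2  # a trailing comment on the same line as code

# The usual way to comment out several lines is to prefix each one with #:
# total <- total * 100
# print("this line never runs")

# The if (FALSE) trick: the body is parsed but never evaluated.
if (FALSE) {
  "This text is skipped at run time."
  total <- -1
}

print(total)
```

Most editors, RStudio included, can toggle the # prefix on a selected block of lines, which is usually more convenient than the if (FALSE) trick.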

Data Types

Like most scripting languages, R is dynamically typed: you do not declare a variable to be limited to a given data type. A variable is assigned an R object, and the data type of that object becomes the data type of the variable. There are many types of R objects. The frequently used ones are:
  • Vectors
  • Lists
  • Matrices
  • Arrays
  • Factors
  • Data Frames
The simplest of these objects is the vector. Atomic vectors come in six data types, also called the six classes of vectors, and the other R objects are built upon them.
  • Logical (TRUE / FALSE)
  • Numeric (1, 2, 33.44)
  • Integer (1L, -100L, 0L)
  • Complex (4 + 10i)
  • Character ("1", "examples", 'of', "characters")
  • Raw (A raw sequence of bytes.)
v <- TRUE
print(class(v))
[1] "logical"

v <- 23.5
print(class(v))
[1] "numeric"

v <- 2L
print(class(v))
[1] "integer"

v <- 2+5i
print(class(v))
[1] "complex"

v <- "TRUE"
print(class(v))
[1] "character"

v <- charToRaw("Convert Characters to RAW")
print(class(v))
[1] "raw"
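As a small sketch of how these types behave in practice, the example below inspects one value of each atomic type with class() and typeof(), and then shows what happens when types are mixed inside c(). The variable names are chosen only for this illustration.

```r
# class() reports the class of an object; typeof() reports its storage type.
samples <- list(TRUE, 23.5, 2L, 2+5i, "TRUE", charToRaw("A"))
classes <- sapply(samples, class)
types   <- sapply(samples, typeof)
print(classes)
print(types)    # a "numeric" value is stored as "double"

# An atomic vector holds a single type, so c() coerces mixed input
# to the most general type present.
mixed_num <- c(TRUE, 2L, 3.5)
print(class(mixed_num))   # logical and integer values are promoted to numeric

mixed_chr <- c(1, "a", TRUE)
print(class(mixed_chr))   # everything becomes character
```

This coercion is also why the vector c(TRUE,1) is stored as two numbers rather than as a logical value and a number.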

Variables

A valid variable name consists of letters, numbers and the dot or underscore characters. It must start with a letter, or with a dot that is not followed by a number.
var_name2.    # Valid: has letters, numbers, dot and underscore
var_name%     # Invalid: has the character '%'; only dot (.) and underscore are allowed
2var_name     # Invalid: starts with a number
.var_name     # Valid: can start with a dot (.), but the dot must not be followed by a number
var.name      # Valid: a variable name can contain a dot (.)
.2var_name    # Invalid: the starting dot is followed by a number
_var_name     # Invalid: starts with _, which is not allowed
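One way to check these rules from code is to compare a candidate name with the result of base R's make.names(), which rewrites any string into a syntactically valid name. The helper is_syntactic below is a hypothetical name introduced for this sketch, not a built-in function.

```r
# is_syntactic() is a helper made up for this example: a name is treated as
# valid when make.names() leaves it unchanged.
is_syntactic <- function(name) {
  name == make.names(name)
}

candidates <- c("var_name2.", "var_name%", "2var_name", ".var_name",
                "var.name", ".2var_name", "_var_name")
valid <- is_syntactic(candidates)

# Show each candidate next to the verdict.
print(data.frame(name = candidates, valid = valid))
```

The same comparison also rejects reserved words such as if, because make.names() changes them too.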
Variables can be assigned values using the leftward (<-), rightward (->) or equal-to (=) operator.
# Assignment using equal operator.
var.1 = c(0,1,2,3)

# Assignment using leftward operator.
var.2 <- c("learn","R")

# Assignment using rightward operator.   
c(TRUE,1) -> var.3

cat ("var.1 is ", var.1 ,"\n")
cat ("var.2 is ", var.2 ,"\n")
cat ("var.3 is ", var.3 ,"\n")

print / cat

We can view the contents of a variable using the print() or cat() functions. print() displays a single object, while cat() accepts several arguments and concatenates them into one line of output.
print("Hello World")
[1] "Hello World"

cat("Hello", "World")
Hello World
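The difference between the two functions shows up more clearly with a vector and a custom separator. The following is a minimal sketch; the variable names are illustrative only.

```r
x <- c(1.5, 2.5)

print(x)                        # prints the vector with the [1] index prefix
cat("Values:", x, "\n")         # no prefix and no quotes; the newline is explicit

cat("a", "b", "c", sep = "-")   # sep sets the separator between the pieces
cat("\n")

# paste() builds the combined string instead of printing it,
# so the result can be stored and reused.
line <- paste("Hello", "World")
print(line)
```

Because cat() does not add a newline on its own, scripts usually end each cat() call with "\n".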

ls() / rm()

R does not give blocks their own scope. (Functions and packages do get their own environments.) For example, a variable assigned inside an if block is still available after the block ends, so it is easy to lose track of the variables defined at a given point. R provides two very useful functions to deal with this.
ls() lists the variables defined at a given time, and rm() deletes a variable from memory.
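A short sketch of both functions, using a throwaway variable named scratch that exists only for this illustration:

```r
if (TRUE) {
  scratch <- 42        # assigned inside the block ...
}
print(scratch)         # ... and still visible after it

before <- "scratch" %in% ls()   # ls() returns the variable names as strings
print(before)

rm(scratch)                     # delete the variable
after <- "scratch" %in% ls()
print(after)
```

Calling rm(list = ls()) removes every variable in the current workspace, so use it with care.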

Operators

R is quite rich in the operators it provides. It may not be as rich as Perl, but the operators in R are tailored towards handling chunks of data. By default, an operator applied to vectors works element by element on corresponding elements. Operators in R can be classified into 5 major types:

Arithmetic Operators

R defines these arithmetic operators: +, -, *, /, %%, %/%, ^
The operators +, -, *, / and ^ mean the same as in most other languages and need no clarification. The %% and %/% operators are more interesting. Both are related to integer division: %/% gives the quotient and %% gives the remainder.
> # / performs the usual division
> c(4, 2, 5.5, 6.5) / c(2, 4, 2.5, 3)
[1] 2.000000 0.500000 2.200000 2.166667

> # %% gives the remainder. Note that both the operands could be non integers. 
> # But the operator ensures integer division.
> c(4, 2, 5.5, 6.5) %% c(2, 4, 2.5, 3)
[1] 0.0 2.0 0.5 0.5

> # %/% gives the quotient. Note that both the operands could be non integers. 
> # But the operator ensures integer division.
> c(4, 2, 5.5, 6.5) %/% c(2, 4, 2.5, 3)
[1] 2 0 2 2
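The quotient and the remainder fit together: multiplying the quotient by the divisor and adding the remainder gives back the original value. The sketch below checks this for the vectors used above, with one negative value added to show that R rounds the quotient down (towards minus infinity).

```r
x <- c(4, 2, 5.5, 6.5, -7)
y <- c(2, 4, 2.5, 3, 3)

quotient  <- x %/% y    # 2 0 2 2 -3
remainder <- x %% y     # 0 2 0.5 0.5 2
print(quotient)
print(remainder)

# Rebuild x from the two results; all.equal() allows for rounding error.
rebuilt <- quotient * y + remainder
print(isTRUE(all.equal(rebuilt, x)))
```

For a positive divisor the remainder is never negative, which makes %% handy for wrapping indexes around.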

Relational Operators

R defines the usual relational operators: <, >, <=, >=, ==, !=
They mean almost what they would mean in any other language. But, as mentioned above, the operators work on the individual elements of the vectors and produce another vector of logical values, one for the result of each comparison. For example:
> c(4, 2, 5.5, 6.5) < c(2, 4, 2.5, 3)
[1] FALSE  TRUE FALSE FALSE
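The logical vector produced by a comparison is rarely the end goal; it is usually used to select or count elements. A minimal sketch, with variable names chosen for this example:

```r
x <- c(4, 2, 5.5, 6.5)

is_large <- x > 4            # the single value 4 is compared with every element
print(is_large)

large_values <- x[is_large]  # keep only the elements where the test is TRUE
print(large_values)

how_many <- sum(is_large)    # TRUE counts as 1 and FALSE as 0
print(how_many)

positions <- which(is_large) # indexes of the TRUE elements
print(positions)
```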

Logical Operators

R defines all the usual logical operators: &, |, !, && and ||
The operators &, | and ! do just what one would expect: they operate on the individual elements of the operand vectors and produce another logical vector as the result. The && and || operators work differently: they expect single values and return a single logical value, evaluating the right-hand side only when it is needed. Older versions of R silently used just the first element of a longer vector; since R 4.3.0, passing a vector of length greater than one is an error.
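The following sketch contrasts the element-wise operators with && and ||, and shows any() and all() as a way to reduce a logical vector to the single value that && and || expect. The variable names are illustrative.

```r
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

both   <- a & b    # element-wise AND: TRUE FALSE FALSE
either <- a | b    # element-wise OR:  TRUE TRUE TRUE
print(both)
print(either)
print(!a)          # element-wise NOT

# && and || work on single values and short-circuit: the right-hand side
# is skipped when the left-hand side already decides the result.
safe <- FALSE && stop("never evaluated")
print(safe)

# any() and all() collapse a logical vector to a single value.
print(any(a))
print(all(a))
```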

Assignment Operators

There are two directions of assignment in R: left assignment and right assignment.
> a <- c(1,2,3)
> a
[1] 1 2 3
> c(3,4,5,6) -> a
> a
[1] 3 4 5 6
You can also use <<-, ->> and of course =. There are subtle differences between these; we will check them out down the line.
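One of those differences can be sketched briefly: inside a function, <- creates or changes a local variable, while <<- changes a variable that already exists in an enclosing environment. The function names below are made up for this illustration.

```r
total <- 10

add_local <- function() {
  total <- total + 1   # <- creates a local copy; the outer total is untouched
  total
}
local_result <- add_local()
print(local_result)    # 11
print(total)           # still 10

# <<- assigns to the variable found in the enclosing environment,
# which lets a function keep state between calls.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}
next_id <- make_counter()
first  <- next_id()
second <- next_id()
print(second)          # 2
```

The = operator behaves like <- at the top level, but inside a function call it names an argument instead of assigning a variable.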

Other Operators

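R also provides other operators: :, %in% and %*%. The : operator creates a sequence of numbers, %in% tests whether a value is an element of a vector, and %*% multiplies matrices.
print(2:8)
[1] 2 3 4 5 6 7 8

print(8 %in% 1:10)
[1] TRUE
print(12 %in% 1:10)
[1] FALSE
The %in% operator is not limited to numbers. It works as well on other data types.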
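A few more uses of these operators, as a minimal sketch: a descending sequence, %in% on character data, and %*% compared with the element-wise * operator. The values are chosen only for illustration.

```r
countdown <- 5:1                  # : also counts downwards
print(countdown)

has_b <- "b" %in% c("a", "b")     # %in% on character data
print(has_b)

m <- matrix(1:4, nrow = 2)        # filled column by column: rows are (1, 3) and (2, 4)
elementwise <- m * m              # squares each element
product     <- m %*% m            # true matrix multiplication
print(elementwise)
print(product)
```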


if / else / else if

R defines the standard if / else if / else flow for conditional operations.
report <- 'blank'
number <- 10

if (number > 10) {
    report <- "Greater than 10"
} else if (number < 10) {
    report <- "Less than 10"
} else {
    report <- 'Equal to 10'
}
print(report)

[1] "Equal to 10"
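An if statement tests a single condition. To apply the same decision to every element of a vector, base R offers the ifelse() function. The sketch below reuses the three messages from the example above; the variable names are illustrative.

```r
numbers <- c(5, 10, 15)

# ifelse(test, yes, no) picks a value for each element; nesting a second
# ifelse() handles the third case.
reports <- ifelse(numbers > 10, "Greater than 10",
                  ifelse(numbers < 10, "Less than 10", "Equal to 10"))
print(reports)
```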

for loops

R has versatile for loops. They can iterate over the various data structures, such as vectors, lists, matrices and arrays.
vec <- c(1,3,4,6,9)
for (v in vec) {
    print(v)
}

[1] 1
[1] 3
[1] 4
[1] 6
[1] 9
You can do the same with other data structures as well. The collection can also be generated dynamically in the loop header:
for (i in 1:10) {
    print(i)
}

[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
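Two common variations are sketched below: looping over positions with seq_along() when the index is needed, and looping over the names of a list. The data is invented for the example.

```r
fruits <- c("apple", "banana", "cherry")
for (i in seq_along(fruits)) {       # seq_along() yields 1, 2, 3
  cat(i, ":", fruits[i], "\n")
}

record <- list(id = 1L, label = "demo", active = TRUE)
for (field in names(record)) {       # visit each list element by name
  cat(field, "=", format(record[[field]]), "\n")
}

# Accumulating a result in a loop; for this simple case sum() does the same.
vec <- c(1, 3, 4, 6, 9)
running_total <- 0
for (v in vec) {
  running_total <- running_total + v
}
print(running_total == sum(vec))
```

seq_along() is safer than 1:length(x) because it produces an empty sequence for an empty vector, so the loop body is skipped.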

While Loops

While loops provide a more general mechanism for looping. While loops in R are quite similar to those in most other languages:
> x <- 0
> while(x < 10){
+   cat('Value of x: ',x)
+   print("X is still less than 10")
+   # add one to x
+   x <- x+1
+ }
Value of x:  0[1] "X is still less than 10"
Value of x:  1[1] "X is still less than 10"
Value of x:  2[1] "X is still less than 10"
Value of x:  3[1] "X is still less than 10"
Value of x:  4[1] "X is still less than 10"
Value of x:  5[1] "X is still less than 10"
Value of x:  6[1] "X is still less than 10"
Value of x:  7[1] "X is still less than 10"
Value of x:  8[1] "X is still less than 10"
Value of x:  9[1] "X is still less than 10"
While loops also support break and next, which let you leave the loop or skip to the next iteration at any point.
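A minimal sketch of both statements: the loop below skips even numbers with next and stops with break once the counter passes 7. The variable names are illustrative.

```r
x <- 0
seen <- c()

while (TRUE) {
  x <- x + 1
  if (x %% 2 == 0) next   # even number: go straight to the next iteration
  if (x > 7) break        # leave the loop entirely
  seen <- c(seen, x)      # reached only for odd numbers up to 7
}

print(seen)               # 1 3 5 7
```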

