Stata-Blog

Applied Statistics Using Stata®


Stata *macros*, *scalars*, *matrices*, and *datasets* can be passed to R automatically. Similarly, R objects of different classes (*data.frame*, *list*, *matrix*, *vector*, *logical*, and *NULL*) can be imported into Stata automatically and updated in real time.
This level of integration between Stata and R also lets Stata users benefit from other programming languages that can be executed interactively in the R environment, such as **C++** using the **Rcpp** package or **JavaScript** using the **V8** package.

The main idea of

Seamless R and Stata integration

You can install the package from GitHub by executing the following command:

. net install Rcall, replace from("https://raw.githubusercontent.com/haghish/Rcall/master/")

You also need to make sure that R is installed on your machine. The **Rcall** package includes the default paths of R on *Microsoft Windows*, *Mac*, and *Linux*. If you have installed **R** in a different location, you can define the path to the **R** executable using the `setpath` subcommand, as shown below.

Before defining the R path permanently, make sure

. Rcall setpath "/usr/bin/R"

Finally, to pass Stata data sets to R automatically, you need the **foreign** R package, which you can install from within Stata:

. R: install.packages("foreign", repos="http://cran.uk.r-project.org")

In general, the syntax of the **Rcall** package can be summarized as follows:

Rcall [mode] [:] R-command // calling R in Stata

Rcall [subcommand] // managing R

Rcall [list] [:] namelist // controlling data communication

The syntax of the package is further explained in the following sections.

Rcall can embed R in several modes. The mode can be interactive (the default, when no mode is specified), **vanilla**, or **sync**.

To enter the R console mode within Stata, type **R:**. The console
behaves much like the **Mata** environment. However, with
every R command you execute, Stata obtains the objects from R
simultaneously. Note that, as in the Mata environment, you cannot
execute R commands from the Do-File Editor while the console is
running. To execute R from the Do-File Editor, call R with one-line
**R:** commands instead.

. scalar a = 999

. R:

```
------------------------------------------------- R (type end to exit) ----------------
. a <- 2*(st.scalar(a))
. a
[1] 1998
. end
---------------------------------------------------------------------------------------
```

. display r(a)

` 1998`

The interactive mode also supports multi-line code. In the console, a `+` prompt marks a line that continues the previous one.

. R:

```
------------------------------------------------- R (type end to exit) ----------
. myfunction <- function(x) {
+     if (is.numeric(x)) {
+         return(x^2)
+     }
+ }
. (a <- myfunction(199))
[1] 39601
. end
---------------------------------------------------------------------------------------
```

. display r(a)

` 39601`

The **Rcall** package runs R interactively. That is, when you define an object in R, it remains in R's memory and is accessible from the next command. For example:

. R: rm(list=ls())

. R: a <- 10

. R: (a^2)

```
[1] 100
```

The **vanilla** subcommand runs R non-interactively, which can be imagined as opening R, executing a script, and closing R without saving anything. This subcommand is only useful if you want to

`source()`

By default, Rcall returns *rclass* objects from R to Stata and allows passing
Stata objects to R using several functions. However, the package also has a
**sync** mode that automatically synchronizes the global environments
of Stata and R in real time, replacing objects whenever they change in
either environment.

The **sync** mode gives the most interactive experience for *numeric* and
*string* scalars and *matrices* in Stata. The mode does not synchronize global macros. In the example below, the value of the Stata scalar `a` is changed in R, and the new value appears in Stata:

. scalar a = 1

. R sync: (a = 0)

` [1] 0`

. display a

` 0`

The same example is repeated **without** sync mode:

. scalar a = 1

. R: (a = 0)

` [1] 0`

. display a

` 1`

The sync mode also replaces matrices in R and Stata whenever a matrix changes in either environment. Naturally, new matrices are synchronized as well:

. mat drop _all

. mat define A = (1,2,3 \ 4,5,6)

. Rcall sync: B = A

. mat list B

```
B[2,3]
c1 c2 c3
r1 1 2 3
r2 4 5 6
```

. mat C = B/2

. R sync: C

```
[,1] [,2] [,3]
[1,] 0.5 1.0 1.5
[2,] 2.0 2.5 3.0
```

As shown in the examples, any change made to the matrices, whether it
happened in R or in Stata, is instantly available in the other environment.
While such a level of integration between the two languages is **exciting**,
it requires a lot of caution and testing. This is an exploratory
feature rather than a mainstream approach to calling a foreign language
from a programming language.

The **Rcall** command can also be abbreviated as **R**, and the colon sign "**:**" is optional. For the rest of the examples, I call R by typing **R:** instead of **Rcall:**.

The biggest advantage of the **Rcall** package is that it allows data communication between Stata and R. Variables that are defined in R can be accessed in Stata automatically, within the returned *rclass* scalars, macros, and matrices. For example, I create a numeric variable, a character variable, a numeric vector, a matrix, and a list in R, and retrieve the results in Stata simultaneously, as shown below:

. R: a <- 99

. display r(a)

```
99
```

. R: b <- "hello world"

. display r(b)

```
hello world
```

Note that a vector is returned as a *string* macro in Stata, but you can easily destring it. Stata does not return *rclass* numeric lists (to my knowledge). Nevertheless, if you want to access an R vector in Stata, now you can:

. R: c <- c(1:5)

. display r(c)

```
1 2 3 4 5
```
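If you need the vector as numbers rather than as a string, one option is to rebuild it as a Stata matrix. The sketch below assumes that `r(c)` holds the space-separated string shown above; the names `values`, `csv`, and `V` are only illustrative and not part of the package.

```stata
* Create the vector in R, then copy the returned string into a local
* before another r-class command overwrites r()
R: c <- c(1:5)
local values "`r(c)'"

* Turn "1 2 3 4 5" into "1,2,3,4,5" so it can be used in a matrix definition
local csv : subinstr local values " " ",", all

* Build a row vector from the comma-separated values and inspect it
matrix V = (`csv')
matrix list V
```

The commands are written in do-file form, without the leading dot used for console input elsewhere in this post.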

Excitingly, you can also create a matrix in R and access it simultaneously in Stata, any time you make a change to it. For example:

. R: A = matrix(1:6, nrow=2, byrow = TRUE)

. R: A

```
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
```

And now view the matrix in Stata! That simple!

. mat list r(A)

```
r(A)[2,3]
c1 c2 c3
r1 1 2 3
r2 4 5 6
```

Accessing lists is trickier, but it still happens automatically. Stata returns each element of a list as a separate *rclass* scalar or macro. The biggest difference is that an *rclass* name **cannot include the `$` sign**, so an element that R addresses as `mylist$x` is returned as `r(mylist_x)`, as the example below shows.

. R: mylist <- list(x="character", y=c(1:10))

. display r(mylist_x)

```
character
```

. display r(mylist_y)

```
1 2 3 4 5 6 7 8 9 10
```

As noted earlier, without the **vanilla** subcommand, R is executed interactively within Stata and "most" of the R objects are accessible in Stata. You can see the list of available objects in Stata by typing `return list`:

. return list

```
scalars:
r(a) = 99
macros:
r(mylist_y) : "1 2 3 4 5 6 7 8 9 10"
r(mylist_x) : "character"
r(b) : "hello world"
r(c) : "1 2 3 4 5"
matrices:
r(A) : 2 x 3
```

So far I have documented how R variables can be accessed within Stata. This package is under constant development, and I expect to import more R classes to Stata automatically (currently it only imports *numeric*, *character*, *matrix*, and *list*).

Now I show how to pass data from Stata to R. In general, passing local and global macros is the simplest:

. global a 2016

. R: a <- $a

. display r(a)

```
2016
```
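Local macros can be passed the same way, because Stata expands the macro before the line is handed to R. The following is a minimal sketch that assumes the behavior shown above; the macro names `year` and `greeting` and the R object names `y` and `g` are made up for illustration.

```stata
* A numeric local macro is expanded by Stata before the command reaches R
local year 2016
R: y <- `year'
display r(y)

* A string macro must be wrapped in double quotes so that R receives
* a character value rather than an unknown object name
local greeting hello
R: g <- "`greeting'"
display r(g)
```

Because the expansion happens on the Stata side, R only ever sees the macro's contents, which is why a string needs the surrounding quotes.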

But when it comes to *scalars*, *matrices*, and *data sets*, it becomes more complicated. Similar to passing a *scalar*, *matrix*, or *data set* to **Mata**, **Rcall** defines three functions for passing these classes to R, plus a fourth, `load.data()`, for loading an R data frame in Stata.

function | description |
---|---|
`st.scalar()` | passes a numeric or string scalar to R |
`st.matrix()` | passes a matrix to R |
`st.data()` | passes a Stata data set to R |
`load.data()` | loads an R data frame in Stata |

Below, I demonstrate how to use these functions.

`st.scalar()` function

. scalar a = 999

. R: (a <- st.scalar(a))

```
[1] 999
```

. scalar a = "String Scalar"

. R: (a <- st.scalar(a))

```
[1] "String Scalar"
```

`st.matrix()` function

As shown in the example below, you can pass your Stata matrices to R, manipulate them, and automatically get the resulting matrix back in Stata:

. matrix A = (1,2\3,4)

. matrix B = (96,96\96,96)

. R: C <- st.matrix(A) + st.matrix(B)

. R: C

```
[,1] [,2]
[1,] 97 98
[2,] 99 100
```

. mat list r(C) //Matrix C in Stata

```
r(C)[2,2]
c1 c2
r1 97 98
r2 99 100
```
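The same pattern works for any R function that takes and returns a matrix. As a small sketch, the example below inverts a Stata matrix with R's `solve()`; the object name `Ainv` is only illustrative, and the returned `r(Ainv)` is assumed to follow the same convention as `r(C)` above.

```stata
* Define a small invertible matrix in Stata
matrix A = (1,2\3,4)

* Invert it in R; solve() with a single matrix argument returns the inverse
R: Ainv <- solve(st.matrix(A))

* The result should come back as an rclass matrix, like r(C) above
matrix list r(Ainv)
```

Stata has its own `inv()` function, so this is purely a demonstration of the round trip rather than something you would need R for.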

`st.data()` function

Finally, you can also pass a Stata data set to R. If the data set is stored on your machine, you should provide the relative or absolute path to the file. For example, using the absolute path to **auto.dta** on my machine:

. R: mydata <- st.data(/Applications/Stata/ado/base/a/auto.dta)

. R: head(mydata)

```
make price mpg rep78 headroom trunk weight length turn displacement
1 AMC Concord 4099 22 3 2.5 11 2930 186 40 121
2 AMC Pacer 4749 17 3 3.0 11 3350 173 40 258
3 AMC Spirit 3799 22 NA 3.0 12 2640 168 35 121
4 Buick Century 4816 20 3 4.5 16 3250 196 40 196
5 Buick Electra 7827 15 4 4.0 20 4080 222 43 350
6 Buick LeSabre 5788 18 3 4.0 21 3670 218 43 231
gear_ratio foreign
1 3.58 Domestic
2 2.53 Domestic
3 3.08 Domestic
4 2.93 Domestic
5 2.41 Domestic
6 2.73 Domestic
```

If you leave the **st.data()** function empty, it passes the data set currently loaded in Stata to R. For example:

. sysuse auto, clear

```
(1978 Automobile Data)
```

. keep price mpg

. R: mydata <- st.data()

. R: head(mydata)

```
price mpg
1 4099 22
2 4749 17
3 3799 22
4 4816 20
5 7827 15
6 5788 18
```
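Once the data are in R, you can compute something there and read the result back in Stata. The sketch below is one way to do this under the behavior described earlier, where a single number defined in R comes back as an *rclass* scalar. The object name `rho` is made up for illustration. The R code uses `with()` instead of R's `$` operator, because Stata would expand `$price` as a global macro before the line reaches R.

```stata
* Load the example data and keep two variables, as in the example above
sysuse auto, clear
keep price mpg

* Pass the data in memory to R and compute a correlation there
R: mydata <- st.data()
R: rho <- with(mydata, cor(price, mpg))

* A single number defined in R should come back as an rclass scalar
display r(rho)

* Compare with Stata's own estimate
correlate price mpg
```

Running `correlate` last matters here, since it is itself an r-class command and would overwrite the results returned from R.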

`load.data()` function

You can also load a data frame from R into Stata. This automatically clears any data you have loaded in Stata, so be careful with that! Nevertheless, the function can be very useful for quickly passing a data frame from R to Stata. It exports a Stata version 11 data set using the **foreign** R package and loads it in Stata:

. clear

. R: mydata <- data.frame(cars)

. R: load.data(mydata)

The **mydata** data frame is now loaded in Stata! You can simply continue your analysis in Stata:

. list in 1/2

```
+--------------+
| speed dist |
|--------------|
1. | 4 2 |
2. | 4 10 |
+--------------+
```

Now it's your turn to test the package, fork it on GitHub, and contribute to it. Integrating R with Stata this closely can really ease the process of running a computation in R and passing the results or variables back to Stata.

The package requires **R** to be installed on the machine.
The package detects R in the default paths based on the operating system.
The easiest way to see if R is accessible is to execute a command in R:

. R: print("Hello World")

` [1] "Hello World" `

If R is not accessible, you can also permanently
set up the path to R using the **setpath** subcommand:

. Rcall setpath "/usr/bin/R"

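When you work with **Rcall** interactively, objects stay in R's memory between calls. The commands below remove those objects and delete the saved workspace file: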

. R: rm(list=ls())

. R: unlink(".RData")

However, the commands above do not erase the attached packages and data sets.
You can view the attached objects in your R environment using `search()` and detach them by name using `detach()`:

. R:

```
------------------------------------------------- R (type end to exit) ----------
. attach(cars)
. library(Rcpp) # make sure you have it installed
. search() # Output is omitted ...
.
. detach(cars)
. detach("package:Rcpp")
---------------------------------------------------------------------------------------
```