Exploratory

Custom Model Function

Exploratory provides a framework that lets users define and use custom model functions. This page introduces how to do it.

Custom Model Function Overview

The build_model function defined in the exploratory R package, together with functions in the broom package, makes it easy to use model functions in Exploratory. The steps are explained below, using the **svm** function from the e1071 package as an example.

Model Class Name

First, we need to determine the R class name for the model. This is the class name of the object returned by the model building function. In this case, it is "svm", the model class name used by the e1071 package that we are using.
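To check the class name of a model object, you can call class() on it. The sketch below uses lm from base R as a stand-in so that it runs without extra packages; with e1071 installed, the same check on an svm fit should report "svm" among its classes.

```r
# Fit a small model with base R; lm is only a stand-in for e1071::svm here.
fit <- lm(mpg ~ wt, data = mtcars)

# class() returns the class name(s) that S3 functions such as glance and tidy dispatch on.
model_class <- class(fit)
print(model_class)

# inherits() is the usual way to test for a class, because an object can have several.
is_lm <- inherits(fit, "lm")
```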

Model Building Function

Next, we need to define a model building function. The model building function needs to take the following arguments.

- formula: Formula that defines what to predict from which predictors.
- data: Training data for building the model.
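As a small illustration of the formula argument, the sketch below builds a formula object and lists the variables it names. The column names are the ones used in the build_model example later on this page; the variable names model_formula, target and predictors are only illustrative.

```r
# A formula is an ordinary R object; no data is touched until a model function uses it.
model_formula <- subscribed ~ age + housing + duration

# all.vars() lists the variable names in order of appearance.
variables <- all.vars(model_formula)
target <- variables[1]       # left-hand side: what to predict
predictors <- variables[-1]  # right-hand side: what to predict from
```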

The model building function needs to return a model object with the model class name we determined in the previous step. This means the function definition should look like the following.

    svm <- function(formula, data, ...) {
      # Create model object here.

      # Set the model class name of the model object.
      class(model_object) <- c("svm")
      model_object
    }

Fortunately, in this case, the e1071 package already has such a function, which means we don't need to implement it. Note that the function name in e1071 happens to be "svm", which is the same as the model class name, but the model building function name and the model class name generally do not have to be the same.
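For a model that no package provides, the template could be filled in along the following lines. This is a minimal sketch using only base R: mean_model is a hypothetical toy model that always predicts the mean of the target column, and it is not part of Exploratory or e1071.

```r
# Hypothetical toy model building function: takes formula and data, returns a classed object.
mean_model <- function(formula, data, ...) {
  # Pull the target column named on the left-hand side of the formula.
  frame <- stats::model.frame(formula, data = data)
  response <- stats::model.response(frame)

  # Create the model object.
  model_object <- list(
    formula = formula,
    mean = mean(response),
    n = length(response)
  )

  # Set the model class name of the model object.
  class(model_object) <- c("mean_model")
  model_object
}

# Usage with a built-in data set.
fit <- mean_model(mpg ~ wt, data = mtcars)
```

Here the function name and the class name are both "mean_model", but only the class name matters for the summary functions defined later.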

Install R Package

Instead of defining a model building function, we use e1071's svm function as the model building function. To do so, we need to install the e1071 package.

There are many R packages with model building functions in this format, which you can use as they are, just like e1071.

The package you need may or may not already be installed as part of the Exploratory installation. You can check which packages are already installed from the project list view.

You can install an R package from the install tab. In this case, the package is e1071.
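Whether a package is available can also be checked from R code. The following is a minimal sketch in plain R and does not use any Exploratory-specific API; the variable names are illustrative.

```r
# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise,
# without attaching it to the search path.
package_name <- "e1071"
is_installed <- requireNamespace(package_name, quietly = TRUE)

if (is_installed) {
  message(package_name, " is available.")
} else {
  message(package_name, " is not installed yet.")
}
```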

Define Functions to Show Model Summary

Exploratory shows a summary of the model in table format when it is created. Let's define functions that extract the model summary information from the model object as data frames.

You can define those functions in **Scripts**.

In the svm case, you can define the functions as shown below. Note that we do not need to define a model building function here because we are using e1071's svm function as is. If that is not the case for the model you are using, you will need to define it here too.

glance function for svm class

    glance.svm <- function(x, ...){
      data.frame(
        total_number_of_support_vector = x$tot.nSV,
        degree = x$degree,
        gamma = x$gamma,
        epsilon = x$epsilon,
        rho = x$rho
      )
    }

tidy function for svm class

    tidy.svm <- function(x, ...){
      as.data.frame(x$SV)
    }

Here are some explanations of the code above.

- **x** is the model object, and these functions are expected to return a **data frame**.
- Usually, glance returns a one-row data frame of statistical values and tidy returns a data frame with multiple rows, but this is only a convention.
- The `...` argument is necessary to avoid errors, even if you don't need extra arguments.
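The shapes of the returned data frames can be checked without fitting a real model. The sketch below repeats the two definitions and calls them on a hand-made stand-in object; fake_svm and its values are invented for illustration and are not the output of e1071::svm.

```r
glance.svm <- function(x, ...) {
  data.frame(
    total_number_of_support_vector = x$tot.nSV,
    degree = x$degree,
    gamma = x$gamma,
    epsilon = x$epsilon,
    rho = x$rho
  )
}

tidy.svm <- function(x, ...) {
  as.data.frame(x$SV)
}

# Stand-in object with the fields the two functions read (made-up values).
fake_svm <- structure(
  list(
    tot.nSV = 3L, degree = 3, gamma = 0.5, epsilon = 0.1, rho = -0.2,
    SV = matrix(c(0.1, 0.2, 0.3, 1.1, 1.2, 1.3), nrow = 3,
                dimnames = list(NULL, c("age", "duration")))
  ),
  class = "svm"
)

# The extra argument is absorbed by `...` instead of causing an error.
summary_row <- glance.svm(fake_svm, unused_argument = TRUE)
support_vectors <- tidy.svm(fake_svm)
```

Here glance.svm yields a single row and tidy.svm yields one row per support vector, matching the convention described in the list.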

Apply Model Function to a Data Frame

You can now call the model function from command line mode. Open the data frame to which you want to apply the model function. Here, svm is applied to this data.

You can now run the command below on this data.

    build_model(model_func = e1071::svm, formula = subscribed ~ age + housing + duration, kernel = "linear", test_rate = 0.2, seed = 1)

You can see the model summary view like this. "Summary of Fit" is the result from the **glance** function and "Parameter Estimates" is the result from the **tidy** function.

Here, test_rate and seed are arguments for build_model. **test_rate = 0.2** means that 20% of the data is used for testing and 80% is used for training. **seed = 1** sets the random seed used when sampling to split the data into training and test sets. **formula = subscribed ~ age + housing + duration** and **kernel = "linear"** are parameters for e1071::svm. You will see a summary of the created model like this.
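To make the roles of test_rate and seed concrete, here is a sketch of a seeded 80/20 split in base R. It illustrates the idea only and is not the actual implementation inside build_model, which may sample or round differently; mtcars is used as stand-in data.

```r
# Fix the random seed so the same rows are chosen on every run.
set.seed(1)

test_rate <- 0.2
row_count <- nrow(mtcars)

# Sample row indices for the test set; the remaining rows become training data.
test_index <- sample(row_count, size = floor(row_count * test_rate))
test_data <- mtcars[test_index, ]
training_data <- mtcars[-test_index, ]
```

With 32 rows, this puts 6 rows in the test set and 26 in the training set. Changing the seed changes which rows are selected, but not how many.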

Use the Model for Prediction

You can also use the model for prediction by defining an **augment** function for the model. augment also comes from the broom package. For svm, for example, you can define it like this.

    augment.svm <- function(x, data = NULL, newdata = NULL, ...) {
      if(is.null(newdata)){
        if(is.null(data)){
          stop("data or newdata is needed")
        }
        # No new data: attach the fitted values to the training data.
        data$predicted_value <- x$fitted
        data
      } else {
        # New data: predict with the model and attach the predictions.
        predicted <- predict(x, newdata)
        newdata$predicted_value <- predicted
        newdata
      }
    }

The result looks like this.
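The two branches of this pattern can be tried without e1071 by applying the same logic to a base R lm model. In the sketch below, augment_sketch is a hypothetical name, and stats::fitted() takes the place of the svm-specific x$fitted field.

```r
# Same structure as augment.svm, written against a base R lm model for illustration.
augment_sketch <- function(x, data = NULL, newdata = NULL, ...) {
  if (is.null(newdata)) {
    if (is.null(data)) {
      stop("data or newdata is needed")
    }
    # No new data: attach the fitted values to the training data.
    data$predicted_value <- stats::fitted(x)
    data
  } else {
    # New data: predict with the model and attach the predictions.
    newdata$predicted_value <- stats::predict(x, newdata)
    newdata
  }
}

fit <- lm(mpg ~ wt, data = mtcars)

# Fitted values on the training data.
on_training <- augment_sketch(fit, data = mtcars)

# Predictions for unseen rows.
on_new <- augment_sketch(fit, newdata = data.frame(wt = c(2.5, 3.5)))
```

In both cases the input rows come back with one extra predicted_value column, which is the shape the augment.svm definition also produces.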
