Machine Learning in R
What is machine learning? According to Wikipedia, "Machine learning is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence." In other words, it uses previously collected data to predict outcomes for new data. It can also help us understand correlations between different events or ideas in the world.
There are multiple types of machine learning. Two of the most common are regression and classification. In both, we train a model on previously collected data whose outcomes are known (the training data) and then use that model to predict outcomes for the testing data (data whose outcomes the model has not seen). Regression is used when the output we want to predict is numerical. Classification is used when the output we want to predict is categorical. In this post, we will mainly focus on classification.
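To make the distinction concrete, here is a minimal sketch using only base R and its built-in `cars` and `mtcars` datasets. It is not part of ZetaMachina. The choice of datasets, the use of logistic regression (`glm`) as the classifier, and the 0.5 cutoff are all assumptions made purely for illustration.

```r
# Regression: the outcome (stopping distance) is a number.
reg_model <- lm(dist ~ speed, data = cars)
predicted_dist <- predict(reg_model, newdata = data.frame(speed = 15))

# Classification: the outcome (transmission type) is a category.
mt <- transform(
  mtcars,
  am = factor(am, levels = c(0, 1), labels = c("automatic", "manual"))
)
# Logistic regression models the probability of the second level ("manual").
clf_model <- glm(am ~ wt, data = mt, family = binomial)
prob_manual <- predict(
  clf_model,
  newdata = data.frame(wt = c(2.0, 4.0)),
  type = "response"
)
# Turn probabilities into labels with an (assumed) 0.5 cutoff.
predicted_am <- factor(
  ifelse(prob_manual > 0.5, "manual", "automatic"),
  levels = levels(mt$am)
)
```

The regression model returns a number for each new row, while the classifier returns one of a fixed set of labels.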
We have built ZetaMachina, an implementation of classification in R. This post will not show the full code, but please check the GitHub repo for it.
In this project we created our own people data frame as a demo. We create a female and a male data frame and then merge them. The combined data frame describes people with three attributes: race, hair color, and gender. The two categorical input variables are race and hair color, and the output variable, gender, is also categorical.
female <- data.frame(race = femaleRace, hair = femaleHair, gender = "F")
# male is built the same way
people <- rbind(female, male)
count <- 1:1000
ind <- sample(count, 800, replace = FALSE)
training <- people[ind, ]
testing <- people[-ind, ]
We have now created our people data frame, as well as the training data for our model and our testing data.
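The post does not show how femaleRace and femaleHair are generated, so the following is only a sketch of one way to build a comparable data frame and an 800/200 split in base R. The category labels, the 500/500 group sizes, the sampling probabilities, the seed, and the helper name `make_group` are all assumptions, not the values used in the project.

```r
set.seed(42)  # assumption: fixed seed so the split is reproducible

races <- c("A", "B", "C")               # placeholder category labels
hairs <- c("black", "brown", "blonde")  # placeholder category labels

# Hypothetical helper: draw n people of one gender with random attributes.
make_group <- function(n, gender, hair_prob) {
  data.frame(
    race = sample(races, n, replace = TRUE),
    hair = sample(hairs, n, replace = TRUE, prob = hair_prob),
    gender = gender,
    stringsAsFactors = TRUE
  )
}

female <- make_group(500, "F", c(0.2, 0.3, 0.5))
male <- make_group(500, "M", c(0.5, 0.3, 0.2))
people <- rbind(female, male)

# Pick 800 distinct row indices for training; the remaining 200 are for testing.
ind <- sample(nrow(people), 800)
training <- people[ind, ]
testing <- people[-ind, ]
```

Because `sample` draws without replacement by default, no row can end up in both sets.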
We then create our model using ZetaNaiveBayes. Naive Bayes assumes that the input features are independent of one another given the class.
# Predictors are race and hair, the response is gender (see the repo for the exact signature)
model <- ZetaNaiveBayes(training, c("race", "hair"), "gender")
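To show what the independence assumption buys us, here is a minimal from-scratch sketch of a categorical Naive Bayes classifier in base R. It is not ZetaMachina's implementation. The function names `naive_bayes_fit` and `naive_bayes_predict`, the add-one (Laplace) smoothing, and the toy data are all assumptions made for illustration. The idea is that the score of each class is its prior probability multiplied by one conditional probability per feature, which we add up in log space.

```r
# Hypothetical helper: estimate class priors and per-feature conditional tables.
naive_bayes_fit <- function(data, features, target, laplace = 1) {
  y <- factor(data[[target]])
  classes <- levels(y)
  prior <- as.numeric(table(y)) / length(y)
  cond <- lapply(features, function(f) {
    counts <- unclass(table(y, factor(data[[f]]))) + laplace
    counts / rowSums(counts)  # rows: class, columns: feature value
  })
  names(cond) <- features
  list(prior = prior, cond = cond, features = features, classes = classes)
}

# Hypothetical helper: pick the class with the highest log-score per row.
# Assumes every feature value in newdata was seen during training.
naive_bayes_predict <- function(model, newdata) {
  scores <- matrix(log(model$prior), nrow = nrow(newdata),
                   ncol = length(model$classes), byrow = TRUE)
  for (f in model$features) {
    vals <- as.character(newdata[[f]])
    # Independence assumption: each feature contributes its own log term.
    scores <- scores + t(log(model$cond[[f]][, vals, drop = FALSE]))
  }
  factor(model$classes[max.col(scores, ties.method = "first")],
         levels = model$classes)
}

toy <- data.frame(
  race = c("A", "A", "B", "B", "A", "B"),
  hair = c("black", "brown", "black", "blonde", "brown", "blonde"),
  gender = c("M", "M", "M", "F", "F", "F")
)
fit <- naive_bayes_fit(toy, c("race", "hair"), "gender")
new_people <- data.frame(race = c("A", "B"), hair = c("black", "blonde"))
toy_prediction <- naive_bayes_predict(fit, new_people)
```

Working in log space avoids multiplying many small probabilities together, and the smoothing keeps a feature value that never co-occurred with a class from forcing that class's score to zero.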
Now we create a prediction based on the model and our testing data.
prediction <- ZetaPredict(model, testing)
prediction <- factor(prediction, levels = c("M", "F"))
If we print out the result we get our prediction:
#console
[1] M M M M F M M M M M M M M M M M M M M M M F M F M M F M M M F M M M M F M M M M M M M M F M M M M M F M M M M M
[57] M M M M M M M M M M M M M M M F M M M M M M M M M M M M F M F M M M M M M M M M M M M M F M F F F M F F F M M F
[113] F F F F F F M F M M F F F F M F F F F F F M F F M M F F F F F F M F F F M F F F M F F F F F F M F F F F F M F F
[169] F F F F M F F F M F F F M F F M M F F F F M M F F M F M F F F F
Levels: M F
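A natural next step is to compare the predictions with the true labels. The sketch below uses two short, made-up factor vectors rather than the real output above, and the variable names `actual` and `predicted` are placeholders. It relies only on base R's `table` and `mean`.

```r
# Made-up labels for illustration only; not the results printed above.
actual <- factor(c("M", "M", "F", "F", "M", "F"), levels = c("M", "F"))
predicted <- factor(c("M", "F", "F", "F", "M", "M"), levels = c("M", "F"))

# Confusion matrix: rows are predicted labels, columns are true labels.
confusion <- table(predicted = predicted, actual = actual)

# Accuracy: the share of rows where the prediction matches the truth.
accuracy <- mean(predicted == actual)
```

With the data from this post, the same idea would be `mean(as.character(prediction) == as.character(testing$gender))`, assuming the testing set keeps its gender column. Converting to character first avoids an error when the two factors have different level sets.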
If you enjoyed this project, visit our other projects on our GitHub page.