2015. 9. 26. 20:58


Posted by Name_null

2015. 9. 26. 20:50

Machine Learning & Data Mining

Deep Learning Tutorial (Reposted)

http://khanrc.tistory.com/entry/Deep-Learning-Tutorial

2015/05/24 17:12

Posted in DataScience/Deep Learning by khanrc

A Deep Learning Tutorial: From Perceptrons to Deep Networks

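A summarized translation of the article linked above.

It walks through the key concepts and algorithms of deep learning. I have left it all out here, but the original article also presents Java code. The less familiar I am with a part, the more literally I translate it, so the early sections are mostly summarized while the later sections include nearly all of the original content.

Perceptrons: Early Deep Learning Algorithms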

License: Attribution-NonCommercial


2015. 9. 26. 20:36

Machine Learning & Data Mining

Deep Learning At NAVER from NAVER D2

[2A4]DeepLearningAtNAVER from NAVER D2


2015. 9. 26. 20:01

Machine Learning & Data Mining

Large scale deep-learning_on_spark (Reposted)

[264] large scale deep-learning_on_spark from NAVER D2


2015. 9. 26. 15:06

Machine Learning & Data Mining

Essentials of Machine Learning Algorithms #6. Dimensionality Reduction Algorithms and Gradient Boosting & AdaBoost

9. Dimensionality Reduction Algorithms

In the last 4-5 years, there has been an exponential increase in data capture at every possible stage. Corporations, government agencies, and research organisations are not only tapping new data sources but also capturing data in much greater detail.

For example, e-commerce companies capture more details about customers, such as their demographics, web browsing history, likes and dislikes, purchase history, feedback, and more, to give them more personalized attention than even the nearest grocery shopkeeper could.

As data scientists, the data we are offered also consists of many features. This sounds good for building a robust model, but there is a challenge: how would you identify the most significant variable(s) out of 1,000 or 2,000? In such cases, dimensionality reduction algorithms help us, along with various other techniques such as Decision Tree, Random Forest, PCA, Factor Analysis, identification based on the correlation matrix, missing value ratio, and others.

When there are thousands of variables, running a decision tree or a regression on them directly causes many problems. This is where dimensionality reduction algorithms are needed.

To learn more about these algorithms, you can read “Beginners Guide To Learn Dimension Reduction Techniques”.
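
Two of the simpler filters named above, missing value ratio and identification based on the correlation matrix, can be sketched in plain Python. This is a minimal illustration rather than the article's own code: the function names, the toy columns, and the thresholds (0.4 and 0.95) are all assumptions chosen for the example.

```python
import math

def missing_ratio(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)

def pearson(a, b):
    """Pearson correlation over the rows where both values are present."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mean_a = sum(x for x, _ in pairs) / n
    mean_b = sum(y for _, y in pairs) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    var_a = sum((x - mean_a) ** 2 for x, _ in pairs)
    var_b = sum((y - mean_b) ** 2 for _, y in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_a * var_b)

def select_features(data, max_missing=0.4, max_corr=0.95):
    """Drop columns with too many missing values, then drop any column that
    is almost perfectly correlated with a column that was already kept."""
    kept = []
    for name, column in data.items():
        if missing_ratio(column) > max_missing:
            continue
        if any(abs(pearson(column, data[k])) > max_corr for k in kept):
            continue
        kept.append(name)
    return kept

# Hypothetical toy data set: four features, five rows
data = {
    "age": [23, 35, 41, 52, 29],
    "age_in_months": [276, 420, 492, 624, 348],  # exact multiple of "age"
    "income": [30, 52, 48, 61, 90],
    "fax_number": [None, None, None, 1, None],   # mostly missing
}
selected = select_features(data)
print(selected)
```

Here "fax_number" is dropped for being mostly missing and "age_in_months" for duplicating "age", so only the informative, non-redundant columns survive. Both thresholds are judgment calls that depend on the data set.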

Python Code

# Import library
from sklearn import decomposition
# Assumes you have training and test data sets named train and test
# Create PCA object; k is the number of components to keep
# (if n_components is omitted, the default is min(n_samples, n_features))
pca = decomposition.PCA(n_components=k)
# For factor analysis instead:
# fa = decomposition.FactorAnalysis()
# Reduce the dimension of the training data set using PCA
train_reduced = pca.fit_transform(train)
# Reduce the dimension of the test data set with the projection fitted on train
test_reduced = pca.transform(test)
# For more detail on this, please refer to this link.

R Code

library(stats)
pca <- princomp(train, cor = TRUE)
train_reduced  <- predict(pca,train)
test_reduced  <- predict(pca,test)
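
The Python snippet leaves the number of components k to the reader. One common way to choose it is to look at how much of the total variance each principal component explains. The sketch below computes this by hand for the first component, using power iteration on the covariance matrix. It is a standard-library illustration on assumed toy data, and the helper names are hypothetical. In practice, a fitted scikit-learn PCA object reports the same quantity through its explained_variance_ratio_ attribute.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=200):
    """Power iteration: largest eigenvalue and its unit eigenvector.
    Assumes the start vector is not orthogonal to that eigenvector."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return eigenvalue, v

# Hypothetical toy data: the second feature is roughly twice the first
train = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
cov = covariance_matrix(train)
eigenvalue, direction = top_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_ratio = eigenvalue / total_variance
print(round(explained_ratio, 4), direction)
```

Because the two toy features are almost perfectly linearly related, the first component alone explains nearly all of the variance, which suggests k = 1 would lose very little information for this data.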

10. Gradient Boosting & AdaBoost

GBM and AdaBoost are boosting algorithms used when we have plenty of data and need predictions with high predictive power. Boosting is an ensemble learning technique that combines the predictions of several base estimators to improve robustness over a single estimator. It combines multiple weak or average predictors to build a strong predictor. These boosting algorithms often perform well in data science competitions such as Kaggle, AV Hackathon, and CrowdAnalytix.

More: Know about Gradient and AdaBoost in detail
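
To make "combining weak predictors into a strong one" concrete, here is a minimal AdaBoost sketch with one-threshold decision stumps as the weak learners. It is written from the textbook description of the algorithm, not taken from the article, and the function names and toy data are assumptions for illustration.

```python
import math

def stump_predict(x, threshold, polarity):
    """Decision stump: predict `polarity` below the threshold, else the opposite."""
    return polarity if x < threshold else -polarity

def best_stump(xs, ys, weights):
    """Pick the stump with the lowest weighted error on the training data."""
    values = sorted(set(xs))
    best = None
    for a, b in zip(values, values[1:]):
        threshold = (a + b) / 2
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost_fit(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) weak learners."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified points, lower the weight of the rest
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

def accuracy(ensemble, xs, ys):
    return sum(adaboost_predict(ensemble, x) == y for x, y in zip(xs, ys)) / len(xs)

# Toy 1-D data (labels are +1 / -1) that no single threshold can separate
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = adaboost_fit(xs, ys, rounds=1)
boosted = adaboost_fit(xs, ys, rounds=3)
print(accuracy(single, xs, ys), accuracy(boosted, xs, ys))
```

On this toy data a single stump gets 7 of the 10 points right, while the weighted vote of three stumps classifies all ten training points correctly. Each round concentrates on the points the previous stumps got wrong, which is the core idea behind boosting.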

Python Code

# Import library
from sklearn.ensemble import GradientBoostingClassifier
# Assumes you have X (predictors) and y (target) for the training data set and x_test (predictors) of the test data set
# Create Gradient Boosting Classifier object
model = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0)
# Train the model using the training set
model.fit(X, y)
# Predict output
predicted = model.predict(x_test)

R Code

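library(caret)
# Combine predictors and target; the target column must be named y to match the formula below
x <- data.frame(x_train, y = y_train)
# Fitting model
fitControl <- trainControl(method = "repeatedcv", number = 4, repeats = 4)
fit <- train(y ~ ., data = x, method = "gbm", trControl = fitControl, verbose = FALSE)
# Predicted probability of the second class (requires y to be a factor)
predicted <- predict(fit, x_test, type = "prob")[, 2]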

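GradientBoostingClassifier and Random Forest are two different tree-ensemble classifiers (gradient boosting builds its trees sequentially, while Random Forest is a bagging method that trains its trees independently), and people often ask about the difference between these two algorithms.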
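
The main difference is in how the trees are built. Gradient boosting adds trees one after another, each fitted to the errors the current ensemble still makes, whereas a Random Forest trains its trees independently on bootstrap samples and averages them. The sketch below shows only the sequential, residual-fitting side, on a toy regression problem with one-split trees (stumps). The function names, data, and learning rate are illustrative assumptions. scikit-learn's GradientBoostingClassifier applies the same idea with a classification loss.

```python
def fit_stump(xs, targets):
    """Regression stump: choose the split that minimises squared error,
    predicting the mean target on each side of the threshold."""
    values = sorted(set(xs))
    best = None
    for a, b in zip(values, values[1:]):
        threshold = (a + b) / 2
        left = [t for x, t in zip(xs, targets) if x < threshold]
        right = [t for x, t in zip(xs, targets) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def stump_value(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def gradient_boost_fit(xs, ys, n_estimators, learning_rate):
    """Each new stump is fitted to the residuals (the negative gradient of
    squared loss) left by the ensemble built so far."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_value(stump, x)
                       for x, p in zip(xs, predictions)]
    return base, stumps

def gradient_boost_predict(model, x, learning_rate):
    base, stumps = model
    return base + learning_rate * sum(stump_value(s, x) for s in stumps)

def mse(model, xs, ys, learning_rate):
    return sum((gradient_boost_predict(model, x, learning_rate) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)

# Hypothetical toy regression data
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8, 7.1, 8.0]
rate = 0.5
few = gradient_boost_fit(xs, ys, n_estimators=1, learning_rate=rate)
many = gradient_boost_fit(xs, ys, n_estimators=20, learning_rate=rate)
print(mse(few, xs, ys, rate), mse(many, xs, ys, rate))
```

The training error shrinks as more stumps are added, because every stump corrects part of what the previous ones missed. A bagging method such as Random Forest has no such dependency between trees, so its trees can be trained in parallel.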

End Notes

By now, I am sure you have an idea of the commonly used machine learning algorithms. My sole intention behind writing this article and providing the code in R and Python is to get you started right away. If you are keen to master machine learning, start right away. Take up problems, develop a physical understanding of the process, apply this code, and see the fun!


