Python basics
sklearn standard scaler
난개발자
2024. 8. 27. 21:19
When standardizing data with sklearn's StandardScaler, note that it uses the biased (population) standard deviation, i.e. ddof=0.
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html
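For reference, the two standard deviation conventions can be written out as follows. The symbols here (n for the number of samples in a column, x_i for a value, \bar{x} for the column mean, z_i for the standardized value) are notation introduced for this sketch, not taken from the sklearn documentation.

```latex
\sigma_{\mathrm{ddof}} = \sqrt{\frac{1}{n - \mathrm{ddof}} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2},
\qquad
z_i^{(\mathrm{ddof})} = \frac{x_i - \bar{x}}{\sigma_{\mathrm{ddof}}}

% Relation between the two conventions:
z_i^{(1)} = z_i^{(0)} \sqrt{\frac{n - 1}{n}}
```

With ddof=0 the sum is divided by n (biased), and with ddof=1 by n-1 (unbiased). The two results differ only by the constant factor sqrt((n-1)/n), so the gap shrinks as n grows.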
If we build a sample as below and apply StandardScaler, we can see that the result matches the calculation that uses std with ddof=0.
In [2]:
import numpy as np
import pandas as pd
import sklearn
In [3]:
A=np.random.sample(10)*5+2
B=np.random.sample(10)*2+3
C=np.random.sample(10)*3-1
df=pd.DataFrame({'A':A, 'B':B, 'C':C})
df
Out[3]:
| | A | B | C |
|---|---|---|---|
| 0 | 3.609154 | 3.408440 | 1.151811 |
| 1 | 6.227276 | 3.419386 | 0.669558 |
| 2 | 6.363483 | 4.949257 | 0.655796 |
| 3 | 4.491161 | 3.223420 | -0.369682 |
| 4 | 2.144723 | 4.559320 | 1.054213 |
| 5 | 2.170273 | 4.978961 | 1.713408 |
| 6 | 5.584039 | 3.042569 | 0.886624 |
| 7 | 5.224212 | 4.663551 | -0.564353 |
| 8 | 4.030533 | 3.552307 | 1.247374 |
| 9 | 5.134493 | 3.350118 | 1.917439 |
In [4]:
from sklearn.preprocessing import StandardScaler
std_scaler=StandardScaler()
print('sklearn standard scaler\n',std_scaler.fit_transform(df))
print('\nddof=1\n',(df-df.mean())/df.std(ddof=1))
print('\nddof=0\n',(df-df.mean())/df.std(ddof=0))
sklearn standard scaler
[[-0.62003685 -0.69093336 0.41679539]
[ 1.20643463 -0.67599531 -0.22010499]
[ 1.30145605 1.41180608 -0.2382801 ]
[-0.00472555 -0.94342769 -1.59260461]
[-1.64166337 0.87966226 0.28790019]
[-1.62383922 1.45234197 1.15848269]
[ 0.75769539 -1.19023447 0.06656856]
[ 0.50667029 1.02190528 -1.84970211]
[-0.32607169 -0.49459958 0.54300297]
[ 0.44408032 -0.7705252 1.42794202]]
ddof=1
A B C
0 -0.588219 -0.655477 0.395407
1 1.144524 -0.641305 -0.208810
2 1.234670 1.339357 -0.226052
3 -0.004483 -0.895014 -1.510877
4 -1.557419 0.834521 0.273126
5 -1.540509 1.377813 1.099033
6 0.718813 -1.129156 0.063152
7 0.480670 0.969464 -1.754781
8 -0.309339 -0.469218 0.515138
9 0.421292 -0.730984 1.354665
ddof=0
A B C
0 -0.620037 -0.690933 0.416795
1 1.206435 -0.675995 -0.220105
2 1.301456 1.411806 -0.238280
3 -0.004726 -0.943428 -1.592605
4 -1.641663 0.879662 0.287900
5 -1.623839 1.452342 1.158483
6 0.757695 -1.190234 0.066569
7 0.506670 1.021905 -1.849702
8 -0.326072 -0.494600 0.543003
9 0.444080 -0.770525 1.427942
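Instead of comparing the printed numbers by eye, the same comparison can be checked programmatically. The following is a minimal sketch, assuming a fixed seed (0) and variable names that are not part of the original notebook. It also shows one way to get ddof=1 scaling out of StandardScaler's output, by multiplying by sqrt((n-1)/n).

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Assumption: a seeded generator so the run is reproducible
# (the original notebook used unseeded np.random.sample).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'A': rng.random(10) * 5 + 2,
    'B': rng.random(10) * 2 + 3,
    'C': rng.random(10) * 3 - 1,
})

scaler = StandardScaler()
scaled = scaler.fit_transform(df)

# Manual standardization with both conventions
manual_ddof0 = ((df - df.mean()) / df.std(ddof=0)).to_numpy()
manual_ddof1 = ((df - df.mean()) / df.std(ddof=1)).to_numpy()

matches_ddof0 = np.allclose(scaled, manual_ddof0)
matches_ddof1 = np.allclose(scaled, manual_ddof1)

# The fitted scale_ attribute should equal the ddof=0 standard deviation
scale_is_population_std = np.allclose(scaler.scale_, df.std(ddof=0).to_numpy())

# Converting the sklearn result to ddof=1 scaling
n = len(df)
rescaled = scaled * np.sqrt((n - 1) / n)
rescaled_matches_ddof1 = np.allclose(rescaled, manual_ddof1)

print('matches ddof=0:', matches_ddof0)
print('matches ddof=1:', matches_ddof1)
print('scale_ equals std(ddof=0):', scale_is_population_std)
print('rescaled matches ddof=1:', rescaled_matches_ddof1)
```

With n=10 the factor is sqrt(0.9), about 0.9487, which is consistent with the printed tables above: the first value of column A is -0.620037 with ddof=0 and -0.588219 with ddof=1.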