支持向量机(SVM)

假设

假设给定一个特征空间上的训练数据集:

T=\left { \left ( x_{1},y_{1} \right ),\left ( x_{2},y_{2} \right )… \left (x_{N},y_{N} \right) \right }, x_{i}\in R^{n}

T=\left { \left ( x_{1},y_{1} \right ),\left ( x_{2},y_{2} \right )… \left (x_{N},y_{N} \right) \right }, x_{i}\in R^{n}

其中xiRn,yi+1,1,i=1,2,N

  1. xi为第i个实例,N>1, xi为向量

  2. yixi的类标记,即:当yi=+1时, 称为xi正例;当yi=1时, 称为xi负例。

  3. (xi,yi) 称为样本点

线性可分支持向量机

给定线性可分训练数据集,通过间隔最大化得到的分离超平面为

ωTΦ(x)+b=0

相应的分类决策函数f(x)=sign(ωTΦ(x)+b), 称为线性可分支持向量机。Φ(x)是某个确定的特征空间转换函数,将x映射到更高维度。

推导目标函数

y(x)=ωTΦ(x)+b, 有:yiy(x)>0,对参数进行等比例缩放,从而:

yiy(x)|ω|=yi(ωTΦ(xi)+b)|ω|

目标函数:

max\left { \frac{1}{\left | \omega \right |} min\left [ y_{i}\cdot\left ( \omega^{T}\cdot\Phi \left ( x_{i} \right )+b \right ) \right ]\right }

约束条件:

yi(ωTΦ(xi)+b)1

通过等比例缩放的方法,将两类函数值的绝对值满足大于等于1,则新目标函数:

max1|ω|

为计算需要,转换得:

min12|ω|2

拉格朗日乘子法求解

$$L\left ( \omega ,b,\alpha \right ) = \frac{1}{2}\left | \omega \right |^{2}-\sum_{i=1}^{n} \alpha {i}\left ( y{i}\cdot\left ( \omega^{T}\cdot\Phi \left ( x_{i} \right )+b \right)-1 \right )$$

原问题的极小极大问题,其对偶问题是极大极小问题:

minmaxL(ω,b,α)maxminL(ω,b,α)

将拉格朗日函数L分别对w,b求偏导:

$$\frac{\partial L}{\partial \omega }=0 \Rightarrow \omega =\sum_{i=1}^{n}\alpha {i}y{i}\Phi \left ( x_{i} \right )$$

$$\frac{\partial L}{\partial b}=0 \Rightarrow 0 =\sum_{i=1}^{n}\alpha {i}y{i}$$

代入拉格朗日函数,推导得:

\alpha ^{*} = max\left { \sum_{i=1}^{n}\alpha_{i} - \frac{1}{2}\sum_{i,j=1}^{n}\alpha_{i}\alpha_{j}y_{i}y_{j}\Phi^{T}\left ( x_{i} \right ) \Phi \left ( x_{j} \right ) \right }

即:

min\left { \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_{i}\alpha_{j}y_{i}y_{j}\left(\Phi\left ( x_{i} \right ) \Phi \left ( x_{j} \right )\right) -\sum_{i=1}^{n}\alpha_{i} \right }

ni=1αiyi=0,αi1,i=1,2,,n

线性可分支持向量机学习算法

  1. 计算:

$$\omega^{}=\sum_{i=1}^{n}\alpha_{i}^{}y_{i}\Phi\left( x_{i} \right)$$

$$b^{}=y_{i} - \sum_{i=1}^{n}\alpha_{i}^{}y_{i}\left(\Phi\left ( x_{i} \right ) \cdot \Phi \left ( x_{j} \right )\right) $$

  1. 求的超平面

$$\omega^{}\cdot\Phi \left ( x \right )+b^{}=0$$

  1. 分类决策函数

$$f\left ( x \right ) = sign\left ( \omega^{}\cdot\Phi \left ( x \right )+b^{} \right )$$

线性支持向量机

若数据线性不可分,则增加松弛因子ξi0,使函数间隔加上松弛度量大于等于1,即约束条件变为:

yi(ωxi+b)1ξi

线性svm的目标函数:

min12|ω|2+CNi=1ξi,ξi0,i=1,2,,n

带松弛因子的SVM拉格朗日函数:

$$L\left(\omega,b,\xi,\alpha,\mu\right)\equiv \frac{1}{2}|\omega|^2 + C\sum_{i=1}^{n}\xi_{i} - \sum_{i=1}^{n} \alpha {i}\left ( y{i}\left ( \omega\cdot\Phi \left ( x_{i} \right )+b \right)-1 +\xi_{i}\right ) - \sum_{i=1}^{n}\mu_{i}\xi_{i}$$

ω,b,ξ求偏导:

$$\frac{\partial L}{\partial \omega }=0 \Rightarrow \omega =\sum_{i=1}^{n}\alpha {i}y{i}\Phi \left ( x_{i} \right )$$

$$\frac{\partial L}{\partial b}=0 \Rightarrow 0 =\sum_{i=1}^{n}\alpha {i}y{i}$$

Lξi=0Cαiμi=0

代入目标函数,得:


minL(ω,b,ξ,α,μ)=12ni=1nj=1αiαjyiyjΦT(xi)Φ(xj)+ni=1αi

对上式求关于α得极大,得到:

max12ni=1nj=1αiαjyiyjΦT(xi)Φ(xj)+ni=1αi

ni=1αiyi=0

Cαiμi=0

0αiC

μi0,i=1,2,,n

对偶问题:

min12ni=1nj=1αiαjyiyjΦT(xi)Φ(xj)ni=1αi

线性支持向量机学习算法

  1. 计算:

$$\omega^{}=\sum_{i=1}^{n}\alpha_{i}^{}y_{i}\Phi\left( x_{i} \right)$$

$$b^{}=y_{i} - \sum_{i=1}^{n}\alpha_{i}^{}y_{i}\left(\Phi\left ( x_{i} \right ) \cdot \Phi \left ( x_{j} \right )\right) $$

计算时,需要满足条件0<αj<C的向量,实践中往往取支持向量的所有值取平均,作为b

  1. 求的超平面

$$\omega^{}\cdot\Phi \left ( x \right )+b^{}=0$$

  1. 分类决策函数

$$f\left ( x \right ) = sign\left ( \omega^{}\cdot\Phi \left ( x \right )+b^{} \right )$$

非线性支持向量机

核函数

可以使用和函数,将原始输入空间映射到新的特征空间,从而使得原本线性不可分的样本可能在核空间可分。

  1. 多项式和函数: κ(x1,x2)=(x1x2+c)d

  2. 高斯核RBF函数:κ(x1,x2)=exp(γ|x1x2|2)

  3. Sigmoid核函数:κ(x1,x2)=tanh(x1x2+c)

在实际应用中,往往依赖先验领域知识或交叉验证等方案才能选择有效的核函数,如果没有更多先验信息,则使用高斯核函数。

实践1-鸢尾花分类

1.python实现鸢尾花分类代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# -*- coding: utf-8 -*
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import svm
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

iris = datasets.load_iris()
x = iris.data
y = iris.target

x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1, test_size=0.6)

clf = svm.SVC(C=0.1, kernel='linear', decision_function_shape='ovr')
clf.fit(x_train, y_train.ravel())

print clf.score(x_train, y_train) # 精度
print '训练集准确率:', accuracy_score(y_train, clf.predict(x_train))
print clf.score(x_test, y_test)
print '测试集准确率:', accuracy_score(y_test, clf.predict(x_test))

2.鸢尾花分类结果

1
2
训练集准确率: 0.966666666667
测试集准确率: 0.944444444444

实践2-手写体

1.python实现手写体代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
# -*- coding:utf-8 -*-
import numpy as np
from sklearn import svm
import matplotlib.colors
import matplotlib.pyplot as plt
from PIL import Image
from sklearn.metrics import accuracy_score
import os
from sklearn.model_selection import GridSearchCV
from time import time


def show_accuracy(a, b, tip):
acc = a.ravel() == b.ravel()
print tip + '正确率:%.2f%%' % (100*np.mean(acc))


def save_image(im, i):
im *= 15.9375
im = 255 - im
a = im.astype(np.uint8)
output_path = '.\\HandWritten'
if not os.path.exists(output_path):
os.mkdir(output_path)
Image.fromarray(a).save(output_path + ('\\%d.png' % i))


if __name__ == "__main__":
print 'Load Training File Start...'
data = np.loadtxt('./data/optdigits.tra', dtype=np.float, delimiter=',')
x, y = np.split(data, (-1, ), axis=1)
images = x.reshape(-1, 8, 8)
y = y.ravel().astype(np.int)

print 'Load Test Data Start...'
data = np.loadtxt('./data/optdigits.tes', dtype=np.float, delimiter=',')
x_test, y_test = np.split(data, (-1, ), axis=1)
print y_test.shape
images_test = x_test.reshape(-1, 8, 8)
y_test = y_test.ravel().astype(np.int)
print 'Load Data OK...'

# x, x_test, y, y_test = train_test_split(x, y, test_size=0.4, random_state=1)
# images = x.reshape(-1, 8, 8)
# images_test = x_test.reshape(-1, 8, 8)

matplotlib.rcParams['font.sans-serif'] = [u'SimHei']
matplotlib.rcParams['axes.unicode_minus'] = False
plt.figure(figsize=(15, 9), facecolor='w')
for index, image in enumerate(images[:16]):
plt.subplot(4, 8, index + 1)
plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
plt.title(u'训练图片: %i' % y[index])
for index, image in enumerate(images_test[:16]):
plt.subplot(4, 8, index + 17)
plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
save_image(image.copy(), index)
plt.title(u'测试图片: %i' % y_test[index])
plt.tight_layout()
plt.show()

# params = {'C':np.logspace(0, 3, 7), 'gamma':np.logspace(-5, 0, 11)}
# model = GridSearchCV(svm.SVC(kernel='rbf'), param_grid=params, cv=3)
model = svm.SVC(C=10, kernel='rbf', gamma=0.001)
print 'Start Learning...'
t0 = time()
model.fit(x, y)
t1 = time()
t = t1 - t0
print '训练+CV耗时:%d分钟%.3f秒' % (int(t/60), t - 60*int(t/60))
# print '最优参数:\t', model.best_params_
#clf.fit(x, y)
print 'Learning is OK...'
print '训练集准确率:', accuracy_score(y, model.predict(x))
y_hat = model.predict(x_test)
print '测试集准确率:', accuracy_score(y_test, model.predict(x_test))
print y_hat
print y_test

err_images = images_test[y_test != y_hat]
err_y_hat = y_hat[y_test != y_hat]
err_y = y_test[y_test != y_hat]
print err_y_hat
print err_y
plt.figure(figsize=(10, 8), facecolor='w')
for index, image in enumerate(err_images):
if index >= 12:
break
plt.subplot(3, 4, index + 1)
plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
plt.title(u'错分为:%i,真实值:%i' % (err_y_hat[index], err_y[index]))
plt.tight_layout()
plt.show()

2.实验结果

1
2
3
4
5
6
7
8
9
Load Training File Start...
Load Test Data Start...
(1797L, 1L)
Load Data OK...
Start Learning...
训练+CV耗时:0分钟0.632
Learning is OK...
训练集准确率: 1.0
测试集准确率: 0.982749026155
感谢您的阅读,本文由 Gavinhome Blog 版权所有。如若转载,请注明出处:Gavinhome Blog(http://gavinhome.github.io/2018/01/27/支持向量机/
Logistic回归
决策树与随机森林
0 comments
Anonymous
Markdown is supported

Be the first person to leave a comment!