What is the relationship between deep learning and support vector machines?

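In the early 1990s, Vapnik and I worked together at Bell Labs, where several algorithms that later proved influential were proposed within a few years of each other: convolutional networks, support vector machines, tangent distance, and others. When AT&T spun off Lucent in 1995, I became head of the Image Processing Research Department at AT&T Labs. The machine learning researchers in the group included Yoshua Bengio, Leon Bottou, Patrick Haffner, and Vladimir Vapnik; visitors and interns included Bernhard Schölkopf, Jason Weston, and Olivier Chapelle.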

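Vapnik and I often had in-depth discussions about the relative merits of (deep) neural networks and kernel machines. In short, I have always been interested in learning feature representations, and I was never very keen on kernel methods because they did nothing to address the problem I wanted to solve. SVMs are indeed a classification method with a solid mathematical foundation, but in essence they are nothing more than simple two-layer systems. The first layer can be seen as a set of units (one per support vector) that use the kernel function to measure the similarity between the input vector and each support vector; the second layer linearly combines these similarities.

The first layer of an SVM is trained with the simplest form of unsupervised learning: the training samples are stored as prototypes in the units. In general, adjusting the smoothness of the kernel function (its parameter) interpolates between linear classification and template matching. From this point of view, kernel methods are just a form of template matching, a remark that got me into trouble about 10 years ago. Vapnik, on the other hand, argued that SVMs offer a clear way to do capacity control. An SVM with a "narrow" kernel function can always fit the training set perfectly, but its generalization error is controlled by the width of the kernel and the sparsity of the dual coefficients. Vapnik cares a great deal about error bounds, so he worried that neural nets lacked comparably good capacity control (even though neural nets do have generalization bounds, since their VC dimension is finite).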
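The two-layer view and the kernel-width trade-off can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a Gaussian (RBF) kernel is assumed, the prototypes and coefficients are hand-picked rather than produced by training, and the helper names (`decision`, `nearest_prototype_sign`, `linear_limit`) are hypothetical. The coefficients are chosen to sum to zero, mirroring the equality constraint of the SVM dual; that is what makes the wide-kernel output approximately linear in the input.

```python
import math

def rbf_kernel(x, s, gamma):
    """Gaussian (RBF) similarity between input x and stored prototype s."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
    return math.exp(-gamma * sq_dist)

def decision(x, prototypes, coefs, gamma, bias=0.0):
    # Layer 1: one unit per support vector, each measuring kernel similarity.
    similarities = [rbf_kernel(x, s, gamma) for s in prototypes]
    # Layer 2: a plain linear combination of those similarities.
    return sum(c * k for c, k in zip(coefs, similarities)) + bias

def nearest_prototype_sign(x, prototypes, coefs):
    """Template matching: sign of the coefficient of the closest prototype."""
    def sq_dist(s):
        return sum((a - b) ** 2 for a, b in zip(x, s))
    best = min(range(len(prototypes)), key=lambda i: sq_dist(prototypes[i]))
    return 1 if coefs[best] > 0 else -1

def linear_limit(x, prototypes, coefs, gamma):
    """First-order expansion of decision() for small gamma (needs sum(coefs) == 0)."""
    dims = range(len(x))
    w = [2 * gamma * sum(c * s[d] for s, c in zip(prototypes, coefs)) for d in dims]
    b = -gamma * sum(c * sum(v * v for v in s) for s, c in zip(prototypes, coefs))
    return sum(wd * xd for wd, xd in zip(w, x)) + b

def sign(v):
    return 1 if v >= 0 else -1

# Hypothetical, hand-picked values standing in for alpha_i * y_i; not a trained model.
prototypes = [(0.0, 0.0), (1.0, 0.0), (3.0, 3.0), (4.0, 3.0)]
coefs = [-1.0, -1.0, 1.0, 1.0]  # sums to zero

x = (2.2, 2.0)
# Narrow kernel: the closest prototype dominates, so the output is template matching.
narrow_label = sign(decision(x, prototypes, coefs, gamma=50.0))
# Wide kernel: the output is close to a linear function of x.
wide_value = decision(x, prototypes, coefs, gamma=1e-4)
wide_linear = linear_limit(x, prototypes, coefs, gamma=1e-4)
```

With the narrow kernel the predicted label agrees with the label of the nearest prototype, while with the wide kernel `wide_value` and `wide_linear` nearly coincide. The toy example shows only the two extremes between which the kernel width interpolates; it says nothing about how a real SVM solver selects support vectors or coefficients.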

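My counterargument was that capacity control is somewhat secondary; what matters more in practice is the ability to compute highly complex functions with a limited amount of computation. For example, image recognition that is invariant to shifts, scale, rotation, lighting conditions, and background clutter is nearly impossible (or extremely inefficient) for a kernel machine operating at the pixel level, whereas deep architectures such as convolutional nets handle it quite easily.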
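A toy example may make the pixel-level argument concrete. The sketch below compares a Gaussian kernel applied directly to raw pixels with a single hand-set filter followed by global max pooling. The 1-D "images", the filter, and the function names are all hypothetical, and the fixed filter only stands in for what a convolutional net would learn; it is not a trained model.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian kernel on raw pixel vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def conv_max_feature(signal, filt):
    """Slide filt over signal (valid cross-correlation), then global max-pool."""
    n, k = len(signal), len(filt)
    responses = [
        sum(signal[i + j] * filt[j] for j in range(k))
        for i in range(n - k + 1)
    ]
    return max(responses)

# Hypothetical 1-D "images": the same bump at two positions, plus a blank image.
pattern = [1.0, 2.0, 1.0]
image   = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
blank   = [0.0] * 10

# Pixel-level kernel: the shifted copy scores lower than an empty image.
sim_shifted = rbf(image, shifted)
sim_blank = rbf(image, blank)

# Convolution + pooling: the response does not depend on where the bump sits.
feat_image = conv_max_feature(image, pattern)
feat_shifted = conv_max_feature(shifted, pattern)
```

In pixel space the shifted copy of the bump ends up less similar to the original than a blank image does, so a kernel machine would need a stored prototype for each position. The filter-plus-pooling feature gives the same response for both positions. This covers shifts only; scale, rotation, lighting, and clutter are not addressed by the sketch.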

Note: this post is a translation of an interview with Dr. Yann LeCun. The original text follows:

GP: 3. You and I met a while ago at a scientific advisory meeting of KXEN, where Vapnik’s Statistical Learning Theory and SVM were a major topic. What is the relationship between Deep Learning and Support Vector Machines / Statistical Learning Theory?

Yann LeCun: Vapnik and I were in nearby offices at Bell Labs in the early 1990s, in Larry Jackel’s Adaptive Systems Research Department. Convolutional nets, Support Vector Machines, Tangent Distance, and several other influential methods were invented within a few meters of each other, and within a few years of each other. When AT&T spun off Lucent in 1995, I became the head of that department, which became the Image Processing Research Department at AT&T Labs – Research. Machine Learning members included Yoshua Bengio, Leon Bottou, Patrick Haffner, and Vladimir Vapnik. Visitors and interns included Bernhard Schölkopf, Jason Weston, Olivier Chapelle, and others.

Vapnik and I often had lively discussions about the relative merits of (deep) neural nets and kernel machines. Basically, I have always been interested in solving the problem of learning features or learning representations. I had only a moderate interest in kernel methods because they did nothing to address this problem. Naturally, SVMs are wonderful as a generic classification method with beautiful math behind them. But in the end, they are nothing more than simple two-layer systems. The first layer can be seen as a set of units (one per support vector) that measure a kind of similarity between the input vector and each support vector using the kernel function. The second layer linearly combines these similarities.

It’s a two-layer system in which the first layer is trained with the simplest of all unsupervised learning methods: simply store the training samples as prototypes in the units. Basically, varying the smoothness of the kernel function allows us to interpolate between two simple methods: linear classification and template matching. I got in trouble about 10 years ago by saying that kernel methods were a form of glorified template matching. Vapnik, on the other hand, argued that SVMs had a very clear way of doing capacity control. An SVM with a “narrow” kernel function can always learn the training set perfectly, but its generalization error is controlled by the width of the kernel and the sparsity of the dual coefficients. Vapnik really believes in his bounds. He worried that neural nets didn’t have similarly good ways to do capacity control (although neural nets do have generalization bounds, since they have finite VC dimension).

My counterargument was that the ability to do capacity control was somewhat secondary to the ability to compute highly complex functions with a limited amount of computation. Performing image recognition with invariance to shifts, scale, rotation, lighting conditions, and background clutter was impossible (or extremely inefficient) for a kernel machine operating at the pixel level. But it was quite easy for deep architectures such as convolutional nets.

