【理论】支持向量机2: Support Vector 介绍支持向量机目标
【原文:http://blog.pluskid.org/?p=682】 上一次介绍支持向量机,结果说到 Maximum Margin Classifier ,到最后都没有说“支持向量”到底是什么东西。不妨回忆一下上次最后一张图: 可以看到两个支撑着中间的 gap 的超平面,它们到中间的 separating hyper
【原文:http://blog.pluskid.org/?p=682】
上一次介绍支持向量机,结果说到 Maximum Margin Classifier ,到最后都没有说“支持向量”到底是什么东西。不妨回忆一下上次最后一张图:
可以看到两个支撑着中间的 gap 的超平面,它们到中间的 separating hyper plane 的距离相等(想想看:为什么一定是相等的?),即我们所能得到的最大的 geometrical margin
很显然,由于这些 supporting vector 刚好在边界上,所以它们是满足
当然,除了从几何直观上之外,支持向量的概念也会从其优化过程的推导中得到。其实上一次还偷偷卖了另一个关子就是虽然给出了目标函数,却没有讲怎么来求解。现在就让我们来处理这个问题。回忆一下之前得到的目标函数:
这个问题等价于(为了方便求解,我在这里加上了平方,还有一个系数,显然这两个问题是等价的,因为我们关心的并不是最优情况下目标函数的具体数值):
到这个形式以后,就可以很明显地看出来,它是一个凸优化问题,或者更具体地说,它是一个二次优化问题——目标函数是二次的,约束条件是线性的。这个问题可以用任何现成的 QP (Quadratic Programming) 的优化包进行求解。所以,我们的问题到此为止就算全部解决了,于是我睡午觉去了~
啊?呃,有人说我偷懒不负责任了?好吧,嗯,其实呢,虽然这个问题确实是一个标准的 QP 问题,但是它也有它的特殊结构,通过 Lagrange Duality 变换到对偶变量 (dual variable) 的优化问题之后,可以找到一种更加有效的方法来进行求解——这也是 SVM 盛行的一大原因,通常情况下这种方法比直接使用通用的 QP 优化包进行优化要高效得多。此外,在推导过程中,许多有趣的特征也会被揭露出来,包括刚才提到的 supporting vector 的问题。
关于 Lagrange duality 我没有办法在这里细讲了,可以参考 Wikipedia 。简单地来说,通过给每一个约束条件加上一个 Lagrange multiplier,我们可以将它们融和到目标函数里去
然后我们令
容易验证,当某个约束条件不满足时,例如
这里用
当然,交换以后的问题不再等价于原问题,这个新问题的最优值用 总之,第二个问题的最优值
首先要让
带回
此时我们得到关于 dual variable
如前面所说,这个问题有更加高效的优化算法,不过具体方法在这里先不介绍,让我们先来看看推导过程中得到的一些有趣的形式。首先就是关于我们的 hyper plane ,对于一个数据点
这里的形式的有趣之处在于,对于新点
为什么非支持向量对应的
注意到如果
嗯,于是呢,把所有的这些东西整合起来,得到的一个 maximum margin hyper plane classifier 就是支持向量机(Support Vector Machine),经过直观的感觉和数学上的推导,为什么叫“支持向量”,应该也就明了了吧?当然,到目前为止,我们的 SVM 还比较弱,只能处理线性的情况,不过,在得到了 dual 形式之后,通过 Kernel 推广到非线性的情况就变成了一件非常容易的事情了。不过,具体细节,还要留到下一次再细说了。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics











In modern manufacturing, accurate defect detection is not only the key to ensuring product quality, but also the core of improving production efficiency. However, existing defect detection datasets often lack the accuracy and semantic richness required for practical applications, resulting in models unable to identify specific defect categories or locations. In order to solve this problem, a top research team composed of Hong Kong University of Science and Technology Guangzhou and Simou Technology innovatively developed the "DefectSpectrum" data set, which provides detailed and semantically rich large-scale annotation of industrial defects. As shown in Table 1, compared with other industrial data sets, the "DefectSpectrum" data set provides the most defect annotations (5438 defect samples) and the most detailed defect classification (125 defect categories

Editor |KX To this day, the structural detail and precision determined by crystallography, from simple metals to large membrane proteins, are unmatched by any other method. However, the biggest challenge, the so-called phase problem, remains retrieving phase information from experimentally determined amplitudes. Researchers at the University of Copenhagen in Denmark have developed a deep learning method called PhAI to solve crystal phase problems. A deep learning neural network trained using millions of artificial crystal structures and their corresponding synthetic diffraction data can generate accurate electron density maps. The study shows that this deep learning-based ab initio structural solution method can solve the phase problem at a resolution of only 2 Angstroms, which is equivalent to only 10% to 20% of the data available at atomic resolution, while traditional ab initio Calculation

The open LLM community is an era when a hundred flowers bloom and compete. You can see Llama-3-70B-Instruct, QWen2-72B-Instruct, Nemotron-4-340B-Instruct, Mixtral-8x22BInstruct-v0.1 and many other excellent performers. Model. However, compared with proprietary large models represented by GPT-4-Turbo, open models still have significant gaps in many fields. In addition to general models, some open models that specialize in key areas have been developed, such as DeepSeek-Coder-V2 for programming and mathematics, and InternVL for visual-language tasks.

For AI, Mathematical Olympiad is no longer a problem. On Thursday, Google DeepMind's artificial intelligence completed a feat: using AI to solve the real question of this year's International Mathematical Olympiad IMO, and it was just one step away from winning the gold medal. The IMO competition that just ended last week had six questions involving algebra, combinatorics, geometry and number theory. The hybrid AI system proposed by Google got four questions right and scored 28 points, reaching the silver medal level. Earlier this month, UCLA tenured professor Terence Tao had just promoted the AI Mathematical Olympiad (AIMO Progress Award) with a million-dollar prize. Unexpectedly, the level of AI problem solving had improved to this level before July. Do the questions simultaneously on IMO. The most difficult thing to do correctly is IMO, which has the longest history, the largest scale, and the most negative

In 2023, almost every field of AI is evolving at an unprecedented speed. At the same time, AI is constantly pushing the technological boundaries of key tracks such as embodied intelligence and autonomous driving. Under the multi-modal trend, will the situation of Transformer as the mainstream architecture of AI large models be shaken? Why has exploring large models based on MoE (Mixed of Experts) architecture become a new trend in the industry? Can Large Vision Models (LVM) become a new breakthrough in general vision? ...From the 2023 PRO member newsletter of this site released in the past six months, we have selected 10 special interpretations that provide in-depth analysis of technological trends and industrial changes in the above fields to help you achieve your goals in the new year. be prepared. This interpretation comes from Week50 2023

Editor |ScienceAI Question Answering (QA) data set plays a vital role in promoting natural language processing (NLP) research. High-quality QA data sets can not only be used to fine-tune models, but also effectively evaluate the capabilities of large language models (LLM), especially the ability to understand and reason about scientific knowledge. Although there are currently many scientific QA data sets covering medicine, chemistry, biology and other fields, these data sets still have some shortcomings. First, the data form is relatively simple, most of which are multiple-choice questions. They are easy to evaluate, but limit the model's answer selection range and cannot fully test the model's ability to answer scientific questions. In contrast, open-ended Q&A

Editor | KX Retrosynthesis is a critical task in drug discovery and organic synthesis, and AI is increasingly used to speed up the process. Existing AI methods have unsatisfactory performance and limited diversity. In practice, chemical reactions often cause local molecular changes, with considerable overlap between reactants and products. Inspired by this, Hou Tingjun's team at Zhejiang University proposed to redefine single-step retrosynthetic prediction as a molecular string editing task, iteratively refining the target molecular string to generate precursor compounds. And an editing-based retrosynthetic model EditRetro is proposed, which can achieve high-quality and diverse predictions. Extensive experiments show that the model achieves excellent performance on the standard benchmark data set USPTO-50 K, with a top-1 accuracy of 60.8%.

Editor | ScienceAI Based on limited clinical data, hundreds of medical algorithms have been approved. Scientists are debating who should test the tools and how best to do so. Devin Singh witnessed a pediatric patient in the emergency room suffer cardiac arrest while waiting for treatment for a long time, which prompted him to explore the application of AI to shorten wait times. Using triage data from SickKids emergency rooms, Singh and colleagues built a series of AI models that provide potential diagnoses and recommend tests. One study showed that these models can speed up doctor visits by 22.3%, speeding up the processing of results by nearly 3 hours per patient requiring a medical test. However, the success of artificial intelligence algorithms in research only verifies this
