
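Today we introduce the decision tree, one of the more commonly used classification algorithms in machine learning. A decision tree imitates how people recognize and categorize things: given a pile of seemingly chaotic data, how can we classify it effectively using as few features as possible?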

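A decision tree relies on hierarchical classification: at each step it picks the most discriminative feature and splits on it. For some data, the first few features chosen are already enough to assign a label; other data needs more features. The depth of the tree therefore indicates how many features had to be examined.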

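Feature selection usually draws on information theory: at each step we choose the feature whose split reduces the entropy the most, that is, the one with the largest information gain.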

Decision trees are generally used to classify discrete data; continuous data can be discretized beforehand.
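As a rough illustration of that discretization step, the sketch below maps continuous values to integer bin indices using fixed thresholds. The helper name `discretize`, the sample values, and the thresholds are assumptions made for this example; they do not come from the original article.

```python
from bisect import bisect_right

def discretize(values, thresholds):
    """Map each continuous value to the index of the bin it falls into.

    `thresholds` must be sorted in ascending order; a value equal to a
    threshold is placed in the bin to the right of that threshold.
    """
    return [bisect_right(thresholds, v) for v in values]

# Hypothetical measurements, binned into three ranges: <5, 5..15, >=15
lengths = [3.2, 12.5, 7.0, 25.1]
print(discretize(lengths, thresholds=[5.0, 15.0]))  # [0, 1, 1, 2]
```

The resulting bin indices can then be used as discrete feature values in the same way as the integer features in the sample data.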

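Before introducing decision trees, let us briefly review information entropy. For a set of samples in which class $i$ occurs with probability $p_i$, the entropy is defined as:

$$H = -\sum_{i} p_i \log_2 p_i$$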

We first construct some simple data:

import math
import operator

import matplotlib.pyplot as plt

def Create_data():
    # Each row is [feature 0, feature 1, class label]
    dataset = [[1, 1, 'yes'],
               [1, 1, 'yes'],
               [1, 0, 'no'],
               [0, 1, 'no'],
               [3, 0, 'maybe']]
    # Names of the two feature columns, in column order
    feat_name = ['no surfacing', 'flippers']
    return dataset, feat_name

Then we define a function that computes the entropy:

def Cal_entrpy(dataset):
    n_sample = len(dataset)
    # Count how many samples carry each class label (the last column)
    n_label = {}
    for featvec in dataset:
        current_label = featvec[-1]
        if current_label not in n_label:
            n_label[current_label] = 0
        n_label[current_label] += 1
    # Shannon entropy in bits: -sum(p * log2(p))
    shannonEnt = 0.0
    for key in n_label:
        prob = float(n_label[key]) / n_sample
        shannonEnt -= prob * math.log(prob, 2)
    return shannonEnt

Note that the larger the entropy, the more evenly the samples are spread across classes, that is, the more disordered the data.
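A small sketch may make this concrete. It computes the entropy of three hand-made label lists, from a single class to four equally likely classes. The function name `entropy` and the label lists are illustrative assumptions, not part of the original article.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(entropy(['yes', 'yes', 'yes', 'yes']))  # 0.0 (a single class, no disorder)
print(entropy(['yes', 'yes', 'no', 'no']))    # 1.0 (two equally likely classes)
print(entropy(['a', 'b', 'c', 'd']))          # 2.0 (four equally likely classes)
```

The entropy grows as the labels become more evenly spread over more classes, which is why a good split is one that leaves low-entropy subsets.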

Next we define a function that splits the dataset:

def Split_dataset(dataset, axis, value):
    # Keep the rows whose feature `axis` equals `value`,
    # and drop that feature column from the rows that are kept
    retDataSet = []
    for featVec in dataset:
        if featVec[axis] == value:
            reducedFeatVec = featVec[:axis]
            reducedFeatVec.extend(featVec[axis + 1:])
            retDataSet.append(reducedFeatVec)
    return retDataSet
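To show what such a split returns, here is a hedged sketch that uses a list comprehension with the same filtering rule, applied to the sample rows. The name `split_rows` is an assumption introduced only for this example.

```python
def split_rows(rows, axis, value):
    """Keep rows whose column `axis` equals `value`, with that column removed."""
    return [row[:axis] + row[axis + 1:] for row in rows if row[axis] == value]

rows = [[1, 1, 'yes'], [1, 1, 'yes'], [1, 0, 'no'], [0, 1, 'no'], [3, 0, 'maybe']]

print(split_rows(rows, 0, 1))  # [[1, 'yes'], [1, 'yes'], [0, 'no']]
print(split_rows(rows, 1, 0))  # [[1, 'no'], [3, 'maybe']]
```

Removing the column that was just split on is what lets the recursion terminate: each level of the tree sees one feature fewer.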

Combining the functions above, we can write a feature selection function:

def Choose_feature(dataset):
    num_sample = len(dataset)
    num_feature = len(dataset[0]) - 1  # the last column is the label
    baseEntrpy = Cal_entrpy(dataset)
    best_Infogain = 0.0
    bestFeat = -1
    for i in range(num_feature):
        featlist = [example[i] for example in dataset]
        uniquValus = set(featlist)
        # Weighted entropy of the subsets produced by splitting on feature i
        newEntrpy = 0.0
        for value in uniquValus:
            subData = Split_dataset(dataset, i, value)
            prob = len(subData) / float(num_sample)
            newEntrpy += prob * Cal_entrpy(subData)
        # Information gain: how much the split reduces the entropy
        info_gain = baseEntrpy - newEntrpy
        if info_gain > best_Infogain:
            best_Infogain = info_gain
            bestFeat = i
    return bestFeat
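As a worked check of this selection rule, the sketch below computes the information gain of each feature on the sample data. The helper names `entropy` and `info_gain` are assumptions for this example; the rows are the ones defined earlier in the article.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, axis):
    """Label entropy minus the weighted entropy after splitting on column `axis`."""
    labels = [row[-1] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[axis], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

rows = [[1, 1, 'yes'], [1, 1, 'yes'], [1, 0, 'no'], [0, 1, 'no'], [3, 0, 'maybe']]
for axis in (0, 1):
    print(axis, round(info_gain(rows, axis), 3))
# 0 0.971
# 1 0.571
```

Feature 0 ('no surfacing') has the larger gain, so it becomes the root of the tree for this data.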

Next we write a function that takes a majority vote over the class labels:

def Major_cnt(classlist):
    # Count the votes for each class label
    class_num = {}
    for vote in classlist:
        if vote not in class_num:
            class_num[vote] = 0
        class_num[vote] += 1
    # Sort by vote count, highest first (items() replaces Python 2's iteritems())
    Sort_K = sorted(class_num.items(),
                    key=operator.itemgetter(1), reverse=True)
    return Sort_K[0][0]
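The same vote can be sketched more compactly with `collections.Counter` from the standard library. The function name `majority_label` and the sample labels are assumptions for this example.

```python
from collections import Counter

def majority_label(labels):
    """Return the label that occurs most often in `labels`."""
    return Counter(labels).most_common(1)[0][0]

print(majority_label(['no', 'yes', 'no', 'maybe']))  # no
```

This vote is only needed when all features have been used up and the remaining samples still disagree on the label.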

With these pieces in place, we can build the decision tree:

def Create_tree(dataset, featName):
    classlist = [example[-1] for example in dataset]
    # Stop if every sample has the same label
    if classlist.count(classlist[0]) == len(classlist):
        return classlist[0]
    # Stop if no features are left; fall back to a majority vote
    if len(dataset[0]) == 1:
        return Major_cnt(classlist)
    bestFeat = Choose_feature(dataset)
    bestFeatName = featName[bestFeat]
    myTree = {bestFeatName: {}}
    # Note: this modifies the list passed in, so callers should pass a copy
    del featName[bestFeat]
    featValues = [example[bestFeat] for example in dataset]
    uniqueVals = set(featValues)
    # Build one subtree per value of the chosen feature
    for value in uniqueVals:
        subLabels = featName[:]
        myTree[bestFeatName][value] = Create_tree(
            Split_dataset(dataset, bestFeat, value), subLabels)
    return myTree

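def Get_numleafs(myTree):
    numLeafs = 0
    # The tree is a nested dict with a single top-level key (the feature name);
    # in Python 3, dict.keys() cannot be indexed, so take the first key via iter()
    firstStr = next(iter(myTree))
    secondDict = myTree[firstStr]
    for key in secondDict:
        if isinstance(secondDict[key], dict):
            numLeafs += Get_numleafs(secondDict[key])
        else:
            numLeafs += 1
    return numLeafs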

def Get_treedepth(myTree):
    max_depth = 0
    firstStr = next(iter(myTree))
    secondDict = myTree[firstStr]
    for key in secondDict:
        # A subtree adds its own depth; a leaf counts as depth 1
        if isinstance(secondDict[key], dict):
            this_depth = 1 + Get_treedepth(secondDict[key])
        else:
            this_depth = 1
        if this_depth > max_depth:
            max_depth = this_depth
    return max_depth
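To illustrate the nested-dictionary representation that these two helpers walk, the sketch below computes both quantities in a single pass. The function name `tree_stats` is an assumption; the tree literal is the structure that the tree-building code above should produce for the sample data.

```python
def tree_stats(tree):
    """Return (number of leaves, depth) for a nested-dict decision tree."""
    feature = next(iter(tree))  # the single key at this level is the feature name
    leaves, depth = 0, 0
    for child in tree[feature].values():
        if isinstance(child, dict):
            child_leaves, child_depth = tree_stats(child)
            leaves += child_leaves
            depth = max(depth, 1 + child_depth)
        else:
            leaves += 1
            depth = max(depth, 1)
    return leaves, depth

# Tree expected from the sample data: split on 'no surfacing', then on 'flippers'
tree = {'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}, 3: 'maybe'}}
print(tree_stats(tree))  # (4, 2)
```

The leaf count and the depth are what the plotting code later uses to scale the horizontal and vertical spacing of the nodes.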

We can also plot the decision tree:

def Plot_node(nodeTxt, centerPt, parentPt, nodeType):
    # Draw a boxed node at centerPt with an arrow coming from parentPt
    Create_plot.ax1.annotate(nodeTxt, xy=parentPt,
                             xycoords='axes fraction',
                             xytext=centerPt, textcoords='axes fraction',
                             va="center", ha="center", bbox=nodeType,
                             arrowprops=arrow_args)

def Plot_tree(myTree, parentPt, nodeTxt):
    numLeafs = Get_numleafs(myTree)
    firstStr = next(iter(myTree))
    # Center this decision node above the leaves of its subtree
    cntrPt = (Plot_tree.xOff + (1.0 + float(numLeafs)) / 2.0 / Plot_tree.totalW,
              Plot_tree.yOff)
    Plot_midtext(cntrPt, parentPt, nodeTxt)
    Plot_node(firstStr, cntrPt, parentPt, decisionNode)
    secondDict = myTree[firstStr]
    # Move one level down before drawing the children
    Plot_tree.yOff = Plot_tree.yOff - 1.0 / Plot_tree.totalD
    for key in secondDict:
        if isinstance(secondDict[key], dict):
            Plot_tree(secondDict[key], cntrPt, str(key))
        else:
            Plot_tree.xOff = Plot_tree.xOff + 1.0 / Plot_tree.totalW
            Plot_node(secondDict[key], (Plot_tree.xOff, Plot_tree.yOff),
                      cntrPt, leafNode)
            Plot_midtext((Plot_tree.xOff, Plot_tree.yOff), cntrPt, str(key))
    # Move back up once all children of this node are drawn
    Plot_tree.yOff = Plot_tree.yOff + 1.0 / Plot_tree.totalD

def Create_plot(myTree):
    fig = plt.figure(1, facecolor='white')
    fig.clf()
    axprops = dict(xticks=[], yticks=[])
    Create_plot.ax1 = plt.subplot(111, frameon=False, **axprops)
    # Total width and depth are used to scale node positions into [0, 1]
    Plot_tree.totalW = float(Get_numleafs(myTree))
    Plot_tree.totalD = float(Get_treedepth(myTree))
    Plot_tree.xOff = -0.5 / Plot_tree.totalW
    Plot_tree.yOff = 1.0
    Plot_tree(myTree, (0.5, 1.0), '')
    plt.show()

def Plot_midtext(cntrPt, parentPt, txtString):
    # Label the edge at the midpoint between the parent and the child
    xMid = (parentPt[0] - cntrPt[0]) / 2.0 + cntrPt[0]
    yMid = (parentPt[1] - cntrPt[1]) / 2.0 + cntrPt[1]
    Create_plot.ax1.text(xMid, yMid, txtString)

def Classify(myTree, featLabels, testVec):
    firstStr = next(iter(myTree))
    secondDict = myTree[firstStr]
    featIndex = featLabels.index(firstStr)
    # Stays None if the test value matches no branch of this node
    classLabel = None
    for key in secondDict:
        if testVec[featIndex] == key:
            if isinstance(secondDict[key], dict):
                classLabel = Classify(secondDict[key], featLabels, testVec)
            else:
                classLabel = secondDict[key]
    return classLabel
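The lookup can also be sketched iteratively, with an explicit fallback for feature values that never appeared in the training data. The function name `classify`, the `default` parameter, and the tree literal (the structure expected for the sample data) are assumptions for this example.

```python
def classify(tree, feature_names, sample, default=None):
    """Walk a nested-dict tree; return `default` if a feature value was never seen."""
    node = tree
    while isinstance(node, dict):
        feature = next(iter(node))  # feature tested at this node
        value = sample[feature_names.index(feature)]
        branches = node[feature]
        if value not in branches:
            return default
        node = branches[value]
    return node  # a leaf holds the class label

tree = {'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}, 3: 'maybe'}}
names = ['no surfacing', 'flippers']

print(classify(tree, names, [1, 0]))  # no
print(classify(tree, names, [1, 1]))  # yes
print(classify(tree, names, [2, 1]))  # None (value 2 was never seen for 'no surfacing')
```

Returning a default for unseen values is one possible design; another would be to fall back to the majority label of the training samples at that node.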

Finally, we can test the decision tree classifier we built:

# Box and arrow styles used by the plotting functions
decisionNode = dict(boxstyle="sawtooth", fc="0.8")
leafNode = dict(boxstyle="round4", fc="0.8")
arrow_args = dict(arrowstyle="<-")

myData, featName = Create_data()
S_entrpy = Cal_entrpy(myData)
new_data = Split_dataset(myData, 0, 1)
best_feat = Choose_feature(myData)
# Pass a copy of featName because Create_tree modifies the list it receives
myTree = Create_tree(myData, featName[:])
num_leafs = Get_numleafs(myTree)
depth = Get_treedepth(myTree)
Create_plot(myTree)
predict_label = Classify(myTree, featName, [1, 0])

print("the predict label is: ", predict_label)
print("the decision tree is: ", myTree)
print("the best feature index is: ", best_feat)
print("the new dataset: ", new_data)
print("the original dataset: ", myData)
print("the feature names are: ", featName)
print("the entropy is:", S_entrpy)
print("the number of leafs is: ", num_leafs)
print("the depth is: ", depth)
print("All is well.")

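For this sample data, the resulting decision tree splits first on 'no surfacing' and then, for the samples where that feature equals 1, on 'flippers'.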
