Feature Selection for Classification of Hyperspectral Data by SVM
Mahesh Pal and Giles M. Foody, Member, IEEE
Abstract—Support vector machines (SVM) are attractive for the classification of remotely sensed data with some claims that the method is insensitive to the dimensionality of the data and, therefore, does not require a dimensionality-reduction analysis in preprocessing. Here, a series of classification analyses with two hyperspectral sensor data sets reveals that the accuracy of a classification by an SVM does vary as a function of the number of features used. Critically, it is shown that the accuracy of a classification may decline significantly (at 0.05 level of statistical significance) with the addition of features, particularly if a small training sample is used. This highlights a dependence of the accuracy of classification by an SVM on the dimensionality of the data and, therefore, the potential value of undertaking a feature-selection analysis prior to classification. Additionally, it is demonstrated that, even when a large training sample is available, feature selection may still be useful. For example, the accuracy derived from the use of a small number of features may be noninferior (at 0.05 level of significance) to that derived from the use of a larger feature set, providing potential advantages in relation to issues such as data storage and computational processing costs. Feature selection may, therefore, be a valuable analysis to include in preprocessing operations for classification by an SVM.

Index Terms—Classification accuracy, feature selection, Hughes phenomenon, hyperspectral data, support vector machines (SVM).
I. INTRODUCTION
PROGRESS in hyperspectral sensor technology allows the measurement of radiation in the visible to infrared spectral region in many finely spaced spectral features or wavebands. Images acquired by these hyperspectral sensors provide greater detail on the spectral variation of targets than those acquired by conventional multispectral systems, providing the potential to derive more information about different objects in the area imaged [1]. Analysis and interpretation of data from these sensors present new possibilities for applications such as land-cover classification [2]. However, the availability of large amounts of data also represents a challenge to classification analyses. For example, the use of many features may require the estimation of a considerable number of parameters during the classification process [3]. Ideally, each feature (e.g., spectral waveband) used in the classification process should add an independent set of information.
Manuscript received May 12, 2009; revised September 9, 2009. First published February 22, 2010; current version published April 21, 2010. The work of Dr. Pal was supported by the Association of Commonwealth Universities with a fellowship at the University of Nottingham carried out during the period October 2008–March 2009.
M. Pal is with the National Institute of Technology, Kurukshetra 136119, India (e-mail: mpce_pal@yahoo.co.uk).
G. M. Foody is with the School of Geography, University of Nottingham, NG7 2RD Nottingham, U.K. (e-mail: giles.foody@nottingham.ac.uk).
Digital Object Identifier 10.1109/TGRS.2009.2039484
Often, however, features are highly correlated, and this can suggest a degree of redundancy in the available information which may have a negative impact on classification accuracy [4].
One problem often noted in the classification of hyperspectral data is the Hughes effect or phenomenon. The latter can have a major negative impact on the accuracy of a classification. The key characteristics of the phenomenon, assuming a fixed training set, may be illustrated for a typical scenario in which features are incrementally added to a classification analysis. Initially, classification accuracy increases with the addition of new features. The rate of increase in accuracy, however, declines, and eventually, accuracy will begin to decrease as more features are included. Although it may at first seem counterintuitive for the provision of additional discriminatory information to result in a loss of accuracy, the problem is often encountered [5]–[7] and arises as a consequence of the analysis requiring the estimation of more parameters from the (fixed) training sample. Thus, the addition of features may lead to a reduction in classification accuracy [8].
The Hughes phenomenon has been observed in many remote sensing studies based upon a range of classifiers [3], [5], [9], [10]. For example, a parametric technique, such as the maximum likelihood classifier, may not be able to classify a data set accurately if the ratio of sample size to number of features is small, as it will not be able to correctly estimate the first- and second-order statistics (i.e., mean and covariance) that are fundamental to the analysis [6]. Note that, with a fixed training set size, this ratio declines as the number of features is increased. Thus, two key attributes of the training set are its size and fixed nature. If, for example, the training set was not fixed but was instead increased appropriately with the addition of new features, the phenomenon may not occur. Similarly, if the fixed training set size was very large, the Hughes effect may not be observed even when all features of a hyperspectral sensor were used, as all parameters may be estimated adequately. Unfortunately, however, the size of the training set required for accurate parameter estimation may exceed that available to the analyst. Given that training data acquisition may be difficult and costly [11]–[13], some means to accommodate the negative issues associated with high-dimensional data sets are required.
Various approaches could be adopted for the appropriate classification of high-dimensional data. These span a spectrum from the adoption of a classifier that is relatively insensitive to the Hughes effect [14], through the use of methods to effectively increase training set size [5], [11], to the application of some form of dimensionality-reduction procedure prior to the
classification analysis. It may also sometimes be appropriate to use a combination of approaches to reduce the possibility of the Hughes effect being observed. The precise approach adopted may vary with study objectives, data sets, and classification approach. One classification method that has been claimed to be independent of the Hughes effect, and so promoted for use with hyperspectral data sets, is support vector machines (SVM) [15], although, as will be discussed later, there is some uncertainty relating to the role of feature reduction with this method.
The SVM has become a popular method for image classification. It is based on structural risk minimization and exploits a margin-based criterion that is attractive for many classification applications [16]. In comparison with approaches based on empirical risk, which minimize the misclassification error on the training set, structural risk minimization seeks the smallest probability of misclassifying a previously unseen data point drawn randomly from a fixed but unknown probability distribution. Furthermore, an SVM tries to find an optimal hyperplane that maximizes the margin between classes by using a small number of training cases, the support vectors. The complexity of the SVM depends only on these support vectors, and it is argued that the dimensionality of the input space has no importance [15], [17], [18]. This hypothesis has been supported by a range of studies with SVM, such as those employing the popular radial basis function (RBF) kernel for land-cover classification applications [19]–[21].
The basis of the SVM and the results of some studies, therefore, suggest that SVM classification may be unaffected by the dimensionality of the data set and, therefore, the number of features used. However, other studies have shown that the accuracy of SVM classification could still be increased by reducing the dimensionality of the data set [22], [23]; hence, there is a degree of uncertainty over the role of feature reduction in SVM-based classification. Feature reduction, however, impacts on more than just the accuracy of a classification. A feature-reduction analysis may be undertaken for a variety of reasons. For example, it may speed up the classification process by reducing data-set size and may increase the predictive accuracy as well as the ability to understand the classification rules [24]. It may also simply provide advantages in terms of reducing data-storage requirements. Feature reduction may, therefore, still be a useful analysis even if it has no positive effect on classification accuracy.
Two broad categories of feature-reduction techniques are commonly encountered in remote sensing: feature extraction and feature selection [25], [26]. With feature extraction, the original remotely sensed data set is typically transformed in some way that allows the definition of a small set of new features which contain the vast majority of the original data set's information. More popular, and the focus of this paper, are feature-selection methods. The latter aim to define a subset of the original features which allows the classes to be discriminated accurately. That is, feature selection typically aims to identify a subset of the original features that maintains the useful information to separate the classes, with highly correlated and redundant features excluded from the classification analysis [25].

Feature-selection procedures are dependent on the properties of the input data as well as on the classifier used [27], [28]. These procedures require that a criterion be defined by which it is possible to judge the quality of each feature in terms of its discriminating power [29]. A computational procedure is then required to search through the range of potential subsets of features and select the "best" subset of features based upon some predefined criterion. The search procedure could simply consist of an exhaustive search over all possible subsets of features since this is guaranteed to find the optimal subset. In a practical application, however, the computational requirements of this approach are unreasonably large, and a nonexhaustive search procedure is usually used [30]. A wide variety of feature-selection methods have been applied to remotely sensed data [30]–[33]. Based on whether they use classification algorithms to evaluate subsets, the different methods can be grouped into three categories: filters, wrappers, and embedded approaches. These approaches may select different subsets, and these, in turn, may vary in suitability for use as a preprocessing algorithm for different classifiers. Because of these differences and the range of reasons for undertaking a feature selection, as well as the numerous issues that influence outputs and impact on later analyses, feature selection remains a topic for research [34].
Although the literature includes claims that classification by SVM is insensitive to the Hughes effect [19]–[21], [35], it also includes case studies using simulated data [36], [37] and theoretical arguments that indicate a positive role for feature selection in SVM classification [38], [39]. Both Bengio et al. [38] and Francois et al. [39] based their arguments on the use of local kernels, such as the popular RBF, with kernel-based classifiers in which the cases lying in the neighborhood of the case being used to calculate the kernel value have a large influence [40]. In their argument, Bengio et al. [38] used the bias-variance dilemma [41] to suggest that classifiers with a local kernel would require an exponentially large training data set to achieve the same level of classification error in high-dimensional space as in a lower dimensional space, suggesting the sensitivity of the SVM classifier to the curse of dimensionality. On the other hand, Francois et al. [39] suggested that the locality of a kernel is an important property that makes the generated model more interpretable and the algorithm more stable than algorithms using global kernels. They argued that an RBF kernel loses the properties of a local kernel as the dimensionality of the feature space increases, a reason why it may be unsuitable in high-dimensional space. With the latter, for example, it has been argued that classifiers using local kernels are sensitive to the curse of dimensionality because the properties of the learned function at a case depend on its neighbors, which fails to work in high-dimensional space.

There is, therefore, uncertainty in the literature over the sensitivity of classification by an SVM to the dimensionality of the data set and, therefore, over the value of feature selection within such an analysis. This paper aims to address key aspects of this uncertainty associated with the role of feature selection in the classification of hyperspectral data sets. Specifically, this paper aims to explore the relationship between the accuracy of classification by an SVM and the dimensionality of the input data. The latter will also be controlled through the application of a series
of feature-selection methods and, therefore, also highlight the impact, if any, of different feature-selection techniques on the accuracy of SVM-based classification. Variation in the accuracy of classifications derived using feature sets of differing size will be evaluated using statistical tests of difference and noninferiority [42], [43] in order to evaluate the potential role of feature selection in SVM-based classification. This paper is, to our knowledge, the first rigorous assessment of the Hughes effect on SVM with a hyperspectral data set. Other studies (e.g., [19]–[21]) have commented on the Hughes effect in relation to the SVM-based classification of remotely sensed data, but this paper differs in that the experimental design adopted gives an opportunity for the effect to occur (e.g., by including analyses based on small training sets), and the statistical significance of differences in accuracy is evaluated rigorously (e.g., including formal tests for the difference and noninferiority of accuracy). To set the context of this paper, Section II briefly outlines classification by an SVM. Section III provides a summary of the main methods and data sets used. Section IV presents the results, and Section V details the conclusions of the research undertaken.
II. SVM
The SVM is based on statistical learning theory [14] and seeks to find an optimal hyperplane as a decision function in high-dimensional space [44], [45]. In the case of a two-class pattern-recognition problem in which the classes are linearly separable, the SVM selects from among the infinite number of linear decision boundaries the one that minimizes the generalization error. Thus, the selected decision boundary (represented by a hyperplane in feature space) will be one that leaves the greatest margin between the two classes, where margin is defined as the sum of the distances to the hyperplane from the closest cases of the two classes [14]. The problem of maximizing the margin can be solved using standard quadratic programming optimization techniques.
The simplest scenario for classification by an SVM is when the classes are linearly separable. This scenario may be illustrated with a training data set comprising k cases, represented by {x_i, y_i}, i = 1, ..., k, where x ∈ R^N is an N-dimensional space and y ∈ {−1, +1} is the class label. These training patterns are linearly separable if there exists a vector w (determining the orientation of a discriminating plane) and a scalar b (determining the offset of the discriminating plane from the origin) such that
$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 \geq 0. \qquad (1)$$
The hypothesis space can be defined by the set of functions given by
$$f_{\mathbf{w},b} = \operatorname{sign}(\mathbf{w} \cdot \mathbf{x} + b). \qquad (2)$$
The SVM finds the separating hyperplane for which the distance between the classes, measured along a line perpendicular to the hyperplane, is maximized. This can be achieved by solving the following constrained optimization problem:
$$\min_{\mathbf{w},b} \; \frac{1}{2}\|\mathbf{w}\|^2. \qquad (3)$$
For linearly nonseparable classes, the restriction that all training cases of a given class lie on the same side of the optimal hyperplane can be relaxed by the introduction of a "slack variable" ξ_i ≥ 0. In this case, the SVM searches for the hyperplane that maximizes the margin and that, at the same time, minimizes a quantity proportional to the number of misclassification errors. This tradeoff between margin and misclassification error is controlled by a positive constant C such that ∞ > C > 0. Thus, for nonseparable data, (3) can be written as
$$\min_{\mathbf{w},b,\xi_1,\ldots,\xi_k} \; \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{k}\xi_i. \qquad (4)$$

For nonlinear decision surfaces, a feature vector x ∈ R^N is mapped into a higher dimensional Euclidean space (feature space) F via a nonlinear vector function Φ: R^N → F [44]. The optimal margin problem in F can be written by replacing x_i · x_j with Φ(x_i) · Φ(x_j), which is computationally expensive. To address this problem, Vapnik [14] introduced the concept of using a kernel function K in the design of a nonlinear SVM. A kernel function is defined as
$$K(\mathbf{x}_i, \mathbf{x}_j) = \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j) \qquad (5)$$
and with the use of a kernel function, (2) becomes
$$f(\mathbf{x}) = \operatorname{sign}\left(\sum_i \lambda_i y_i K(\mathbf{x}_i, \mathbf{x}) + b\right) \qquad (6)$$
where λ_i is a Lagrange multiplier. A detailed discussion of the computational aspects of the SVM can be found in [14] and [45], with many examples also in the remote sensing literature [19], [21], [46], [47].
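Equations (1)–(6) translate directly into standard SVM libraries. The fragment below is a minimal sketch (not the authors' implementation) that fits an RBF-kernel SVM with scikit-learn and evaluates the decision function of (6); the data are random stand-ins for hyperspectral band values, and the parameter values anticipate those used later in Section III-D.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 65))            # stand-in for 65 band values per pixel
y = np.where(X[:, 0] > 0.0, 1, -1)        # two-class labels in {-1, +1}

clf = SVC(kernel="rbf", gamma=2.0, C=5000.0)
clf.fit(X, y)

# decision_function(x) evaluates the sum in (6) over the support vectors,
# i.e., sum_i lambda_i * y_i * K(x_i, x) + b; its sign gives the class.
x_new = rng.normal(size=(1, 65))
print(clf.decision_function(x_new), clf.predict(x_new))
print("support vectors used:", clf.support_vectors_.shape[0])
```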
III. DATA AND METHODS
A. Test Areas
Data sets for two study areas were used. The first study area, La Mancha Alta, lies to the south of Madrid, Spain. It is an area of Mediterranean semiarid wetland, which supports rain-fed cultivation of crops such as wheat, barley, vines, and olives. A hyperspectral image data set was acquired for the test site by the Digital Airborne Imaging Spectrometer (DAIS) 7915 sensor on June 29, 2000. The sensor was a 79-channel imaging spectrometer developed and operated by the German Space Agency [48]. This instrument operated at a spatial resolution of 5 m and acquired data in the wavelength range of 0.502–12.278 μm. Attention here focused on the data acquired in only the visible and near-infrared spectra. Thus, the data acquired in the seven features located in the mid- and thermal-infrared regions were removed. Of the remaining 72 features covering the spectral region 0.502–2.395 μm, a further seven features were
removed because of striping noise distortions in the data. The features removed were bands 41 (1.948 μm), 42 (1.9 μm), and 68–72 (2.343–2.395 μm). After these preprocessing operations, an area of 512 pixels by 512 pixels from the remaining 65 features covering the test site was extracted for further analysis.
The second study area was a region of agricultural land in Indiana, U.S. For this site, a hyperspectral data set acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) was used. This data set is available online from [49]. The data set consists of a scene of size 145 × 145 pixels. Of the 220 spectral bands acquired by the AVIRIS sensor, 35 were removed as they were affected by noise. For ease of presentation, the bands used were renumbered 1–65 and 1–185 in order of increasing wavelength for the DAIS and AVIRIS data sets, respectively.
B. Training and Testing Data Sets
For the DAIS data set, field observations of the test site were undertaken in late June 2001, exactly one year after the image data were acquired, to generate a ground-reference data set. Visual examination of the DAIS imagery combined with field experience showed that the region comprised mainly eight land-cover types: wheat, water, salt lake, hydrophytic vegetation, vineyards, bare soil, pasture, and built-up land. A ground-reference image was generated from the field information. With the AVIRIS data set, a ground-reference image available from [49] was used to collect the training and test pixels for a total of nine land-cover classes (corn-notill, corn-mintill, grass/pasture, grass/trees, hay-windrowed, soybeans-notill, soybeans-mintill, soybean-clean, and woods). Stratified random sampling, by class, was undertaken in order to collect independent data sets for training (up to 100 pixels per class) and testing the SVM classifications of the DAIS and AVIRIS data sets.
To evaluate the sensitivity of the SVM to the Hughes effect, a series of training sets of differing sample size was acquired. These data sets were formed by selecting cases randomly from the total available for training each class. A total of six training set sizes, comprising 8, 15, 25, 50, 75, and 100 pixels per class, was used. These training samples are typical of the sizes used in remote sensing studies (e.g., [26], [46], and [50]–[53]) but critically also include small sizes at which the Hughes effect would be expected to manifest itself, if at all. For each size of training set, except that using all 100 pixels available for each class, five independent samples were derived from the available training data. Each of the five training sets of a given size was used to train a classification, and to avoid extreme results, the main focus here is on the classification with the median accuracy.
SVM classifications using training sets of differing sizes were undertaken in which the dimensionality of the input data set, indicated by the number of features used, was varied. Since the main concern was to determine if the Hughes effect would be observed, and not the design of an optimal classification, most attention focused on the scenario in which the features were entered in a single, fixed fashion for comparative purposes. With this, features were added incrementally in groups of five in order of wavelength. Thus, the first analysis used features 1–5, the second features 1–10, and so on until all features were used, in the 13th and 37th analyses with the DAIS and AVIRIS data, respectively. A number of additional analyses were undertaken with the DAIS data in which features were added individually in order of decreasing discriminatory power (i.e., the feature estimated to provide the most discriminatory information was entered first, and that which provided the least discriminatory information was added last). Irrespective of the method of incrementing features, the accuracy with which an independent testing set was classified was calculated at each incremental step.
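As a concrete illustration of this incremental design, the following is a minimal sketch assuming wavelength-ordered arrays X_train/X_test and label vectors y_train/y_test (hypothetical names); the kernel parameters are the DAIS values reported in Section III-D.

```python
import numpy as np
from sklearn.svm import SVC

def incremental_accuracies(X_train, y_train, X_test, y_test, step=5):
    """Train an SVM on the first k features (k = 5, 10, ...) in wavelength
    order and record the test accuracy (%) at each incremental step."""
    accuracies = []
    for k in range(step, X_train.shape[1] + 1, step):
        clf = SVC(kernel="rbf", gamma=2.0, C=5000.0)   # DAIS settings
        clf.fit(X_train[:, :k], y_train)
        accuracies.append(100.0 * clf.score(X_test[:, :k], y_test))
    return accuracies
```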
Classification accuracy was estimated using a testing set that comprised a sample of 3800 pixels (500 pixels for seven classes and 300 pixels for the relatively scarce pasture class) with the DAIS data and 3150 pixels (350 pixels per class) with the AVIRIS data sets. In all cases, accuracy was expressed as the percentage of correctly allocated cases. The statistical significance of differences in accuracy was assessed using the McNemar test and confidence intervals [43], [54], [55].

Two types of test were undertaken to elucidate the effect of feature selection on SVM classification accuracy. First, the statistical significance of differences in accuracy was evaluated. This testing was undertaken because one characteristic feature of an analysis that is sensitive to the Hughes effect is a decrease in accuracy following the inclusion of additional features. Thus, the detection of a statistically significant decrease in classification accuracy following the addition of features to the analysis would be an indication of sensitivity to the Hughes effect. A standard one-sided (as the focus is on a directional alternative hypothesis) test of the difference in accuracy values was derived using the McNemar test [55]. However, as feature selection has positive impacts beyond those associated with classification accuracy (e.g., reduced data-processing time and storage requirements), a positive role would also occur if a small feature set could be used without any significant loss of classification accuracy. This cannot be assessed with a test for difference, as a result indicating no significant difference in accuracy is not actually proof of similarity [56]. Indeed, in this situation, the desire is not to test for a significant difference in accuracy but rather to test for similarity in accuracy, which could be met in this situation through the application of a test for noninferiority [42], [43]. In essence, the aim is to determine if a small feature set, which provides advantages to the analyst, can be used to derive a classification as accurate as that from a large, or indeed full, feature set. The latter test for noninferiority was achieved using the confidence interval fitted to the estimated differences in classification accuracy [43]. For the purpose of this paper, it was assumed that a 1.00% decline in accuracy from the peak value was of no practical significance, and this value is taken to define the extent of the zone of indifference in the test. Critically, a positive role for feature-selection analyses would be indicated if the test for difference was significant (showing that accuracy can be degraded by the addition of new features) and/or if the test for noninferiority was significant (showing that a small feature set derives a classification as accurate as that from the use of a large feature set while providing advantages in relation to data storage and processing, etc.).
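To make the two tests concrete, the following is a simplified sketch assuming Boolean vectors correct_a/correct_b that flag which test pixels each classification labeled correctly (hypothetical names). The McNemar statistic follows the standard discordant-pair form; the noninferiority bound here treats the two accuracies as independent proportions for brevity, which is a simplification of the paired confidence-interval formulation used in the paper [43].

```python
import numpy as np
from scipy.stats import norm

def mcnemar_z(correct_a, correct_b):
    """One-sided McNemar statistic computed from the discordant pixels only."""
    f_ab = int(np.sum(correct_a & ~correct_b))   # right under A, wrong under B
    f_ba = int(np.sum(~correct_a & correct_b))   # wrong under A, right under B
    return (f_ab - f_ba) / np.sqrt(f_ab + f_ba)

def noninferior(correct_small, correct_peak, margin=0.01, alpha=0.05):
    """Is the small feature set's accuracy within the 1% zone of indifference
    below the peak accuracy? (Simplified, unpaired form of the interval.)"""
    n = len(correct_peak)
    p_s, p_p = correct_small.mean(), correct_peak.mean()
    d = p_s - p_p                                   # difference in accuracy
    se = np.sqrt((p_s * (1 - p_s) + p_p * (1 - p_p)) / n)
    return d - norm.ppf(1 - alpha) * se > -margin   # one-sided lower bound
```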
C. Feature-Selection Algorithms
From the range of feature-selection methods available, four established methods, including one from each of the main categories of methods identified earlier, were applied to the DAIS data. The salient issues of each method are briefly outlined next.

1) SVM Recursive Feature Elimination (SVM-RFE): The SVM-RFE is a wrapper-based approach utilizing the SVM as the base classifier [22]. The SVM-RFE utilizes the objective function (1/2)||w||^2 as a feature-ranking criterion to produce a list of features ordered by apparent discriminatory ability. At each step, the coefficients of the weight vector w are used to compute the ranking scores of all remaining features. The feature with the smallest ranking score (w_i)^2 is eliminated, where w_i represents the corresponding ith component of w. This approach to feature selection, therefore, uses a backward feature-elimination scheme to recursively remove insignificant features (i.e., at each step, the feature whose removal changes the objective function least is excluded) from subsets of features in order to derive a list of all features in ranked order of value.
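A minimal sketch of the SVM-RFE ranking, using scikit-learn's RFE with a linear-kernel SVM (which exposes the weight vector w); the data are synthetic stand-ins, and this is an illustration of the (w_i)^2 elimination criterion rather than the implementation used in the paper.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 65))        # stand-in band values
y_train = (X_train[:, 0] > 0).astype(int)   # stand-in two-class labels

# Linear kernel so that the weight vector w (coef_) is available for ranking.
svc = SVC(kernel="linear", C=5000.0)

# Recursively drop the feature with the smallest weight magnitude until
# every feature has been ranked.
rfe = RFE(estimator=svc, n_features_to_select=1, step=1)
rfe.fit(X_train, y_train)

ranking = rfe.ranking_                      # 1 = most valuable feature
```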
2) Correlation-Based Feature Selection (CFS): The CFS is a filter algorithm that selects a feature subset on the basis of a correlation-based heuristic evaluation function [57]. The heuristic by which CFS measures the quality of a set of features takes into account the usefulness of individual features for predicting the class and can be summarized as

$$\frac{f \bar{C}_{ci}}{\sqrt{f + f(f-1)\bar{C}_{ii}}} \qquad (7)$$

where f is the number of features in the subset, C̄_ci is the mean feature correlation with the class, and C̄_ii is the average feature intercorrelation. Both C̄_ci and C̄_ii are calculated using a measure based on conditional entropy [58]. The numerator provides an indication of how predictive of the class a group of features is, whereas the denominator indicates the redundancy among the features. The evaluation criterion used in this algorithm is biased toward feature subsets that are highly predictive of the class and not predictive of each other. This criterion acts to filter out irrelevant features, as they have low correlations with the class, and to ignore redundant features, as they will be highly correlated with one or more of the other features, thus providing a subset of best selected features. In order to reduce the computational cost, a bidirectional search (a parallel implementation of sequential forward and backward selections) may be used. This approach searches the space of feature subsets by greedy hill climbing in such a way that features already selected by sequential forward selection are not removed by backward selection, and features already removed by backward selection are not selected by forward selection.
3) Minimum-Redundancy–Maximum-Relevance (mRMR): The mRMR feature selection is a filter-based method that uses mutual information to determine the dependence between the features [59]. The mRMR uses a criterion which selects features that are different from each other and still have the largest dependence on the target class. This approach consists in selecting the feature f_i among the not yet selected features f_S that maximizes (u_i − r_i), where u_i is the relevance of f_i to the class c alone and r_i is the mean redundancy of f_i to each of the already selected features. In terms of mutual information, u_i and r_i can be defined as

$$u_i = \frac{1}{|f|}\sum_{f_i \in f} I(f_i; c) \qquad (8)$$

$$r_i = \frac{1}{|f|^2}\sum_{f_j \in f} I(f_i, f_j) \qquad (9)$$

where I(f; c) is the mutual information between the two random variables f and c. At each step, this method selects the feature that has the best compromise of relevance and redundancy and can be used to produce a ranked list of all features in terms of discriminating ability.
4) Random Forest: The random-forest-based approach is an embedded method of feature selection. The random forest consists of a collection of decision-tree classifiers [60], where each tree in the forest has been trained using a bootstrap sample of the training data and a random subset of features sampled independently from the input features. A subset of the training data set is omitted from the training of each classifier [61]. These left-out data are called out-of-bag (out of the bootstrap) samples and are used for feature selection by determining the importance of different features during the classification process [60], [62]. The latter is based on a Z score, which can be used to assign a significance level (importance level) to a feature, and from this, a ranked list of all features may be derived [60].
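A sketch of the embedded random-forest ranking, using the settings reported for the DAIS data in Section III-D (100 trees, 13 candidate features per split). Note that scikit-learn's feature_importances_ is an impurity-based measure rather than the permutation-based Z score of [60], so the ranking criterion differs in detail; synthetic stand-in data are used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 65))          # stand-in band values
y_train = (X_train[:, 0] > 0).astype(int)     # stand-in labels

# 100 trees and 13 candidate features per split, as reported for DAIS.
rf = RandomForestClassifier(n_estimators=100, max_features=13,
                            oob_score=True, random_state=0)
rf.fit(X_train, y_train)

ranked = np.argsort(rf.feature_importances_)[::-1]  # most important first
print("out-of-bag accuracy estimate:", rf.oob_score_)
```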
D. Methods

SVMs were initially designed for binary classification problems. A range of methods has been suggested for multiclass classification [21], [63], [64]. One of these, the "one-against-one" approach, was used here [65] with both hyperspectral data sets. Throughout, an RBF kernel was used with kernel width parameter γ = 2 and C = 5000, values which were used successfully with the DAIS hyperspectral data set in other studies [19], [20], [33], [66]. For analyses of the AVIRIS data set, an RBF kernel with γ = 1 and regularization parameter C = 50 was used [66].
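For reference, these configurations map directly onto scikit-learn's SVC, which implements the one-against-one scheme for multiclass problems internally; this is a sketch of the stated settings, not the original implementation.

```python
from sklearn.svm import SVC

# DAIS configuration: RBF kernel, gamma = 2, C = 5000.
svm_dais = SVC(kernel="rbf", gamma=2.0, C=5000.0)

# AVIRIS configuration: RBF kernel, gamma = 1, C = 50.
svm_aviris = SVC(kernel="rbf", gamma=1.0, C=50.0)
```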
With the feature selection by random forests, one-third of the total data set available for training was used to form the out-of-bag sample. The random-forest classifier also requires finding the optimal value of the number of features used to generate a tree as well as the total number of trees. After several trials, 13 features and 100 trees were found to work well with the DAIS data set [33].
IV. RESULTS
The accuracy of classification by an SVM varied as a function of the number of features used and the size of the training set using the DAIS data set (Fig. 1). In general terms, classification accuracy tended to increase with an increase in the number of features. Critically, however, when a fixed training set of small size (≤25 cases per class) was used, the accuracy initially rose with the addition of features to a peak, but thereafter declined with the addition of further features. Moreover, the
TABLE III
RESULTS OF THE APPLICATION OF THE FOUR FEATURE-SELECTION METHODS USING THE DAIS DATA SET, HIGHLIGHTING THE CHARACTERISTICS OF THE CLASSIFICATION BASED ON EACH TRAINING SET SIZE THAT WAS OF MOST COMPARABLE ACCURACY WITH THAT DERIVED WITHOUT FEATURE SELECTION
TABLE VI
DIFFERENCE AND NONINFERIORITY TEST RESULTS BASED ON 95% CONFIDENCE INTERVAL ON THE ESTIMATED DIFFERENCE IN ACCURACY FROM THE PEAK VALUE FOR FEATURE SETS SELECTED WITH THE SVM-RFE USING THE DAIS DATA SET: BASED ON TRAINING SET OF 100 CASES PER CLASS WITH PEAK ACCURACY OF 93.13% WITH 35 FEATURES
training sets (≤25 cases per class). However, even with a large training sample using the DAIS data set, feature selection may have a positive role, providing a reduced data set that may be used to yield a classification of similar accuracy to that derived from the use of a much larger feature set. As the accuracy of the SVM classification was dependent on the dimensionality of the data set and the size of the training set, it may, therefore, be beneficial to undertake a feature-selection analysis prior to a classification analysis. The results, however, also highlight that the choice of the feature-selection method may be important. For example, the results derived from analyses with the four different feature-selection methods show that the number of features selected varied greatly.
ACKNOWLEDGMENT
The authors would like to thank Prof. J. Gumuzzio of the Autonomous University of Madrid, Spain, for making available the DAIS data that were collected and processed by DLR, and also the three referees for their constructive comments on the original version of this paper. M. Pal would like to thank the School of Geography, University of Nottingham, for the computing facilities.
REFERENCES
[1] C.-I Chang, Hyperspectral Data Exploitation: Theory and Applications. Hoboken, NJ: Wiley, 2007.
[2] J. B. Campbell, Introduction to Remote Sensing, 3rd ed. New York: Guilford Press, 2002.
[3] J. A. Benediktsson and J. R. Sveinsson, "Feature extraction for multisource data classification with artificial neural networks," Int. J. Remote Sens., vol. 18, no. 4, pp. 727–740, Mar. 1997.
[4] P. Zhong, P. Zhang, and R. Wang, "Dynamic learning of SMLR for feature selection and classification of hyperspectral data," IEEE Geosci. Remote Sens. Lett., vol. 5, no. 2, pp. 280–284, Apr. 2008.
[5] B. M. Shahshahani and D. A. Landgrebe, "The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon," IEEE Trans. Geosci. Remote Sens., vol. 32, no. 5, pp. 1087–1095, Sep. 1994.
[6] S. Tadjudin and D. A. Landgrebe, "Covariance estimation with limited training samples," IEEE Trans. Geosci. Remote Sens., vol. 37, no. 4, pp. 2113–2118, Jul. 1999.
[7] M. Chi, R. Feng, and L. Bruzzone, "Classification of hyperspectral remote-sensing data with primal SVM for small-sized training dataset problem," Adv. Space Res., vol. 41, no. 4, pp. 1793–1799, 2008.
[8] G. F. Hughes, "On the mean accuracy of statistical pattern recognizers," IEEE Trans. Inf. Theory, vol. IT-14, no. 1, pp. 55–63, Jan. 1968.
[9] S. Lu, K. Oki, Y. Shimizu, and K. Omasa, "Comparison between several feature extraction/classification methods for mapping complicated agricultural land use patches using airborne hyperspectral data," Int. J. Remote Sens., vol. 28, no. 5, pp. 963–984, Jan. 2007.
[10] S. Tadjudin and D. A. Landgrebe, "A decision tree classifier design for high-dimensional data with limited training samples," in Proc. IEEE Geosci. Remote Sens. Symp., May 27–31, 1996, vol. 1, pp. 790–792.
[11] M. Chi and L. Bruzzone, "A semilabeled-sample-driven bagging technique for ill-posed classification problems," IEEE Geosci. Remote Sens. Lett., vol. 2, no. 1, pp. 69–73, Jan. 2005.
[12] P. Mantero, G. Moser, and S. B. Serpico, "Partially supervised classification of remote sensing images through SVM-based probability density estimation," IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 559–570, Mar. 2005.
[13] G. M. Foody and A. Mathur, "Toward intelligent training of supervised image classifications: Directing training data acquisition for SVM classification," Remote Sens. Environ., vol. 93, no. 1/2, pp. 107–117, Oct. 2004.
[14] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
[15] C. Cortes and V. N. Vapnik, "Support-vector networks," Mach. Learn., vol. 20, no. 3, pp. 273–297, Sep. 1995.
[16] V. N. Vapnik, Estimation of Dependences Based on Empirical Data. New York: Springer-Verlag, 1982.
[17] D. M. J. Tax, D. de Ridder, and R. P. W. Duin, "Support vector classifiers: A first look," in Proc. 3rd Annu. Conf. Adv. School Comput. Imaging, H. E. Bal, H. Corporaal, P. P. Jonker, and J. F. M. Tonino, Eds., Heijen, The Netherlands, Jun. 2–4, 1997, pp. 253–258.
[18] J. A. Gualtieri, "The support vector machine (SVM) algorithm for supervised classification of hyperspectral remote sensing data," in Kernel Methods for Remote Sensing Data Analysis, G. Camps-Valls and L. Bruzzone, Eds. Chichester, U.K.: Wiley, 2009.
[19] M. Pal and P. M. Mather, "Assessment of the effectiveness of support vector machines for hyperspectral data," Future Generation Comput. Syst., vol. 20, no. 7, pp. 1215–1225, Oct. 2004.
[20] M. Pal and P. M. Mather, "Some issues in classification of DAIS hyperspectral data," Int. J. Remote Sens., vol. 27, no. 14, pp. 2895–2916, Jul. 2006.
[21] F. Melgani and L. Bruzzone, "Classification of hyperspectral remote sensing images with support vector machines," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1790, Aug. 2004.
[22] I. Guyon, J. Weston, S. Barnhill, and V. N. Vapnik, "Gene selection for cancer classification using support vector machines," Mach. Learn., vol. 46, no. 1–3, pp. 389–422, Jan. 2002.
[23] A. Gidudu and H. Ruther, "Comparison of feature selection techniques for SVM classification," in Proc. 10th Int. Symp. Phys. Meas. Spectral Signatures Remote Sens., vol. XXXVI, Int. Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, M. E. Schaepman, S. Liang, N. E. Groot, and M. Kneubühler, Eds., Davos, Switzerland, 2007, pp. 258–263.
[24] H. Liu, "Evolving feature selection," IEEE Intell. Syst., vol. 20, no. 6, pp. 64–76, Nov. 2005.
[25] H. Liu and H. Motoda, Feature Extraction, Construction and Selection: A Data Mining Perspective. Norwell, MA: Kluwer, 1998.
[26] P. M. Mather, Computer Processing of Remotely-Sensed Images: An Introduction, 3rd ed. Chichester, U.K.: Wiley, 2004.
[27] R. Kohavi and G. H. John, "Wrappers for feature subset selection," Artif. Intell., vol. 97, no. 1/2, pp. 273–324, Mar. 1997.
[28] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," J. Mach. Learn. Res., vol. 3, no. 7/8, pp. 1157–1182, Mar. 2003.
[29] M. Dash and H. Liu, "Feature selection for classification," Intell. Data Anal., Int. J., vol. 1, no. 3, pp. 131–156, 1997.
[30] A. Jain and D. Zongker, "Feature selection: Evaluation, application, and small sample performance," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 2, pp. 153–158, Feb. 1997.
[31] T. Kavzoglu and P. M. Mather, "The role of feature selection in artificial neural network applications," Int. J. Remote Sens., vol. 23, no. 15, pp. 2787–2803, Aug. 2002.
[32] S. B. Serpico and L. Bruzzone, "A new search algorithm for feature selection in hyperspectral remote sensing images," IEEE Trans. Geosci. Remote Sens., vol. 39, no. 7, pp. 1360–1367, Jul. 2001.
[33] M. Pal, "Support vector machine-based feature selection for land cover classification: A case study with DAIS hyperspectral data," Int. J. Remote Sens., vol. 27, no. 14, pp. 2877–2894, Jul. 2006.
[34] J. Loughrey and P. Cunningham, "Overfitting in wrapper-based feature subset selection: The harder you try the worse it gets," in Research and Development in Intelligent Systems XXI, M. Bramer, F. Coenen, and T. Allen, Eds. London, U.K.: Springer-Verlag, 2004, pp. 33–43.
[35] G. H. Halldorsson, J. A. Benediktsson, and J. R. Sveinsson, "Source-based feature extraction for support vector machines in hyperspectral classification," in Proc. IEEE Geosci. Remote Sens. Symp., Sep. 20–24, 2004, vol. 1, pp. 536–539.
[36] O. Barzilay and V. L. Brailovsky, "On domain knowledge and feature selection using a support vector machine," Pattern Recognit. Lett., vol. 20, no. 5, pp. 475–484, May 1999.
[37] A. Navot, R. Gilad-Bachrach, Y. Navot, and N. Tishby, "Is feature selection still necessary?" in Lecture Notes in Computer Science, vol. 3940. Berlin, Germany: Springer-Verlag, 2006, pp. 127–138.
[38] Y. Bengio, O. Delalleau, and N. Le Roux, "The curse of highly variable functions for local kernel machines," in Advances in Neural Information Processing Systems, vol. 18. Cambridge, MA: MIT Press, 2006, pp. 107–114.
[39] D. Francois, V. Wertz, and M. Verleysen, "About the locality of kernels in high-dimensional space," in Proc. Int. Symp. Appl. Stochastic Models Data Anal., Brest, France, May 17–20, 2005, pp. 238–245.
[40] B. Scholkopf, S. Mika, C. J. C. Burges, P. Knirsch, K. R. Muller, G. Ratsch, and A. J. Smola, "Input space versus feature space in kernel-based methods," IEEE Trans. Neural Netw., vol. 10, no. 5, pp. 1000–1017, Sep. 1999.
[41] S. Geman, E. Bienenstock, and R. Doursat, "Neural networks and the bias/variance dilemma," Neural Comput., vol. 4, no. 1, pp. 1–58, Jan. 1992.
[42] J. L. Fleiss, B. Levin, and M. C. Paik, Statistical Methods for Rates & Proportions, 3rd ed. New York: Wiley-Interscience, 2003.
[43] G. M. Foody, "Classification accuracy comparison: Hypothesis tests and the use of confidence intervals in evaluations of difference, equivalence and non-inferiority," Remote Sens. Environ., vol. 113, no. 8, pp. 1658–1663, Aug. 2009.
[44] B. Boser, I. Guyon, and V. N. Vapnik, "A training algorithm for optimal margin classifiers," in Proc. 5th Annu. Workshop Comput. Learn. Theory, 1992, pp. 144–152.
[45] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge, U.K.: Cambridge Univ. Press, 2000.
[46] G. M. Foody and A. Mathur, "A relative evaluation of multiclass image classification by support vector machines," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 6, pp. 1335–1343, Jun. 2004.
[47] G. Camps-Valls and L. Bruzzone, Kernel Methods for Remote Sensing Data Analysis. Chichester, U.K.: Wiley, 2009.
[48] P. Strobl, R. Richter, F. Lehmann, A. Mueller, B. Zhukov, and D. Oertel, "Preprocessing for the airborne imaging spectrometer DAIS 7915," Proc. SPIE, vol. 2758, pp. 375–382, Jun. 1996.
[49] AVIRIS NW Indiana's Indian Pines 1992 data set, ftp://ftp.ecn.purdue.edu/biehl/MultiSpec/92AV3C.lan (original files) and ftp://ftp.ecn.purdue.edu/biehl/PC_MultiSpec/ThyFiles.zip (ground truth).
[50] G. M. Foody and M. K. Arora, "An evaluation of some factors affecting the accuracy of classification by an artificial neural network," Int. J. Remote Sens., vol. 18, no. 4, pp. 799–810, Mar. 1997.
[51] G. M. Foody, A. Mathur, C. Sanchez-Hernandez, and D. S. Boyd, "Training set size requirements for the classification of a specific class," Remote Sens. Environ., vol. 104, no. 1, pp. 1–14, Sep. 2006.
[52] M. Pal and P. M. Mather, "An assessment of the effectiveness of decision tree methods for land cover classification," Remote Sens. Environ., vol. 86, no. 4, pp. 554–565, Oct. 2003.
[53] T. G. Van Niel, T. R. McVicar, and B. Datt, "On the relationship between training sample size and data dimensionality of broadband multi-temporal classification," Remote Sens. Environ., vol. 98, no. 4, pp. 468–480, Oct. 2005.
[54] T. G. Dietterich, "Approximate statistical tests for comparing supervised classification learning algorithms," Neural Comput., vol. 10, no. 7, pp. 1895–1923, Oct. 1998.
[55] G. M. Foody, "Thematic map comparison: Evaluating the statistical significance of differences in classification accuracy," Photogramm. Eng. Remote Sens., vol. 70, no. 5, pp. 627–633, May 2004.
[56] D. G. Altman and J. M. Bland, "Absence of evidence is not evidence of absence," Brit. Med. J., vol. 311, no. 7003, p. 485, Aug. 1995.
[57] M. A. Hall and L. A. Smith, "Feature subset selection: A correlation-based filter approach," in Proc. Int. Conf. Neural Inf. Process. Intell. Inf. Syst., 1997, pp. 855–858.
[58] W. H. Press, Numerical Recipes. Cambridge, U.K.: Cambridge Univ. Press, 1988.
[59] H. Peng, F. Long, and C. Ding, "Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 8, pp. 1226–1238, Aug. 2005.
[60] L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5–32, Oct. 2001.
[61] L. Breiman, "Bagging predictors," Mach. Learn., vol. 24, no. 2, pp. 123–140, Aug. 1996.
[62] R. Díaz-Uriarte and S. A. de Andrés, "Gene selection and classification of microarray data using random forest," BMC Bioinf., vol. 7, no. 1, p. 3, 2006.
[63] C.-W. Hsu and C.-J. Lin, "A comparison of methods for multi-class support vector machines," IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 415–425, Mar. 2002.
[64] M. Pal, "Multiclass approaches for support vector machine based land cover classification," in Proc. 8th Annu. Int. Conf., Map India, 2005. [Online]. Available: http://www.gisdevelopment.net/technology/rs/mi0554.htm [Accessed: Dec. 12, 2008].
[65] S. Knerr, L. Personnaz, and G. Dreyfus, "Single-layer learning revisited: A stepwise procedure for building and training a neural network," in Neurocomputing: Algorithms, Architectures and Applications. Berlin, Germany: Springer-Verlag, 1990.
[66] M. Pal, "Margin-based feature selection for hyperspectral data," Int. J. Appl. Earth Observ. Geoinf., vol. 11, no. 3, pp. 212–220, Jun. 2009.
[67] T. M. Cover, "The best two independent measurements are not the two best," IEEE Trans. Syst., Man, Cybern., vol. SMC-4, no. 1, pp. 116–117, Jan. 1974.
Mahesh Pal received the Ph.D. degree from the University of Nottingham, Nottingham, U.K., in 2002. He is currently an Associate Professor with the Department of Civil Engineering, National Institute of Technology, Kurukshetra, India. His major research areas are land-cover classification, feature selection, and the application of artificial intelligence techniques in various civil engineering applications. Dr. Pal is on the editorial board of the recently launched journal Remote Sensing Letters.
Giles M. Foody (M'01) received the B.Sc. and Ph.D. degrees in geography from the University of Sheffield, Sheffield, U.K., in 1983 and 1986, respectively.
He is currently a Professor of geographical information science with the University of Nottingham, Nottingham, U.K. His main research interests focus on the interface between remote sensing, ecology, and informatics.
Dr. Foody is currently the Editor-in-Chief of the International Journal of Remote Sensing and of the recently launched journal Remote Sensing Letters. He holds editorial roles with Landscape Ecology and Ecological Informatics and serves on the editorial board of several other journals. He was the recipient of the Remote Sensing and Photogrammetry Society's Award, its highest award, for services to remote sensing in 2009.