Expert Systems with Applications 33 (2007) 86–95
www.elsevier.com/locate/eswa
Semantic-based facial expression recognition using analytical hierarchy process
Shyi-Chyi Cheng a,*, Ming-Yao Chen b, Hong-Yi Chang b, Tzu-Chuan Chou c
a Department of Computer Science, National Taiwan Ocean University, Taiwan
b Department of Computer and Communication Engineering, National Kaohsiung First University of Science and Technology, Taiwan
c Department of Information Management, National Taiwan University of Science and Technology, Taiwan
Abstract

In this paper we present an automatic facial expression recognition system that utilizes a semantic-based learning algorithm using the analytical hierarchy process (AHP). All automatic facial expression recognition methods are similar in that they first extract some low-level features from the images or video, then these features are used as inputs into a classification system, and the outcome is one of the preselected emotion categories. Although the effectiveness of low-level features in automatic facial expression recognition systems has been widely studied, their success is shadowed by the innate discrepancy between machine and human perception of the image. The gap between low-level visual features and high-level semantics should be bridged in a proper way in order to construct a seamless automatic facial expression system satisfying user perception. For this purpose, we use the AHP to provide a systematic way to evaluate the fitness of a semantic description for interpreting the emotion of a face image. A semantic-based learning algorithm is also proposed to adapt the weights of low-level visual features for automatic facial expression recognition. The weights are chosen such that the discrepancy between the facial expression recognition results obtained in terms of low-level features and the high-level semantic description is small. In the recognition phase, only the low-level features are used to classify the emotion of an input face image. The proposed semantic learning scheme provides a way to bridge the gap between the high-level semantic concept and the low-level features for automatic facial expression recognition. Experimental results show that the performance of the proposed method is excellent when compared with that of traditional facial expression recognition methods.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Facial expression recognition; Low-level visual feature; High-level semantic concept; Analytical hierarchy process; Semantic learning
1. Introduction
The common methods for most current human–computer interaction (HCI) techniques are through modalities such as key press, mouse movement, or speech input. These HCI techniques do not provide natural, human-to-human-like communication. The information about emotions and the mental state of a person contained in human faces is usually ignored. Due to the advances of artificial intelligence techniques in the past decades, it is possible to enable communication with computers in a natural way,
* Corresponding author. Tel.: +886 2 24622192; fax: +886 2 24623249. E-mail address: csc@mail.ntou.edu.tw (S.-C. Cheng).
similar to everyday interaction between people, using an automatic facial expression recognition system (Fasel & Luettin, 2003).
Since the early 1970s, Ekman and Friesen (Ekman & Friesen, 1978) have performed extensive studies of human facial expressions and defined six basic emotions (happiness, sadness, fear, disgust, surprise, and anger). Each of these six basic emotions corresponds to a unique facial expression. They also defined the Facial Action Coding System (FACS), a system that provides a systematic way to analyze facial expressions through standardized coding of changes in facial motion. FACS consists of 46 Action Units (AUs) which describe basic facial movements. Ekman's work inspired many researchers to analyze facial features
0957-4174/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2006.04.019
using image and video processing. By tracking facial features and measuring the amount of facial movement, they attempt to categorize different facial expressions. Based on these basic expressions or a subset of them, Suwa, Sugie, and Fujimora (1978) and Mase and Pentland (1991) performed early work on automatic facial expression analysis. Detailed reviews of much of the recent work on facial expression analysis can be found in Fasel and Luettin (2003) and Pantic and Rothkrantz (2000). All these methods are similar in that they first extract some features from the image or video, then these features are used as inputs into a classification system, and the outcome is one of the preselected emotion categories. They differ mainly in the features extracted and in the classifiers used to classify an input face image.

Facial features used for automatic facial expression analysis can be obtained using image processing techniques. In general, the dimensionality of the low-level visual features used to describe a facial expression is high. Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), the Discrete Cosine Transform (DCT), the Wavelet Transform, etc., are the commonly used techniques for data reduction and feature extraction (Calder, Burton, Miller, & Young, 2001; Draper, Baek, Bartlett, & Beveridge, 2003; Jeng, Yao, Han, Chern, & Liu, 1993; Lyons, Budynek, & Akamatsu, 1999; Martinez & Kak, 2001; Saxena, Anand, & Mukerjee, 2004). Such visual features contain the most discriminative information and provide more reliable training of classification systems. It is important to normalize the values that correspond to facial feature changes using the facial features extracted from the person's neutral face in order to construct a person-independent automatic facial expression recognition system. FACS has been used to describe visual features in facial expression recognition systems (Tian, Kanade, & Cohn, 2001). Furthermore, the low-level facial features, the Facial Animation Parameters (FAPs) supported by the MPEG-4 standard, are also widely used in automatic facial expression recognition (Aleksic & Katsaggelos, 2004; Donato, Hager, Bartlett, Ekman, & Sejnowski, 1999; Essa & Pentland, 1997; Pardas & Bonafonte, 2002). Fig. 1 shows the FAPs that contain significant information about facial expressions, controlling eyebrow (group 4) and mouth movement (group 8) (Text for ISO/IEC FDIS 14496-2 Visual, 1998).
In recent work, the approaches for automatic facial feature recognition can be classified into three categories
Fig. 1. Outer-lip and eyebrow FAPs (Tian et al., 2001).
(Fasel & Luettin, 2003). In the image-based approach, the whole face image, or images of parts of the face, are processed in order to obtain visual features. The weightings of different parts of the face should differ to improve the performance. For example, nose movement obviously contains less information about facial expressions than eyebrow and mouth movement. Hence, the weighting of nose movement should be decreased in order to improve the recognition accuracy. On the basis of deformation extraction, the facial expression recognition process is conducted through the deformation information of each part of the face. The models used to extract deformation information include the Active Shape Model and the Point Distribution Model. The common process for these models is to estimate the motion vectors of the feature points. The motion vectors are then used to recognize facial expressions. The disadvantages of this approach include: (1) the feature points are usually sensitive to noise (e.g., lighting changes) and hence unstable; (2) the computational complexity of motion estimation is high. In the geometric-analysis approach, the shape and position of each part of the face are used to represent the face for expression classification and recognition.
Facial expression recognition is performed by a classifier, which often consists of models of pattern distribution coupled to a decision procedure. A wide range of classifiers, covering parametric as well as non-parametric techniques, has been applied to the automatic facial expression recognition problem (Fasel & Luettin, 2003; Pantic & Rothkrantz, 2000). Neural networks (Tian et al., 2001), hidden Markov models (Aleksic & Katsaggelos, 2004; Pardas & Bonafonte, 2002), k-nearest neighbor classifiers (Bourel, Chibelushi, & Low, 2002), etc. are commonly used to perform classification.
Although the rapid advance of face image processing techniques, such as face detection and face recognition, provides a good starting point for facial expression analysis, the semantic gap between low-level visual features and high-level user perception remains a challenge to constructing an effective automatic facial expression recognition system. Facial features suffer a high degree of variability due to a number of factors, such as differences across people (arising from age, illness, gender, or race, for example), growth or shaving of beards or facial hair, make-up, blending of several expressions, and superposition of speech-related facial deformation onto affective deformation (Bourel et al., 2002). Low-level visual features are usually unstable due to the variation of imaging conditions. It is therefore very important to introduce semantic knowledge into automatic facial expression recognition systems in order to improve the recognition rate. However, research into automatic facial expression recognition systems capable of adapting their knowledge periodically or continuously has not received much attention. Incorporating adaptation into the recognition framework is a feasible approach to improving the robustness of the system under adverse conditions.
In this paper we present an automatic facial expression recognition system that utilizes a semantic-based learning algorithm using the analytical hierarchy process (AHP) (Min, 1994; Saaty, 1980). In general, human emotions are hard to represent using only low-level visual features due to the lack of facial image understanding models. Although the effectiveness of low-level features in automatic facial expression recognition systems has been widely studied, their success is shadowed by the innate discrepancy between machine and human perception of the image. The gap between low-level visual features and high-level semantics should be bridged in a proper way in order to construct a seamless automatic facial expression system satisfying user perception. For this purpose, we use the AHP to provide a systematic way to evaluate the fitness of a semantic description for interpreting the emotion of a face image. A semantic-based learning algorithm is also proposed to adapt the weights of low-level visual features for automatic facial expression recognition. The weights are chosen such that the discrepancy between the facial expression recognition results obtained in terms of low-level features and the high-level semantic description is small. In the recognition phase, only the low-level features are used to classify the emotion of an input face image. The proposed semantic learning scheme provides a way to bridge the gap between the high-level semantic concept and the low-level features for automatic facial expression recognition. Experimental results show that the performance of the proposed method is excellent when compared with that of traditional facial expression recognition methods.
The remainder of this paper is organized as follows: Section 2 describes the proposed semantic-based facial representation using AHP in detail. The adaptation scheme for choosing the weights of low-level visual features by utilizing semantic clustering results is then presented in Section 3. Section 4 presents experimental results. Finally, conclusions are given in Section 5.

2. Semantic-based face representation using analytic hierarchy process
AHP, proposed by Saaty (1980), provides a systematic way to solve multi-criteria preference problems involving qualitative data and has been widely applied to a great diversity of areas (Cheng, Chou, Yang, & Chang, 2005; Lai, Trueblood, & Wong, 1999; Min, 1994). Pairwise comparisons are used in this decision-making process to form a reciprocal matrix by transforming qualitative data into crisp ratios, which makes the process simple and easy to handle. The reciprocal matrix is then solved by a weight-finding method to determine the criteria importance and alternative performance. The rationale for choosing AHP, despite the controversy over its rigidity, is that the problem of assigning semantic descriptions to the objects of an image can be formulated as a multi-criteria preference problem. As shown in Fig. 2, the two face images should be classified
Fig. 2. Two face images of the "happiness" emotion with different low-level visual features.
as the "happiness" emotion according to human assessment; however, the outer-lip movement of the two face images is very different. Semantic knowledge plays an important role in an automatic facial expression recognition system so that the system fairly meets user perception. It was shown in our previous work that the AHP provides a good way to evaluate the fitness of a semantic description used to represent an image object (Cheng et al., 2005).

2.1. A brief review of AHP
The process of AHP includes three stages of problem-solving: decomposition, comparative judgments, and synthesis of priority. The decomposition stage aims at the construction of a hierarchical network to represent a decision problem, with the top level representing overall objectives and the lower levels representing criteria, sub-criteria, and alternatives. With comparative judgments, users are requested to set up a comparison matrix at each level of the hierarchy by comparing pairs of criteria or sub-criteria. A scale of values ranging from 1 (indifference) to 9 (extreme preference) is used to express the user's preferences. Finally, in the synthesis-of-priority stage, each comparison matrix is solved by an eigenvector method to determine the criteria importance and alternative performance.
The following list provides a brief summary of all steps involved in AHP applications:
1. Specify a concept hierarchy of interrelated decision criteria to form the decision hierarchy.
2. For each hierarchy, collect input data by performing a pairwise comparison of the decision criteria.
3. Estimate the relative weightings of the decision criteria by using an eigenvector method.
4. Aggregate the relative weights up the hierarchy to obtain a composite weight which represents the relative importance of each alternative according to the decision-maker's assessment.

One major advantage of AHP is that it is applicable to the problem of group decision-making. In a group decision setting, each participant is required to set up the preference
of each alternative by following the AHP method, and the views of all participants are used to obtain an average weighting of each alternative.
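The eigenvector step in the summary above can be sketched in code. The following is an illustrative implementation, not the authors' code: it extracts priority weights from a Saaty-scale reciprocal matrix by power iteration, and the example comparison values are hypothetical.

```python
import numpy as np

def ahp_weights(M, iters=100):
    """Priority weights of a pairwise comparison matrix via its
    principal eigenvector, approximated by power iteration."""
    M = np.asarray(M, dtype=float)
    w = np.ones(M.shape[0]) / M.shape[0]
    for _ in range(iters):
        w = M @ w
        w = w / w.sum()  # renormalize so the weights sum to 1
    return w

# Hypothetical 3-criterion comparison on the 1-9 scale:
# criterion A vs B = 3, A vs C = 5, B vs C = 2.
M = [[1.0, 3.0, 5.0],
     [1/3, 1.0, 2.0],
     [1/5, 1/2, 1.0]]
w = ahp_weights(M)
```

For a perfectly consistent matrix the result equals the exact eigenvector; for mildly inconsistent human judgments it still yields a usable priority ranking.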
2.2. Semantic facial expression representation using AHP

We view a face image as a compound object containing multiple component objects, which are then described by several semantic descriptions according to a three-level concept hierarchy. The concept hierarchy, shown in Fig. 3, is used to assign the semantics of a facial expression to an input face image. According to this hierarchy, a method for applying semantic facial expression recognition to a face database using AHP is proposed in this study. For convenience of illustration, the classification hierarchy is abbreviated as the FEC hierarchy.
There are seven subjects in the top level of the FEC hierarchy. Each top-level subject, corresponding to a facial expression category, is divided into several sub-subjects corresponding to the parts of the face image controlling the human emotion, and each sub-subject is again decomposed into several Level 3 subjects corresponding to the MPEG-4 facial animation parameters used to describe a facial expression. A path from the root to each leaf node forms a semantic description, and multiple semantic descriptions are possible to interpret a facial object according to different aspects of the user's notion. A question arises naturally: is the weight of each path code of an image object equivalent? The answer is of course no. Some semantic descriptions are obviously more important than others for a specific image object. For example, the semantic description "happiness" is more important than the code with the semantic description "sadness" for the image object in Fig. 2(b), according to the authors' opinion.

Fig. 3. The concept hierarchy of the facial expression for interpreting an input face image.
2.3. Semantic-based facial expression representation

Assume the path codes of the semantic classification hierarchy are numbered from 1 to n. Given a face image I, the content of the image is represented by a semantic vector which is defined as

I = (s_1, s_2, \ldots, s_n), \quad \sum_{i=1}^{n} s_i = 1,   (1)

where s_i denotes the weighting of the ith path code. Although the value of n is large, in any vector representing an image the vast majority of the components will be zero, because the number of objects perceived in an image is generally small.
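Since most components s_i are zero, as noted above, the semantic vector lends itself to a sparse representation. The sketch below is purely illustrative; the path-code indices, weights, and the value of n are hypothetical, not taken from the paper.

```python
# Sparse form of the semantic vector of Eq. (1): only nonzero path-code
# weights are stored. All numbers here are hypothetical.
n = 126                                          # total number of path codes
semantic_vector = {3: 0.55, 17: 0.30, 42: 0.15}  # path code -> weight s_i

# The weights of Eq. (1) must sum to 1.
assert abs(sum(semantic_vector.values()) - 1.0) < 1e-9

def dense(sv, n):
    """Expand a sparse semantic vector to its full n-dimensional form."""
    v = [0.0] * n
    for i, s in sv.items():
        v[i] = s
    return v
```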
Assigning weights to the path codes in a semantic vector is a complex process. Weights could be assigned automatically using object recognition techniques; however, this problem is far from being totally solved. Instead, in this paper, weights are assigned using the analytical hierarchy process. Note that the numerical characteristic of a weight limits the possibility of assigning it directly through human assessment. One major advantage of using AHP to assign weights to the path codes is that users are only required to set the relative importance of several pairs of semantic descriptions, and the values of the weights are then calculated automatically.
The judgment of the importance of one semantic description over another can be made subjectively and converted into a numerical value using a scale of 1–9, where 1 denotes equal importance and 9 denotes the highest degree of favoritism. Table 1 lists the possible judgments and their representative numerical values.
The numerical values representing the judgments of the pairwise comparisons are arranged to form a reciprocal matrix for further calculations. The main diagonal of the matrix is always 1. Users are required to adopt a top-down approach in their pairwise comparisons. Given an image, the first step of the classification process using AHP is to choose the candidate classification codes and evaluate their relative importance by performing pairwise comparisons. For example, Fig. 4(a), containing a face image, is the target of
Table 1
Pairwise comparison judgments between semantic descriptions A and B

Judgment                                             Value
A is equally preferred to B                          1
A is equally to moderately preferred over B          2
A is moderately preferred over B                     3
A is moderately to strongly preferred over B         4
A is strongly preferred to B                         5
A is strongly to very strongly preferred over B      6
A is very strongly preferred over B                  7
A is very strongly to extremely preferred over B     8
A is extremely preferred to B                        9
Fig. 4. An example image to be classified: (a) the face image; (b) the corresponding reciprocal matrix with respect to (a) for calculating the local weightings of Level 1 semantic descriptions in interpreting the expressions of the image.
classification. In this case, the image can be classified into three Level 1 expression categories: "Neutral" N, "Happiness" H, and "Surprise" S. Fig. 4(b) is the corresponding Level 1 reciprocal matrix M_1 for judging the relative importance of the three semantic descriptions. The entries of M_1, with rows and columns ordered N, H, S, can be denoted as

M_1 = \begin{bmatrix} w_N/w_N & w_H/w_N & w_S/w_N \\ w_N/w_H & w_H/w_H & w_S/w_H \\ w_N/w_S & w_H/w_S & w_S/w_S \end{bmatrix},   (2)
where w_N, w_H, and w_S are the relative importance values (defined in Table 1) for the three semantic descriptions N, H, and S, respectively. The Level 1 weightings of the three semantic descriptions are then obtained from M_1.

Without loss of generality, let l, m, and n be the number of Level 1 semantic descriptions, the number of Level 2 semantic descriptions for each Level 1 description, and the number of Level 3 semantic descriptions for each Level 2 description, respectively. For each row of the Level 1 reciprocal matrix M_1, we can define a weighting measurement as

r_i^1 = \left( \frac{a_i^1}{a_1^1} \times \frac{a_i^1}{a_2^1} \times \cdots \times \frac{a_i^1}{a_l^1} \right)^{1/l}, \quad i = 1, \ldots, l,   (3)

where a_i^1 is the relative importance value of the ith Level 1 semantic description. The Level 1 weightings are then determined by

w_i^1 = r_i^1 \Big/ \sum_{j=1}^{l} r_j^1, \quad i = 1, \ldots, l.   (4)

Similarly, we can compute the Level 2 weightings w_{i,j}^2, j = 1, \ldots, m, for the ith Level 1 semantic description, and the Level 3 weightings w_{i,j,k}^3, k = 1, \ldots, n, for the ith Level 1 semantic description and the jth Level 2 semantic description. Finally, the entry p of the semantic vector defined in Eq. (1) is computed as

w_{p=(i-1)\times l+(j-1)\times m+k} = w_i^1 \times w_{i,j}^2 \times w_{i,j,k}^3.   (5)

Note that the number of reciprocal matrices for the image is l · m · n, and it is actually equal to the number of path codes used to describe the face image. It would be too cumbersome to classify an image using AHP if the value of l · m · n were very large. Fortunately, this problem does not occur in practice because most face images do not need a large number of path codes to describe them; most have at most 4–10 path codes according to our experience. Obviously, most of the weightings corresponding to semantic descriptions in the semantic vector are zero.

3. Proposed semantic-based automatic facial expression recognition

The proposed automatic facial expression recognition system is divided into two phases. In the learning phase, a training database is used to learn the structure of the classifier from the semantic vectors. The semantic vectors of the training samples obtained from AHP are first clustered in order to choose the proper weightings for the extracted low-level visual features, which are used to compute the similarity value between two face images in the recognition phase. Fig. 5 shows the block diagram of the proposed method; it will be discussed in detail later.

Fig. 5. Block diagram of the proposed semantic-based automatic facial expression recognition system.
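The geometric-mean row weighting of Eqs. (3) and (4) and the path-code composition of Eq. (5) can be sketched as follows. This is an illustrative reading of the formulas, not the authors' code, and the comparison matrix values are hypothetical.

```python
from math import prod

def row_weights(M):
    """Eqs. (3)-(4): geometric mean of each row of a reciprocal matrix
    (entry M[i][j] = a_i / a_j), normalized so the weights sum to 1."""
    l = len(M)
    r = [prod(M[i]) ** (1.0 / l) for i in range(l)]
    s = sum(r)
    return [ri / s for ri in r]

def path_code_weight(w1, w2, w3):
    """Eq. (5): the weight of a path code is the product of the Level 1,
    Level 2, and Level 3 weightings along its root-to-leaf path."""
    return w1 * w2 * w3

# Hypothetical Level 1 comparison of three expression categories:
M1 = [[1.0, 1/3, 1/5],
      [3.0, 1.0, 1/2],
      [5.0, 2.0, 1.0]]
w_level1 = row_weights(M1)
```

The geometric-mean weighting is a standard closed-form approximation of the eigenvector solution for reciprocal matrices and agrees with it exactly when the judgments are consistent.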
3.1. Low-level visual feature extraction
As mentioned above, the concept hierarchy shown in Fig. 3 plays the role of bridging the gap between high-level user perception and low-level visual features in the proposed facial expression recognition system. Actually, given an input face image, the possibility that the image is classified into a Level 3 subject of the concept hierarchy can be measured using a set of low-level visual features. For example, one can judge whether the positions of the eyebrows of an input face image are higher than those of the corresponding neutral face image by measuring the position changes of the eyebrows from the input image to the neutral image.
In the stage of feature extraction, 14 characteristic points in a face image are first detected; then the relative feature distances (a–l, n) among these points, shown in Fig. 6, are calculated. Note that the sizes of two output images of a camera for the same face are different if two different focal lengths are used; hence, the feature distances should be normalized in order to eliminate the effects of camera operations. The distance n between the inner corners of the eyes is used to normalize the feature distances, because the inner corners of the eyes are relatively stable to detect using image processing techniques. The normalized feature distances a'–l' are computed by

a' = a/n, \quad b' = b/n, \quad c' = c/n, \quad d' = d/n, \quad e' = e/n, \quad f' = f/n,
g' = g/n, \quad h' = h/n, \quad i' = i/n, \quad j' = j/n, \quad k' = k/n, \quad l' = l/n.   (6)
Finally, the 12 normalized feature distances, as low-level visual features, are further subtracted from the corresponding normalized feature distances of the common base image of neutral expression.
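Eq. (6) and the final subtraction step can be combined into one small routine. The sketch below is illustrative only; the raw pixel distances, the inner-eye-corner distance n, and the neutral-face values are all hypothetical.

```python
def low_level_features(distances, n, neutral_normalized):
    """Eq. (6) plus the neutral-face subtraction: divide the 12 raw
    feature distances a..l by the inner-eye-corner distance n, then
    subtract the corresponding normalized distances of the neutral
    face, yielding camera- and person-normalized features."""
    normalized = [d / n for d in distances]   # a' = a/n, ..., l' = l/n
    return [x - x0 for x, x0 in zip(normalized, neutral_normalized)]

# Hypothetical measurements (pixels) for distances a..l and for the
# already-normalized neutral face:
raw = [30, 28, 12, 12, 40, 8, 8, 15, 15, 22, 10, 25]
neutral = [0.9, 0.85, 0.35, 0.35, 1.2, 0.25, 0.25,
           0.45, 0.45, 0.65, 0.3, 0.8]
feats = low_level_features(raw, n=32.0, neutral_normalized=neutral)
```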
Fig. 6. The extracted low-level visual features: (a) the characteristic points in the face image; (b) the distances among the characteristic points for describing the muscle activities.
3.2. Semantic clustering for facial expressions
As mentioned above, the semantic information of each training face image is represented as a semantic vector. However, the entries of semantic vectors are mostly zero. The dimensionality of the semantic vectors should be reduced in a proper way in order to compact the semantic information. In this work, we use the widely used K-means clustering to cluster the semantic vectors of the training database into K semantic clusters, each of which carries different semantic information. The value of K would be 7 (corresponding to the "neutral", "happiness", "sadness", "fear", "anger", "surprise", and "disgust" expression categories) for automatic facial expression recognition if the number of sample faces, which cover all types of facial expressions, is large enough. On the other hand, the value of K could be less than 7 to reduce the effect of the small-sample-size problem, in which, for small training data, the within-class scatter matrix becomes singular and its inverse does not exist.
The semantic distance d(I_A, I_B) between two face images with semantic vectors I_A = (s_1^{(A)}, s_2^{(A)}, \ldots, s_N^{(A)}) and I_B = (s_1^{(B)}, s_2^{(B)}, \ldots, s_N^{(B)}) is defined as

d(A, B) = \sum_{i=1}^{N} \left[ s_i^{(A)} \left(1 - s_i^{(B)}\right) + s_i^{(B)} \left(1 - s_i^{(A)}\right) \right],   (7)

where N is the total number of semantic descriptions. The item s_i^{(A)}(1 - s_i^{(B)}) + s_i^{(B)}(1 - s_i^{(A)}) is actually the probability that objects I_A and I_B disagree with each other on the ith semantic description.
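Eq. (7) is a direct sum of per-description disagreement probabilities, so it translates to one line of code. The toy semantic vectors below are hypothetical one-hot examples, not data from the paper.

```python
def semantic_distance(sa, sb):
    """Eq. (7): sum over all semantic descriptions of the probability
    that the two images disagree on that description."""
    return sum(a * (1 - b) + b * (1 - a) for a, b in zip(sa, sb))

# Two identical one-hot semantic vectors are at distance 0; two images
# labeled with disjoint descriptions are at the maximum distance 2.
d_same = semantic_distance([1.0, 0.0, 0.0], [1.0, 0.0, 0.0])  # -> 0.0
d_diff = semantic_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])  # -> 2.0
```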
For the sake of easy reference, the semantic clustering using the K-means algorithm is described below:
Step 1: For each semantic cluster S_k, k = 1, \ldots, K, random initial semantic vectors are chosen as the cluster representatives I_k.
Step 2: For every semantic vector I, the difference between I and I_k, k = 1, \ldots, K, is evaluated using Eq. (7). If d(I, I_i) \leq d(I, I_j) for all j, then I is assigned to cluster S_i.
Step 3: The representative of each cluster S_k is recomputed as

I_k = \frac{1}{M_k} \sum_{m=1}^{M_k} I_m,   (8)

where I_m, m = 1, \ldots, M_k, are the semantic vectors belonging to cluster S_k.
Step 4: If the new I_k is equal to the old one, then stop; otherwise go to Step 2.

3.3. Weighting adaptation for low-level visual features

Once the semantic clusters are obtained, 12 low-level visual features (cf. Fig. 6) are extracted from the content of each database image. Given a query face image, users should not be required to set the weighting of each feature type in order to recognize semantically relevant expressions. Unfortunately, the low-level features and the high-level semantic concepts do not have an intuitive relationship, and hence the problem of weighting setting is not trivial to solve. This limits the recognition accuracy of the automatic facial expression recognition system. In this paper, we propose a method for automatically determining the weightings of the low-level visual features for each semantic cluster.

A winnow-like mistake-driven learning algorithm is used to learn the discriminant functions g(x) defining the decision boundaries of the semantic clusters in terms of low-level visual features. The paradigm followed in the literature for learning from labeled and unlabeled data is based on inducing classifiers from the semantic vectors of the training samples. The induced classifiers are then used to predict the classes of patterns in terms of low-level visual features, such that the classification results obtained in terms of low-level visual features agree with those obtained in terms of high-level semantic vectors.

Let F_q = (f_1^{(q)}, f_2^{(q)}, \ldots, f_m^{(q)}) be the low-level visual features of an input image q. Given a semantic cluster S_i containing n images, the n images can be ranked with respect to q in terms of semantic information using (7) as the ordered set S_i = (I_1, I_2, \ldots, I_n). The proposed weighting adaptation algorithm aims at choosing weightings for the low-level visual features such that the same ordered set S_i is obtained when answering q in terms of low-level visual features. More concretely, we can define a cost function J(\vec{a}) to be minimized as

J(\vec{a}) = \sum_{j=1}^{n} \left| t_j^{(S)} - t_j^{(L)} \right|,   (9)

where \vec{a} is the weighting vector for the low-level visual features, and t_j^{(S)} and t_j^{(L)} are the ranks of the jth image with respect to a query image in terms of high-level semantic vectors and low-level visual features, respectively. In addition, the distance between q and an image I in S_i is defined as

D(q, I) = \sum_{j=1}^{m} a_j \left( f_j^{(q)} - f_j^{(I)} \right)^2.   (10)

The proposed learning algorithm uses a set of weak learners, each working on a single feature at a time. The value of a_k corresponding to the kth feature should decrease if J_k(a_k) > J(\vec{a}), where J_k(a_k) is the cost function using only the kth feature of q. The proposed learning algorithm is briefly described as follows:

Algorithm. Weighting adaptation
Input: a semantic cluster S_i containing n images and the number of iterations T.
Output: a weighting vector \vec{a} for S_i.
Method:
(1) Initialize the weights a_k^{(0)} = 1/m, k = 1, \ldots, m.
(2) Do for t = 1, \ldots, T:
(3) For each image q in S_i do
(3.1) Answer q and rank the n images in S_i using (7).
(3.2) Answer q and rank the n images in S_i using (10), and compute the value of J(\vec{a}) using (9).
(3.3) Do for k = 1, \ldots, m:
(3.3.1) Answer q and rank the n images in S_i using the kth low-level feature only, and compute the value of J_k(a_k^{(t)}).
(3.3.2) Update the weight factor a_k^{(t)} to obtain a_k^{(t+1)}.
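The ranking costs of Eqs. (9) and (10) can be sketched as below. The exact multiplicative update rule for a_k^{(t+1)} is cut off in this copy of the paper, so the sketch stops at computing J and the per-feature costs J_k; a winnow-style halving of a_k when J_k(a_k) > J(\vec{a}), followed by renormalization, would be one plausible completion, but that specific rule is our assumption rather than the authors'. All data in the demo are hypothetical.

```python
import numpy as np

def ranks(query, images, weights):
    """Rank the images by the weighted squared distance of Eq. (10);
    returns each image's rank position (0 = closest to the query)."""
    d = np.array([float((weights * (query - img) ** 2).sum()) for img in images])
    order = np.argsort(d)
    r = np.empty(len(d), dtype=int)
    r[order] = np.arange(len(d))
    return r

def cost(semantic_ranks, low_level_ranks):
    """Eq. (9): total rank disagreement J between the semantic ordering
    and the low-level ordering of the cluster's images."""
    return int(np.abs(np.asarray(semantic_ranks) - np.asarray(low_level_ranks)).sum())

def per_feature_costs(query, images, semantic_ranks, m):
    """J_k for k = 1..m: the cost when only the kth feature is active."""
    out = []
    for k in range(m):
        mask = np.zeros(m)
        mask[k] = 1.0  # keep only feature k, as in step (3.3.1)
        out.append(cost(semantic_ranks, ranks(query, images, mask)))
    return out
```

With these pieces, one outer loop over the images of a cluster reproduces steps (3.1)-(3.3.1) of the algorithm; only the weight-update formula itself remains unspecified by this copy of the text.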