Topic Modeling for Personalized Recommendation of Volatile Items
16 pages
English


Description

Level: Higher education
Topic Modeling for Personalized Recommendation of Volatile Items
Maks Ovsjanikov (1) and Ye Chen (2)
(1) Stanford University, (2) Microsoft Corporation

Abstract. One of the major strengths of probabilistic topic modeling is the ability to reveal hidden relations via the analysis of co-occurrence patterns on dyadic observations, such as document-term pairs. However, in many practical settings, the extreme sparsity and volatility of co-occurrence patterns within the data, when the majority of terms appear in a single document, limits the applicability of topic models. In this paper, we propose an efficient topic modeling framework in the presence of volatile dyadic observations when direct topic modeling is infeasible. We show both theoretically and empirically that often-available unstructured and semantically-rich meta-data can serve as a link between dyadic sets, and can allow accurate and efficient inference. Our approach is general and can work with most latent variable models, which rely on stable dyadic data, such as pLSI, LDA, and GaP. Using transactional data from a major e-commerce site, we demonstrate the effectiveness as well as the applicability of our method in a personalized recommendation system for volatile items. Our experiments show that the proposed learning method outperforms the traditional LDA by capturing more persistent relations between dyadic sets of wide and practical significance.

  • volatile items

  • item

  • standard technique

  • topic modeling

  • semantically-rich meta

  • can work

  • query pairs

  • generative model


Topics

Information

Published by
Number of reads: 19
Language: English

Excerpt

Topic Modeling for Personalized Recommendation of Volatile Items

Maks Ovsjanikov (1) and Ye Chen (2)
(1) Stanford University, maks@stanford.edu
(2) Microsoft Corporation, yec@microsoft.com

Abstract. One of the major strengths of probabilistic topic modeling is the ability to reveal hidden relations via the analysis of co-occurrence patterns on dyadic observations, such as document-term pairs. However, in many practical settings, the extreme sparsity and volatility of co-occurrence patterns within the data, when the majority of terms appear in a single document, limits the applicability of topic models. In this paper, we propose an efficient topic modeling framework in the presence of volatile dyadic observations when direct topic modeling is infeasible. We show both theoretically and empirically that often-available unstructured and semantically-rich meta-data can serve as a link between dyadic sets, and can allow accurate and efficient inference. Our approach is general and can work with most latent variable models, which rely on stable dyadic data, such as pLSI, LDA, and GaP. Using transactional data from a major e-commerce site, we demonstrate the effectiveness as well as the applicability of our method in a personalized recommendation system for volatile items. Our experiments show that the proposed learning method outperforms the traditional LDA by capturing more persistent relations between dyadic sets of wide and practical significance.

1 Introduction

Probabilistic topic models have emerged as a natural, statistically sound method for inferring hidden semantic relations between terms in large collections of documents, e.g., [4, 12, 11]. Most topic-based models start by assuming that each term in a given document is generated from a hidden topic, and a document can be characterized as a probability distribution over the set of topics. Thus, learning high level semantic relations between documents and terms can be reduced to learning the topic models from a large corpus of documents. At the core of many learning methods for topic-based models lies the idea that terms that often occur together are likely to be explained by the same topic. This is a powerful idea that has been successfully applied in a wide range of fields including computer vision, e.g., [9], and shape retrieval [14].

One of the limitations of direct topic modeling, however, is that co-occurrence patterns can be very sparse and are often volatile. For example, terms cannot be readily substituted by images and documents by websites in the LDA model, since the vast majority of images will only occur in a single website. Nevertheless,
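
The generative assumption stated in the introduction (each term is drawn from a hidden topic, and each document is a probability distribution over topics) and the sparsity problem it runs into can be illustrated with a small Python sketch. This is not the authors' framework: the function names, the toy topic-term table, and the Dirichlet prior on document-topic mixtures are assumptions chosen for illustration in the spirit of LDA.

```python
import random
from collections import Counter


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, doc_len, topic_term, alpha, rng):
    """Generate documents under an LDA-style assumption (illustrative only).

    topic_term[z][w] is the probability of term w under hidden topic z.
    alpha is a hypothetical symmetric Dirichlet parameter.
    """
    num_topics = len(topic_term)
    vocab = list(range(len(topic_term[0])))
    corpus = []
    for _ in range(num_docs):
        # Each document gets its own distribution over topics.
        theta = sample_dirichlet([alpha] * num_topics, rng)
        doc = []
        for _ in range(doc_len):
            z = rng.choices(range(num_topics), weights=theta)[0]  # hidden topic
            w = rng.choices(vocab, weights=topic_term[z])[0]      # observed term
            doc.append(w)
        corpus.append(doc)
    return corpus


def single_document_fraction(corpus):
    """Fraction of distinct terms that occur in exactly one document."""
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    if not doc_freq:
        return 0.0
    return sum(1 for c in doc_freq.values() if c == 1) / len(doc_freq)


rng = random.Random(0)
# Two toy topics over a six-term vocabulary (made-up numbers).
topic_term = [
    [0.3, 0.3, 0.4, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],
]
stable = generate_corpus(5, 20, topic_term, 0.5, rng)
print("stable corpus:", single_document_fraction(stable))

# Volatile case: every "term" (e.g. an item) appears in a single document,
# so there is no co-occurrence signal across documents.
volatile = [[f"item-{d}-{i}" for i in range(3)] for d in range(4)]
print("volatile corpus:", single_document_fraction(volatile))  # 1.0 by construction
```

When the fraction of single-document terms approaches 1, as in the volatile case, terms never co-occur across documents, which is the situation the excerpt describes as limiting direct topic modeling.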