The Heterogeneous P-Median Problem for Categorization Based Clustering

Psychometrika - Volume 77 - Pages 741-762 - 2012
Simon J. Blanchard1, Daniel Aloise2, Wayne S. DeSarbo3
1McDonough School of Business, Georgetown University, Washington, USA
2Department of Computer Engineering and Automation, Universidade Federal do Rio Grande do Norte, Natal, Brazil
3Department of Marketing, Smeal College of Business, Pennsylvania State University, University Park, USA

Abstract

The p-median problem offers an alternative to centroid-based clustering algorithms for identifying unobserved categories. However, existing p-median formulations typically require aggregating the data into a single proximity matrix, which masks respondent heterogeneity. A proposed three-way formulation of the p-median problem explicitly considers heterogeneity by identifying groups of individual respondents who perceive similar category structures. Three heuristics for the heterogeneous p-median (HPM) problem are developed and then illustrated in a consumer psychology context, using a sample of undergraduate students who performed a sorting task of major U.S. retailers, as well as through a Monte Carlo analysis.
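
To make the contrast with centroid-based clustering concrete, here is a minimal sketch of the classical single-matrix p-median problem. It is not one of the paper's HPM heuristics. It assumes a tiny, hypothetical dissimilarity matrix `D` and solves the problem by exhaustive enumeration, which is practical only for very small instances. The function name `p_median` and the data are illustrative assumptions. The point it shows is that each cluster is represented by an actual object (a median, or exemplar) rather than by a computed centroid.

```python
from itertools import combinations

def p_median(dissim, p):
    """Exhaustively solve a tiny p-median instance.

    dissim: square list of lists of dissimilarities between n objects.
    Returns (best_cost, medians, assignment), where assignment[i] is the
    index of the median (an actual object) that object i is assigned to.
    """
    n = len(dissim)
    best = None
    for medians in combinations(range(n), p):
        # Each object is assigned to its closest selected median.
        assignment = [min(medians, key=lambda j: dissim[i][j]) for i in range(n)]
        cost = sum(dissim[i][assignment[i]] for i in range(n))
        # Keep the first candidate set that attains the lowest total cost.
        if best is None or cost < best[0]:
            best = (cost, medians, assignment)
    return best

# Hypothetical dissimilarities among five objects
# (objects 0, 1, 2 are mutually close; objects 3, 4 are mutually close).
D = [
    [0, 1, 2, 9, 9],
    [1, 0, 1, 9, 9],
    [2, 1, 0, 9, 9],
    [9, 9, 9, 0, 1],
    [9, 9, 9, 1, 0],
]

cost, medians, assignment = p_median(D, p=2)
print(cost, medians, assignment)
```

In this sketch the input is one aggregated matrix, which is exactly the setting the abstract describes as masking respondent heterogeneity; a three-way formulation would instead work with one such matrix per respondent.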
