
Article title

Document Clustering - Concepts, Metrics and Algorithms

Journal title

International Journal of Electronics and Telecommunications




vol. 57


No 3


Division of the Polish Academy of Sciences

Technical Sciences


Polish Academy of Sciences Committee of Electronics and Telecommunications




DOI: 10.2478/v10177-011-0036-5 ; eISSN 2300-1933 (since 2013) ; ISSN 2081-8491 (until 2012)


International Journal of Electronics and Telecommunications; 2011; vol. 57; No 3

