Scaling Pair-Wise Similarity-Based Algorithms in Tagging Spaces

Damir Vandic, Flavius Frasincar, Frederik Hogenboom

Research output: Chapter/Conference proceeding, Conference proceeding, Academic, peer-review

2 Citations (Scopus)


Users of Web tag spaces, e.g., Flickr, find it difficult to get adequate search results due to syntactic and semantic tag variations. In most approaches that address this problem, the cosine similarity between tags plays a major role. However, the use of this similarity introduces a scalability problem as the number of similarities that need to be computed grows quadratically with the number of tags. In this paper, we propose a novel algorithm that filters insignificant cosine similarities in linear time complexity with respect to the number of tags. Our approach shows a significant reduction in the number of calculations, which makes it possible to process larger tag data sets than ever before. To evaluate our approach, we used a data set containing 51 million pictures and 112 million tag annotations from Flickr.
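
The quadratic growth mentioned in the abstract can be illustrated with a minimal sketch. This is not the filtering algorithm proposed in the paper, only the naive all-pairs baseline it aims to avoid. It assumes each tag is represented as a sparse vector (a dict keyed by hypothetical picture identifiers). The function names `cosine`, `all_pair_similarities`, and `pair_count` are illustrative and do not come from the source.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def all_pair_similarities(vectors):
    """Naive baseline: one cosine computation per unordered tag pair."""
    return {
        (a, b): cosine(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }


def pair_count(n):
    """Number of unordered pairs among n tags: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical toy data: tag -> {picture id: occurrence count}
tag_vectors = {
    "cat": {"p1": 1, "p2": 1},
    "cats": {"p1": 1, "p2": 1},
    "car": {"p3": 1},
}
similarities = all_pair_similarities(tag_vectors)
```

With three tags the baseline computes three similarities. `pair_count` shows how quickly this number grows: doubling the number of tags roughly quadruples the number of pairs, which is why filtering out insignificant pairs before computing them matters at Flickr scale.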
Original language: English
Title of host publication: Twelfth International Conference on Web Engineering (ICWE 2012)
Editors: M. Brambilla, T. Tokuda, R. Tolksdorf
Number of pages: 15
Publication status: Published - 23 Jul 2012

Research programs

  • EUR ESE 32

