Intelligent and Distributed Computing Laboratory

Measuring Social Tag Confidence: Is It A Good or Bad Tag?

Publication:
  • Conference: The 12th International Conference on Web-Age Information Management (WAIM 2011)
  • Location: Wuhan, China
  • Date: September 14-16, 2011
  • Pages: 94-105
Abstract:

Social tagging is an increasingly popular way to describe the latent semantic information of web resources, and it is widely used to improve the performance of information retrieval systems. However, the quality of social tags varies significantly, because anyone on the web can annotate resources freely. As a consequence, how to measure the quality of social tags (referred to as social tag confidence) becomes an important issue. In this paper, we propose a statistical model that measures the confidence of social tags by combining three attributes of a social tag: the web resource, the tag, and the tagging user. To evaluate the effectiveness of our model, we perform two experiments on datasets crawled from del.icio.us. Experimental results show that our model outperforms other approaches with respect to Normalized Discounted Cumulated Gain (NDCG). In addition, the F1 measure of tagged web page clustering also increases when our model is used to filter out noisy social tags with low tag confidence.
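
For readers unfamiliar with the NDCG metric named in the abstract, the following is a minimal sketch of one common formulation (linear gain, log2 position discount). The abstract does not state which NDCG variant the paper uses, so the formula choice, the function names `dcg` and `ndcg`, and the relevance grades below are all illustrative assumptions rather than the authors' setup.

```python
import math


def dcg(relevances):
    """Discounted cumulated gain of graded relevance scores in ranked order."""
    # The item at 0-based position i is discounted by log2(i + 2),
    # so the top-ranked item is counted at full value.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ranked = list(relevances)[:k]
    ideal = sorted(relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    # A list with no relevant items has no meaningful ideal; report 0.0.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical relevance grades (0 = noisy tag, 3 = highly relevant tag),
# listed in the order two imaginary confidence models ranked the same tags.
ranking_a = [3, 2, 3, 0, 1]  # good tags mostly near the top
ranking_b = [0, 1, 2, 3, 3]  # good tags pushed to the bottom

if __name__ == "__main__":
    print(f"NDCG of ranking_a: {ndcg(ranking_a):.4f}")
    print(f"NDCG of ranking_b: {ndcg(ranking_b):.4f}")
```

Under this formulation, a ranking that places high-quality tags first scores closer to 1, which is the sense in which a tag confidence model can be compared against other approaches.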

Keywords:
  • tag confidence; semantic similarity; clustering; information retrieval