Eagle-Lab

 

Overview of BlogRecommender (Research of Recommender System in Blogspace) Project

Introduction
A blog, short for weblog, is a web page that serves as a publicly accessible personal journal. With their rapid growth in recent years, especially during the Web 2.0 revolution, blogs have become a prevailing type of personal media on the Internet.

However, current blog systems suffer from two main problems. First, only a limited percentage of bloggers (the maintainers of blogs) keep publishing new posts after an initial period, which leaves a considerable amount of Internet space wasted. In other words, blog systems seem to have lost their attraction to bloggers. Second, blog service providers (BSPs) are still searching for an appropriate profit model. So how can we help blog systems stay attractive to bloggers while keeping BSPs from bankruptcy? Our proposed project is an attempt to solve these two problems.

 

Project Overview
The figure below shows the system architecture of our project.

         

Text Analysis Subsystem
In this subsystem, we analyze posts published on blog pages using several data mining techniques, including text classification, topic detection, and opinion mining.
Blogger Analysis Subsystem
In this subsystem, we model each blogger based on the analysis results of his or her published posts and quantify the blogger's interests using the Vector Space Model. With each blogger's interests in hand, we measure the similarity between blogger models and then uncover the so-called Blog Groups: groups of bloggers who have similar interests. We regard the whole blogspace as a miniature of real-world society, so mining the interests of individuals and Blog Groups in the blogspace simulates social network analysis in the real world.
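The modeling and grouping steps of the Blogger Analysis Subsystem can be illustrated with a minimal sketch. It is not the project's actual implementation. It assumes raw term counts as interest weights, cosine similarity as the similarity measure, and a fixed threshold for linking bloggers into groups. The function names and the threshold value are hypothetical.

```python
import math
from collections import Counter


def interest_vector(posts):
    """Build a term-frequency interest vector from a blogger's posts.

    Assumption: whitespace tokenization and raw counts; a real system
    would likely use classified topics or TF-IDF weights instead.
    """
    counts = Counter()
    for post in posts:
        counts.update(post.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def blog_groups(vectors, threshold=0.5):
    """Group bloggers whose vectors are linked by similarity >= threshold.

    Groups are the connected components of the similarity graph,
    computed with a small union-find structure.
    """
    names = sorted(vectors)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine_similarity(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


posts = {
    "alice": ["python code tips", "python testing"],
    "bob": ["python code review"],
    "carol": ["garden roses soil"],
}
vectors = {name: interest_vector(p) for name, p in posts.items()}
print(blog_groups(vectors))
```

In this toy data, alice and bob share the terms "python" and "code" and end up in one group, while carol shares no terms with either and forms a group of her own.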
Recommendation Subsystem
Finally, the Recommendation Subsystem recommends information to bloggers and visitors based on the results of blogger modeling and the text analysis of published posts. Having investigated the motivations of bloggers and visitors and several potential profit strategies for BSPs, we chose Blog Groups, Info (which stands for news reports, personal reviews published on forums and blogs, etc.) and Advertisements as the main recommended content in our current project.

Research Issues

1. Blogger Modeling – Characterize the interests of bloggers through the analysis of published posts and comments.
2. Blog Group Mining – Mine groups of bloggers who have similar interests based on the similarity measures among blogger models.
3. Opinion Mining – Analyze opinions expressed in blogs and extract the related topics. Advertisements are promoted according to the polarity of bloggers' opinions on the mentioned topics.
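
The idea behind the third issue, promoting advertisements according to opinion polarity, can be sketched as follows. This is only an illustration under stated assumptions, not the project's method. It uses a tiny hand-made sentiment lexicon and scores a topic by the opinion words that appear in the same sentence. The word lists, function names, and advertisement strings are all hypothetical.

```python
# Hypothetical sentiment lexicon; a real system would use a much larger
# resource or a trained classifier.
POSITIVE = {"love", "great", "excellent", "enjoy"}
NEGATIVE = {"hate", "awful", "broken", "disappointing"}


def topic_polarity(sentences, topics):
    """Score each topic by the opinion words co-occurring in a sentence.

    Assumption: whitespace tokenization without punctuation handling.
    """
    scores = {t: 0 for t in topics}
    for sentence in sentences:
        words = set(sentence.lower().split())
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        for t in topics:
            if t in words:
                scores[t] += score
    return scores


def select_ads(scores, ads_by_topic):
    """Keep only advertisements for topics with positive polarity."""
    return [
        ads_by_topic[t]
        for t in sorted(scores)
        if scores[t] > 0 and t in ads_by_topic
    ]


sentences = ["I love my new camera", "The phone battery is awful"]
scores = topic_polarity(sentences, ["camera", "phone"])
ads = select_ads(scores, {"camera": "lens sale", "phone": "phone case"})
print(scores, ads)
```

Here the blogger is positive about the camera and negative about the phone, so only the camera-related advertisement is selected.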