used by several million users. In this work, we focus on
LiveJournal and derivative services such as
GreatestJournal, DeadJournal, and JournalFen based on
the same open-source server code. At the time of this
writing, there are over 8.5 million LiveJournal accounts,
of which over 2.5 million are active; these are either user
accounts, associated with one or a small number of
individuals, or communities (each a forum for multiple
users similar to MSN Communities or Yahoo! Groups).
The embeddability, syndication, OpenID integration, and
metadata features of LiveJournal make it a rich source of
structured data about these users and communities, and
the interrelationships among them.
Friendship in LiveJournal is an asymmetric relation
between two accounts that can be represented as an
edge in a directed graph. Either the start vertex u or the
end vertex v may denote either a user account or a
community account, though community-to-community
links are not used. Table 1 lists the categories of links
and specific link types. Community-to-user links are of
three independent types: “member”, “posting access”, and
“maintainer” (post and membership moderation). Of
these relations, only membership is requested by users or
invited by maintainers; the rest are privileges granted by
maintainers.
Table 1. Types of links in the blog service LiveJournal.
Start      End        Link Denotes
User       User       Trust or friendship
User       Community  Readership or subscribership
Community  User       Membership, posting access, maintainer
Community  Community  Obsolete
Thus, a reciprocal link between a user and a community
means that the user subscribes to the community and is an
accepted member of the community. Subscriptions are
listed in the “Friends: Communities” section of the user’s
page and in a list titled “Watched By” in the community’s
page. Links from user u to v are listed in the “Friends”
list of u and in an optionally displayed “Friends Of” list of
v. This list can be partitioned into reciprocal and non-
reciprocal sublists for a user u:
Mutual Friends: { v | (v, u) ∈ E ∧ (u, v) ∈ E }
Also Friend Of: { v | (v, u) ∈ E ∧ (u, v) ∉ E }
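The two set definitions above can be sketched directly in code. This is a minimal illustration, assuming the edge set E is held as a Python set of (start, end) pairs; the function name `partition_friends_of` and the account names are hypothetical and not part of LiveJournal.

```python
def partition_friends_of(edges, u):
    """Split the "Friends Of" list of u into reciprocal and non-reciprocal parts.

    edges is assumed to be a set of (start, end) pairs, one per directed link.
    """
    friends = {v for (a, v) in edges if a == u}      # accounts u links to
    friends_of = {a for (a, v) in edges if v == u}   # accounts linking to u
    mutual = friends_of & friends                    # (v, u) in E and (u, v) in E
    also_friend_of = friends_of - friends            # (v, u) in E and (u, v) not in E
    return mutual, also_friend_of


# Hypothetical toy graph: bob reciprocates alice, carol only watches alice.
E = {("alice", "bob"), ("bob", "alice"), ("carol", "alice"), ("alice", "dave")}
mutual, also_friend_of = partition_friends_of(E, "alice")
```

Here `mutual` contains only bob and `also_friend_of` contains only carol; dave appears in neither because he is in alice's "Friends" list but not in her "Friends Of" list.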
The social network for the LiveJournal user base consists
of many connected components. There are a few source
vertices corresponding to users that link to friends but
have no reciprocated friendships. Many of these are
aggregator accounts created for reading RSS or other
users’ blog entries. Additionally, there are sink vertices
corresponding to users or communities watched by others,
but who have named no friends. Some of these are
channels for announcement or dissemination of creative
work.
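As a rough illustration of the source and sink vertices described above, the following sketch uses the standard graph-theoretic definitions (a source has indegree zero, a sink has outdegree zero). This is an assumption: it is slightly stricter than "no reciprocated friendships", and the function and account names are hypothetical.

```python
from collections import defaultdict


def sources_and_sinks(edges):
    """Return (sources, sinks) for an iterable of directed (start, end) links."""
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    vertices = set(indeg) | set(outdeg)
    sources = {x for x in vertices if indeg[x] == 0}   # link out, nobody links in
    sinks = {x for x in vertices if outdeg[x] == 0}    # watched, but name no friends
    return sources, sinks


# Hypothetical example: an aggregator account and an announcement channel.
edges = [("rss_reader", "alice"), ("alice", "bob"),
         ("bob", "alice"), ("alice", "announce")]
sources, sinks = sources_and_sinks(edges)
```

In this toy graph the aggregator account is the only source and the announcement channel is the only sink.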
2.2 Collaborative, Structural, and Content-Based
Link Recommendation
We now discuss the link recommendation problem, the
available data, and some previous approaches. One social
function of many weblog services is to introduce people
to new friends and communities and to provide content
aggregators and communication media among people who
know each other. The basis for these introductions is
often the list of interests reported by a user or community
maintainer.
LiveJournal collects all of the abovementioned
information on the social network structure, along with
user interests, self-reported personal information, and
descriptive statistics about posting history in a user
information page for each account. We seek to mine this
data in order to provide improved link recommendations.
Our hypothesis is that recommendations based only on
shared interests can be greatly improved using
information about the graph structure. For instance, local
structural features such as whether a link already exists
from the candidate friend to the recommender system
user, how many mutual friends the user and candidate
share, and the degrees of the user and candidate all provide
some supporting evidence for a link recommendation.
Additionally, search-based graph analysis can yield
information about the shortest alternate path in
friendships from the user to the candidate, and vice versa.
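The structural features listed in this paragraph can be sketched as follows. This is an illustrative reading rather than the exact feature extractor used in this work: we assume "mutual friends" means accounts that both the user and the candidate link to, and that the "alternate" path is the shortest directed path that ignores a direct link. All function and feature names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Return successor and predecessor maps for directed (start, end) links."""
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
        pred.setdefault(v, set()).add(u)
    return succ, pred


def alternate_distance(succ, src, dst):
    """BFS length of the shortest src -> dst path that skips the direct edge."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in succ.get(node, ()):
            if node == src and nxt == dst:
                continue                      # ignore the direct link itself
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None                               # no alternate path exists


def structural_features(edges, user, cand):
    succ, pred = build_adjacency(edges)
    return {
        "backlink": user in succ.get(cand, set()),   # candidate -> user exists
        "common_friends": len(succ.get(user, set()) & succ.get(cand, set())),
        "user_outdegree": len(succ.get(user, set())),
        "user_indegree": len(pred.get(user, set())),
        "cand_outdegree": len(succ.get(cand, set())),
        "cand_indegree": len(pred.get(cand, set())),
        "forward_distance": alternate_distance(succ, user, cand),
        "backward_distance": alternate_distance(succ, cand, user),
    }


# Hypothetical toy graph.
edges = {("u", "a"), ("a", "c"), ("u", "b"), ("b", "c"), ("c", "u"), ("c", "a")}
features = structural_features(edges, "u", "c")
```

For this graph the candidate c already links back to u, the pair shares one common friend (a), and u reaches c in two hops, while c has no alternate path back to u once the direct link is ignored.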
The long-term goal of this research is to explore ways in
which contextual information can be combined with
graph structure or descriptive graph features to obtain an
enriched model for making weblog-based link
recommendations. Examples of this information include
user interests, preferences and constraints (e.g., desired
ranges or limits for the number of friends). Mechanisms for
combining structural and contextual information include
filtering candidate sets by graph proximity, counting the
number of mutual friends sharing certain interests, and
normalizing weights of shared interests based on dynamic
itemset frequency within a certain graph radius.
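Two of these mechanisms can be sketched under simplifying assumptions. Here interests are single items rather than itemsets, the set of accounts within the chosen graph radius (`nearby`) is assumed to be precomputed, and the inverse-frequency weight 1 / (1 + count) is our illustrative choice, not a formula fixed by this paper. All names are hypothetical.

```python
from collections import Counter


def mutual_friends_sharing(succ, interests, user, cand, interest):
    """Count accounts that both user and cand link to and that list `interest`."""
    common = succ.get(user, set()) & succ.get(cand, set())
    return sum(1 for f in common if interest in interests.get(f, set()))


def weighted_shared_interests(interests, nearby, user, cand):
    """Sum shared interests, down-weighting those common among nearby accounts."""
    freq = Counter(i for acct in nearby for i in interests.get(acct, set()))
    shared = interests.get(user, set()) & interests.get(cand, set())
    # Assumed weighting: rarer interests in the neighborhood count for more.
    return sum(1.0 / (1 + freq[i]) for i in shared)


# Hypothetical toy data.
succ = {"u": {"a", "b"}, "c": {"a"}}
interests = {
    "u": {"chess", "jazz"},
    "a": {"chess"},
    "b": {"chess"},
    "c": {"chess", "jazz"},
}
nearby = {"a", "b", "c"}   # accounts assumed to lie within the chosen radius of u
count = mutual_friends_sharing(succ, interests, "u", "c", "chess")
score = weighted_shared_interests(interests, nearby, "u", "c")
```

In this example "chess" is listed by all three nearby accounts and "jazz" by only one, so the shared interest in jazz contributes twice as much to `score` as the shared interest in chess.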
Initially, we consider a predominantly collaborative and
structural approach to recommendation: we hypothesize
that users are likely to prefer links similar to extant ones
and therefore, in this paper, generate candidates from
within a specified radius in the social network. This is a
form of collaboration in that the paths are formed by other
users’ choices of friends. Statistics such as the indegree
of a vertex, which is the length of the user’s “Friends Of” list,
are similarly collaborative in nature. We also use counts
of mutual interests and mutual friends (structural
recommendation). In the next section, we discuss the
acquisition of data and experiment design for this
recommendation problem.
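The candidate generation step described above can be sketched as a breadth-first search bounded by the radius, followed by computation of the simple counts mentioned in this paragraph. This is a minimal sketch under our own assumptions: candidates are accounts at distance 2 up to the radius (so existing friends are excluded), and "mutual friends" means accounts that both the user and the candidate link to. The function names and data are hypothetical.

```python
from collections import deque


def candidates_within_radius(succ, user, radius):
    """Accounts at directed distance 2..radius from user."""
    dist = {user: 0}
    queue = deque([user])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue                          # do not expand past the radius
        for nxt in succ.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {v for v, d in dist.items() if d >= 2}


def candidate_features(succ, interests, user, radius):
    """Map each candidate to its indegree and counts of shared interests/friends."""
    indegree = {}
    for targets in succ.values():
        for v in targets:
            indegree[v] = indegree.get(v, 0) + 1
    rows = {}
    for c in candidates_within_radius(succ, user, radius):
        rows[c] = {
            "indegree": indegree.get(c, 0),   # length of c's "Friends Of" list
            "mutual_interests": len(interests.get(user, set())
                                    & interests.get(c, set())),
            "mutual_friends": len(succ.get(user, set()) & succ.get(c, set())),
        }
    return rows


# Hypothetical toy graph: d lies three hops from u, outside radius 2.
succ = {"u": {"a", "b"}, "a": {"c"}, "b": {"c"}, "c": {"a", "d"}}
interests = {"u": {"chess", "jazz"}, "c": {"jazz", "film"}}
rows = candidate_features(succ, interests, "u", 2)
```

With radius 2 only c is proposed as a candidate; raising the radius to 3 would also admit d.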