• The data set contains 30,000 Twitter users' followee/follower relations and event timetables:

  • Followee links [followee.tar.gz]
    - Each filename corresponds to one Twitter userId
    - Each line of a file corresponds to a followee link

  • Follower links [follower.tar.gz]
    - Each filename corresponds to one Twitter userId
    - Each line of a file corresponds to a follower link

  • Raw Tweets [rawTweet.tar.gz]
    - Each filename corresponds to one Twitter userId
    - Each line of a file specifies one tweet (text, time) posted by that user

  • Events [event.tar.gz]
    - The format is "EventType EventTime Content"
    - EventType = 0: Session starts, EventType = 1: Post a tweet, EventType = 2: Session ends
    - EventTime is the time elapsed since the previous event, in milliseconds. The first EventTime is measured from Feb. 1, 2010.
    - Content = null for EventType = 0 or 2, Content = tweet for EventType = 1
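
Because each EventTime is an offset from the previous event, absolute timestamps have to be rebuilt by accumulating the offsets. The sketch below shows one way this might be done in Python. It rests on assumptions the description above does not state: fields are taken to be whitespace-separated, Content may itself contain spaces, and the Feb. 1, 2010 reference point is taken to be midnight UTC. The names `parse_events`, `EPOCH`, and `EVENT_NAMES` are hypothetical and not part of the data set.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the reference point is midnight UTC; the description gives only the date.
EPOCH = datetime(2010, 2, 1, tzinfo=timezone.utc)

# Labels chosen for this sketch; the data set itself only uses the codes 0, 1, 2.
EVENT_NAMES = {0: "session_start", 1: "post", 2: "session_end"}


def parse_events(lines):
    """Yield (event_name, absolute_time, content) for each non-empty event line."""
    current = EPOCH
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Assumption: "EventType EventTime Content" is whitespace-separated,
        # and everything after the second field is the content.
        parts = line.split(None, 2)
        event_type = int(parts[0])
        delta_ms = int(parts[1])
        # Content is only meaningful for EventType = 1 (a posted tweet).
        content = parts[2] if event_type == 1 and len(parts) > 2 else None
        # EventTime is relative to the previous event, so accumulate it.
        current += timedelta(milliseconds=delta_ms)
        yield EVENT_NAMES[event_type], current, content


if __name__ == "__main__":
    # Made-up sample lines for illustration only, not taken from event.tar.gz.
    sample = ["0 1000 null", "1 500 hello world", "2 2000 null"]
    for name, when, content in parse_events(sample):
        print(name, when.isoformat(), content)
```

If the real files use a different delimiter or reference time zone, only the `split` call and the `EPOCH` constant would need to change.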

 

News

September 2011
Cuckoo system at Middleware 2011
Poster on OSN partitioning at Middleware 2011

June 2011
Graph sampling at SIMPLEX 2011

August 2010
Demo of Cuckoo system at SIGCOMM 2010
Poster on graph sampling at SIGCOMM 2010

May 2010
Position paper of Cuckoo system at HotPlanet 2010