One of the cornerstones of the .PW platform is the Wall: the real-time aggregate of activities being done by an entity and its network. So while I am personally rather inactive on the various social networks (way too distracting), the recent announcement by FB on opening up their feed via activity streams led to some analysis and thought. Here’s a summary.

 

The Premise

  • A user carries out several activities across multiple systems – posting items, joining forums, connecting to people, etc.
  • Each of these systems has its own way of capturing, storing and publishing this information.
  • Each system wants to put varying degrees of control on the usage of this information. Twitter makes it freely available, FB puts a walled garden around it, others fall in between.
  • The user wants to be able to control who can see what information, and exercise his copyright on that information in terms of how it is consumed, used, persisted and further shared.

 

Challenges

  1. How does one capture the semantic richness of the various types of activities in a simple, precise way?
  2. How does one publish the stream in near real time without resorting to expensive polling? Why is this important? Because on Jul 21, 2008, FriendFeed crawled Flickr ~2.7M times for a grand total of 6700 updates. HTTP was not made for push, and pull is a resource hog.
  3. How does a user continue to exercise his copyright on his content even as his feed is published in a machine-readable form?
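To put the polling numbers from challenge #2 in perspective, here is a small back-of-the-envelope calculation in Python. It uses only the two figures quoted above; the assumption that each useful poll yielded at most one update is mine.

```python
# Figures quoted in the text for FriendFeed crawling Flickr on Jul 21, 2008.
polls = 2_700_000   # approximate number of crawls ("~2.7M")
updates = 6_700     # updates actually discovered

# Assumption: each useful poll returned at most one update.
polls_per_update = polls / updates
useful_fraction = updates / polls

print(f"{polls_per_update:.0f} polls per update")
print(f"{useful_fraction:.2%} of polls found something new")
```

In other words, roughly 400 requests were spent for every update that was actually found.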

 

Solutions

  • Challenge #1 is being addressed through the emerging activity streams standard. More on this in a minute.
  • For #2, activity streams uses Atom, so this can theoretically be layered over XMPP. (Any implementations, anyone?)
  • For #3, there are no straightforward solutions. Twitter makes everything public, and as a user you can opt out. FB comes from the other side of the fence: the user opts in for sharing. Copyright in both cases is enforced contractually.

 

Activity Streams Standard

Take a look at the formats of the following public feeds:

1) http://api.flickr.com/services/feeds/photos_public.gne

<entry>
  <title>death valley 07a</title>
  <link rel="alternate" type="text/html" href="http://www.flickr.com/photos/willburn25/3488468568/"/>
  <id>tag:flickr.com,2005:/photo/3488468568</id>
  <published>2009-04-30T08:40:47Z</published>
  <updated>2009-04-30T08:40:47Z</updated>
  <dc:date.Taken>2009-04-30T02:40:47-08:00</dc:date.Taken>
  <content type="html"> &lt;p&gt;&lt;a href=&quot;http://www.flickr.com/people/willburn25/&quot;&gt;Shackleton, Jules&lt;/a&gt; posted a photo:&lt;/p&gt; &lt;p&gt;&lt;a href=&quot;http://www.flickr.com/photos/willburn25/3488468568/&quot; title=&quot;death valley 07a&quot;&gt;&lt;img src=&quot;http://farm4.static.flickr.com/3391/3488468568_b3eea508d4_m.jpg&quot; width=&quot;240&quot; height=&quot;92&quot; alt=&quot;death valley 07a&quot; /&gt;&lt;/a&gt;&lt;/p&gt; </content>
  <author>
    <name>Shackleton, Jules</name>
    <uri>http://www.flickr.com/people/willburn25/</uri>
  </author>
  <link rel="enclosure" type="image/jpeg" href="http://farm4.static.flickr.com/3391/3488468568_b3eea508d4_m.jpg" />
</entry>

 

2) http://picasaweb.google.com/data/feed/api/all 


<entry>
  <id>http://picasaweb.google.com/data/entry/api/user/isabellechedin/albumid/5284084403719516961/photoid/5330400059699013378</id>
  <published>2009-04-30T08:34:36.000Z</published>
  <updated>2009-04-30T08:34:36.000Z</updated>
  <category scheme='http://schemas.google.com/g/2005#kind' term='http://schemas.google.com/photos/2007#photo'/>
  <title type='text'>DSC03419.JPG</title>
  <summary type='text'/>
  <content type='image/jpeg' src='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/DSC03419.JPG'/>
  <link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://picasaweb.google.com/data/feed/api/user/isabellechedin/albumid/5284084403719516961/photoid/5330400059699013378'/>
  <link rel='alternate' type='text/html' href='http://picasaweb.google.com/isabellechedin/ExpatriationHongKong#5330400059699013378'/>
  <link rel='http://schemas.google.com/photos/2007#canonical' type='text/html' href='http://picasaweb.google.com/lh/photo/FV74oIogEEchKANsA9dfLg'/>
  <link rel='self' type='application/atom+xml' href='http://picasaweb.google.com/data/entry/api/user/isabellechedin/albumid/5284084403719516961/photoid/5330400059699013378'/>
  <link rel='http://schemas.google.com/photos/2007#report' type='text/html' href='http://picasaweb.google.com/lh/reportAbuse?uname=isabellechedin&amp;aid=5284084403719516961&amp;iid=5330400059699013378'/>
  <author>
    <name>isa</name>
    <uri>http://picasaweb.google.com/isabellechedin</uri>
    <email>isabellechedin</email>
    <gphoto:user>isabellechedin</gphoto:user>
    <gphoto:nickname>isa</gphoto:nickname>
    <gphoto:thumbnail>http://lh5.ggpht.com/_2pkj6pXKPwY/AAAArGA7Css/AAAAAAAAAAA/lspX8vyux5o/s48-c/isabellechedin.jpg</gphoto:thumbnail>
  </author>
  <gphoto:id>5330400059699013378</gphoto:id>
  <gphoto:albumid>5284084403719516961</gphoto:albumid>
  <gphoto:access>public</gphoto:access>
  <gphoto:width>1600</gphoto:width>
  <gphoto:height>1200</gphoto:height>
  <gphoto:timestamp>1240920712000</gphoto:timestamp>
  <exif:tags>
    <exif:fstop>3.5</exif:fstop>
    <exif:make>SONY</exif:make>
    <exif:model>DSC-T100</exif:model>
    <exif:exposure>0.04</exif:exposure>
    <exif:flash>false</exif:flash>
    <exif:focallength>5.8</exif:focallength>
    <exif:iso>400</exif:iso>
    <exif:time>1240920712000</exif:time>
  </exif:tags>
  <media:group>
    <media:content url='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/DSC03419.JPG' height='1200' width='1600' type='image/jpeg' medium='image'/>
    <media:credit>isa</media:credit>
    <media:description type='plain'/>
    <media:thumbnail url='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/s72/DSC03419.JPG' height='54' width='72'/>
    <media:thumbnail url='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/s144/DSC03419.JPG' height='108' width='144'/>
    <media:thumbnail url='http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/s288/DSC03419.JPG' height='216' width='288'/>
    <media:title type='plain'>DSC03419.JPG</media:title>
  </media:group>
  <gphoto:albumtitle>expatriation à Hong-Kong</gphoto:albumtitle>
  <gphoto:albumctitle>ExpatriationHongKong</gphoto:albumctitle>
  <gphoto:albumdesc/>
  <gphoto:location/>
  <gphoto:snippet/>
  <gphoto:snippettype>PHOTO_DESCRIPTION</gphoto:snippettype>
  <gphoto:truncated>0</gphoto:truncated>
</entry>

While both of them are about public photos, the formats are vastly different since they come from two different providers. So a system that wants to aggregate photo updates across multiple providers needs to understand each vendor’s format separately. Efforts like Yahoo’s Media RSS extensions seek to address this. (The Picasa feed above uses the Yahoo extensions but the Flickr feed does not, which is strange considering Flickr is a Yahoo property.) Another example is Apple’s iTunes RSS extensions. However, these are RSS extensions, and what we really need in this space is an Atom extension (why? see http://blog.unto.net/work/on-rss-and-atom/). An example is http://martin.atkins.me.uk/specs/atommedia.
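To make the "understand each vendor's format separately" point concrete, here is a minimal Python sketch of what an aggregator ends up writing today: one adapter per provider, each mapping to a common record. The `Photo` class and the `from_flickr` / `from_picasa` helpers are hypothetical names of mine, and the sample entries are trimmed-down versions of the two feeds above with namespace declarations added so they parse standalone.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

ATOM = "{http://www.w3.org/2005/Atom}"
MEDIA = "{http://search.yahoo.com/mrss/}"  # Yahoo Media RSS namespace


@dataclass
class Photo:
    """Hypothetical common record the aggregator works with."""
    title: str
    page_url: str
    image_url: str


def _link(entry, rel):
    """Return the href of the first atom:link with the given rel, if any."""
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None


def from_flickr(entry):
    # Flickr carries the image as an Atom enclosure link.
    return Photo(entry.findtext(ATOM + "title"),
                 _link(entry, "alternate"),
                 _link(entry, "enclosure"))


def from_picasa(entry):
    # Picasa carries the image in media:group/media:content.
    content = entry.find(f"{MEDIA}group/{MEDIA}content")
    return Photo(entry.findtext(ATOM + "title"),
                 _link(entry, "alternate"),
                 content.get("url"))


FLICKR_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>death valley 07a</title>
  <link rel="alternate" type="text/html" href="http://www.flickr.com/photos/willburn25/3488468568/"/>
  <link rel="enclosure" type="image/jpeg" href="http://farm4.static.flickr.com/3391/3488468568_b3eea508d4_m.jpg"/>
</entry>"""

PICASA_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <title type="text">DSC03419.JPG</title>
  <link rel="alternate" type="text/html" href="http://picasaweb.google.com/isabellechedin/ExpatriationHongKong#5330400059699013378"/>
  <media:group>
    <media:content url="http://lh6.ggpht.com/_2pkj6pXKPwY/SflinNUUpwI/AAAAAAAABxs/gXHsFzjrylU/DSC03419.JPG" type="image/jpeg" medium="image"/>
  </media:group>
</entry>"""

photos = [from_flickr(ET.fromstring(FLICKR_ENTRY)),
          from_picasa(ET.fromstring(PICASA_ENTRY))]
for photo in photos:
    print(photo.title, photo.image_url)
```

Every new provider means another adapter like these, which is exactly the cost a shared Atom media extension would remove.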

However, this is only about photos, which are just one kind of activity. When we start considering the possible list of activities, the problem scope becomes extremely large. This is where the Activity Streams effort comes in. Consider the following activity: “Vineet posted new photographs to Agastya on Facebook 6 hours ago.” This can be broken down as:

  • Vineet = actor
  • posted = verb
  • new photographs = item / article
  • Agastya = social object (album here)
  • Facebook = site
  • 6 hours ago = timestamp

Other possible fields here can be:

  • User Agent / tool (example: Twhirl)
  • Location
  • Mood
  • Title
  • Summary
  • Detail
  • Link
  • Verb collection
  • Actor collection
  • Object collection
  • Comments (points to another object)
  • The producer of the content may define the default viewing mechanism which can be used by a consumer
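The breakdown above maps naturally onto a small record type. The following Python sketch is only my illustration of the idea, not anything defined by the drafts: the `Activity` class, its field names and the fixed `now` timestamp are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Activity:
    """Hypothetical in-memory form of one actor-verb-object activity."""
    actor: str
    verb: str
    obj: str                              # the item / article
    target: Optional[str] = None          # the social object, e.g. an album
    site: Optional[str] = None
    published: Optional[datetime] = None
    extras: dict = field(default_factory=dict)  # tool, location, mood, ...

    def sentence(self) -> str:
        """Render the activity back into a human-readable sentence."""
        parts = [self.actor, self.verb, self.obj]
        if self.target:
            parts += ["to", self.target]
        if self.site:
            parts += ["on", self.site]
        return " ".join(parts)


# Fixed "now" so the example is reproducible; the date itself is arbitrary.
now = datetime(2009, 4, 30, 12, 0, tzinfo=timezone.utc)

activity = Activity(
    actor="Vineet",
    verb="posted",
    obj="new photographs",
    target="Agastya",
    site="Facebook",
    published=now - timedelta(hours=6),
    extras={"tool": "Twhirl"},
)
print(activity.sentence())
```

The optional fields from the second list go into `extras` here simply to keep the sketch short; a real schema would give each its own typed slot.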

To capture the above, the following draft specs are currently in place.

  1. Atom Activity Extensions defines Actor, Verbs, Objects, Title, Summary, Detail, Link, etc.
  2. Atom Activity Base Schema defines the semantics of various Verbs and Objects, Location and Mood
  3. Atom Media Extensions defines how typical media – videos, photographs, etc. should be described. So in the above example, if we wanted to enclose the actual photographs of Agastya, we would need to describe the actual link, the preview thumbnail, image type, height, width, etc.
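As a rough illustration of how the first two drafts fit together, this Python sketch builds an Atom entry for the example activity using the standard library. Treat the details as assumptions to be checked against the drafts: the activity namespace URI, the `verb` / `object` / `object-type` element names, the verb and object-type URIs, the use of `atom:author` for the actor, and the example.com link are all my own placeholders.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Assumed namespace and schema URIs for the activity extensions.
ACTIVITY_NS = "http://activitystrea.ms/spec/1.0/"
VERB_POST = "http://activitystrea.ms/schema/1.0/post"
TYPE_PHOTO = "http://activitystrea.ms/schema/1.0/photo"

ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("activity", ACTIVITY_NS)


def q(ns, tag):
    """Build a namespace-qualified tag name in ElementTree's {ns}tag form."""
    return f"{{{ns}}}{tag}"


def build_entry(actor, verb_uri, object_type_uri, object_title, object_link):
    entry = ET.Element(q(ATOM_NS, "entry"))
    ET.SubElement(entry, q(ATOM_NS, "title")).text = f"{actor} posted {object_title}"

    # Assumption: the actor is expressed as the Atom author of the entry.
    author = ET.SubElement(entry, q(ATOM_NS, "author"))
    ET.SubElement(author, q(ATOM_NS, "name")).text = actor

    # The verb and the typed object are the activity-specific additions.
    ET.SubElement(entry, q(ACTIVITY_NS, "verb")).text = verb_uri
    obj = ET.SubElement(entry, q(ACTIVITY_NS, "object"))
    ET.SubElement(obj, q(ACTIVITY_NS, "object-type")).text = object_type_uri
    ET.SubElement(obj, q(ATOM_NS, "title")).text = object_title
    ET.SubElement(obj, q(ATOM_NS, "link"), rel="alternate", href=object_link)
    return entry


entry = build_entry("Vineet", VERB_POST, TYPE_PHOTO,
                    "new photographs", "http://example.com/albums/agastya")
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

The point of the sketch is that the verb and object type are URIs rather than free text, which is what lets a consumer interpret an entry from a provider it has never seen before.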

Note that this is work in progress and if you read the drafts, the gaps are fairly obvious. The big ones (I think) are:

  • As activities get republished, a downstream consumer may end up dealing with duplicates. This can be addressed by a combination of a source URI + per-item identifier
  • Some activities may be in response to other activities (classic case being video responses on YouTube) – the relationship would need to be captured
  • Instead of a single actor-verb-object, we may have collections of actors, verbs and objects. Examples being multiple actors on a single object (wiki editing) or single actor, multiple objects (uploading multiple pictures) or combinations.
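The duplicate problem in the first bullet can be sketched in a few lines of Python. This assumes each item carries the URI of the system where it originated plus that system's own identifier for it; the dictionary keys (`source`, `id`, `via`) and the sample values are hypothetical.

```python
def dedupe(items):
    """Keep the first copy of each item, keyed on (source URI, item id)."""
    seen = set()
    unique = []
    for item in items:
        key = (item["source"], item["id"])
        if key in seen:
            continue  # same original item arriving through another route
        seen.add(key)
        unique.append(item)
    return unique


incoming = [
    # The same Flickr photo, once directly and once republished elsewhere.
    {"source": "http://www.flickr.com/", "id": "3488468568", "via": "flickr"},
    {"source": "http://www.flickr.com/", "id": "3488468568", "via": "friendfeed"},
    # A different provider that happens to reuse the same numeric id.
    {"source": "http://picasaweb.google.com/", "id": "3488468568", "via": "picasa"},
]

unique_items = dedupe(incoming)
print([item["via"] for item in unique_items])
```

Keying on the pair rather than the identifier alone matters because identifiers are only unique within the system that issued them, as the third item shows.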

 

Activity Streams Implementation

1. MySpace: http://wiki.developer.myspace.com/index.php?title=Standards_for_Activity_Streams complies with the current draft specs. Also, their documentation is simple to read, understand and use.

2. Facebook: http://wiki.developers.facebook.com/index.php/Using_Activity_Streams. For one, the stream only consists of user generated content and not app generated content. Second, they have a model where you need to prompt the user for permission to access the stream. This has two repercussions:

  • This requires the user to be logged on to Facebook
  • You can only show the user his own data. So if I import the FB feed into FriendFeed, the subscribers to my FriendFeed feed would not get to see the FB feed. This would have been possible if apps like FF could persist FB data, but the FB TOS prohibit caching data for more than 24 hours.

Note that the Activity Stream standard is just one way of accessing the FB stream. They also provide an XML/HTTP API and an FQL API. See Using the Open Stream API for details. The restrictions, however, stay in place.

So in summary, while the Activity Stream standard effort is quite exciting, the
current FB implementation is not. All one can use it for is building clients
which show the user her own data. The real value would be in letting apps
re-publish the data without violating the privacy needs of the user. As of now,
the walled garden is very much in place.

 

Further Reading

  • activity streams project
  • http://wiki.diso-project.org/activity-streams
  • http://wiki.developers.facebook.com/index.php/Using_Activity_Streams
  • http://wiki.developer.myspace.com/index.php?title=Standards_for_Activity_Streams
  • http://groups.google.com/group/activity-streams
