In my last post I talked about the aspects and the issues that arise in the preservation of Web 2.0 content. Now I just want to point out how these preservation issues show up in some components of Web 2.0:
Blogs: are one of the most common pieces associated with the new version of the web. The conversational content of many of them can sometimes be identified as non-valuable, that is, not worth the bother of archiving. For the rest, however, the short update cycle (the speed at which new content is added) and the numerous external references pose some of the difficulties I talked about. Then there is the question of what should be preserved from a blog other than the posts: the comments? The embedded resources? Someone noted that blogs tend to be individual rather than organizational, hence it’s rather difficult to archive them in a way that keeps the content easily searchable and accessible.
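To make the "what should be preserved" question a bit more concrete, here is a minimal sketch of how one might list the external references and embedded resources of a single post before deciding what to archive. It only uses Python's standard library. The names `PostInventory` and `inventory_post`, as well as the example URLs, are made up for illustration, and comments are left out on purpose because their markup differs from one blog platform to another.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PostInventory(HTMLParser):
    """Collects external links and embedded resources from one post (hypothetical helper)."""

    # Tags whose "src" attribute points at an embedded resource.
    EMBED_TAGS = {"img", "video", "audio", "source", "iframe", "embed"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.host = urlparse(base_url).netloc
        self.external_links = []
        self.embedded = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            # Resolve relative links, then keep only those leaving the blog's host.
            url = urljoin(self.base_url, attrs["href"])
            parsed = urlparse(url)
            if parsed.scheme in ("http", "https") and parsed.netloc != self.host:
                self.external_links.append(url)
        elif tag in self.EMBED_TAGS and attrs.get("src"):
            self.embedded.append(urljoin(self.base_url, attrs["src"]))


def inventory_post(html, base_url):
    """Return the external links and embedded resources found in the post HTML."""
    parser = PostInventory(base_url)
    parser.feed(html)
    parser.close()
    return {"external_links": parser.external_links, "embedded": parser.embedded}


# Assumed example post; the URLs are placeholders.
sample = """
<p>See <a href="http://other.example.org/paper">this paper</a> and
<a href="/2009/05/older-post">my older post</a>.</p>
<img src="images/chart.png">
<iframe src="http://video.example.net/embed/42"></iframe>
"""
result = inventory_post(sample, "http://blog.example.com/2009/06/post")
print(result)
```

Such a list does not solve anything by itself, but it shows how quickly a single post fans out into resources that live on other servers and may change or vanish independently of the blog.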
Wikis: most of their content is what we called the hidden web. The text and media content are stored in inaccessible databases, while the wiki experience is found as web metadata on the web server. However, the built-in history function of most of them is an acceptable compromise for now.
Media sharing: the content is again hidden web, and most of the streaming technologies used for live media are either proprietary or use Digital Rights Management to hinder any attempt at downloading the content.
Data mash-ups: assemble live content from various web sites that publish their APIs. Most of the content is therefore also hidden web and not readily accessible. Since the look & feel is part of the experience, overall preservation is even more difficult to accomplish.
Social networks: contain their users’ personal spaces. Though some may involve look & feel elements that are part of the experience, this is not always the case. However, most contain private information about their users, so the major obstacle lies in privacy and intellectual property.