Thursday, April 24, 2008

A Quick Look at the Innovative Ideas Forum

Innovative Ideas Forum, National Library, 10 April, 2008.

Last week I attended the 2008 Innovative Ideas Forum, held at the NLA. The theme this year was web archiving, and there were a number of interesting and sometimes fun presentations. I thought I would give a very brief overview of the day and provide a few URLs for people to follow up if they wish.

The first talk was by Professor Gerard Goggin, from the University of NSW, on writing the history of the internet and the mobile phone. It is amazing to think that people are already starting to write histories of these technologies. This presentation was highly academic, but it was one of the few that addressed mobile phones.

Kris Carpenter Negulescu discussed the Internet Archive in the US, covering what it does and its holdings, which currently run to 4 petabytes of information (a petabyte is 10^15 bytes, or 2^50 bytes in binary terms), with 6 million downloads per day. It holds 110 billion URLs, including 380 thousand books, images, moving images, open audio, and NASA images, and 1 million YouTube items. The Archive is identifying existing and emerging trends on the visible web, not including the hidden web and cyber scholarship. The hidden web refers to university libraries, university websites, government organisation websites, non-government organisation websites, and academic and scholarly websites, which are not readily found through search engines such as Google. The URL for the Internet Archive is below.

Richard Wallis gave an entertaining and professional presentation entitled Beyond Web 2.0, discussing the global semantic web platform. He covered the period 2000-2010, in which the web has developed a set of features such as wikis, RSS, blogs, social networking, and tagging. Who knows what new developments there will be in the next few years? (The Semantic Web is an evolving extension of the World Wide Web in which the semantics of information and services on the web is defined, making it possible for the web to understand and satisfy the requests of people and machines to use the web content. From Wikipedia.)
He predicts there will be more interactive sites, more participative sites, and more mashups, with machines talking to machines. He noted that Google’s success over other search engines lay in its ability to use the network effect: site 1 points to site 2 and so on, producing a hierarchy in which sites that register more hits count as more important.
According to Wallis, the way to break down silos on the web is to use the semantic web, which allows you to query across separate silos and use networks. This will enable massive social and economic shifts.
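
For readers curious how a "site 1 points to site 2" network effect can turn into a ranking, here is a minimal Python sketch in the spirit of the published PageRank idea. It is not Google's actual system and was not shown in the talk; the function name link_rank, the damping value, and the four example sites are all assumptions made up for illustration.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by who links to them (simplified PageRank-style sketch).

    `links` maps each page to the list of pages it points to.
    All names and numbers here are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its importance on, split across its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: three sites point to site3, nobody points to site4.
links = {
    "site1": ["site2", "site3"],
    "site2": ["site3"],
    "site3": ["site1"],
    "site4": ["site3"],
}
ranks = link_rank(links)
best = max(ranks, key=ranks.get)
print(best)
```

In this toy example the site that most others point to comes out on top, and the site nobody points to comes out last, which is the hierarchy of importance described above.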

Former National Archives of Australia staff member David Pearson and his colleague Douglas Elford, from the NLA, gave a presentation on the Mediapedia, which has been designed to identify various types of media and evaluate them to decide whether or not to keep them. Its use is internal at this stage, for the NLA digital preservation and collection areas. Classification systems used in its development have included Dublin Core and AACR2. They use the following classes: genre, process, carrier, and name. For more information, contact digitalpres@nal.gov.au

Stewart Wallace gave a brief overview of a project currently underway in Sydney, called The Dictionary of Sydney. This project is building a digital repository of text and multimedia related to Sydney’s history. The repository is designed to support delivery in a variety of ways, including web, mobile, and RSS. In seeking the best method of connecting these resources to Sydney’s urban history, the dictionary project team is developing an accompanying semantic model of terms to create an extensible web of digital connections.
See http://www.dictionaryofsydney.org for more project details.

The next speaker, Julien Masanes, is the director of the European Archive Foundation. He spoke on next generation web archiving methods. The library holds his book, Web archiving, listed below.
He discussed the Living Web Archives project, which has been funded by the European Union. This project will carry web archiving into the next generation of the web, and will develop a range of services and technology for cultural institutions. It provides weekly snapshots of websites, simple client view archiving, abstract storage and identification, browsing and basic searching, and institution-centric access.
The archive looks at a site from the client’s view using a spider (crawler), capturing the site page by page. It stores items in containers, which have proved to be efficient and compressible. Searching is simple, by URL. The problem they have identified is that this paradigm doesn’t make a memory of the web: it is a frozen snapshot, and it doesn’t yet capture the interactive nature of the web. See http://www.liwa-project.eu/
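
To make the snapshot approach described above a little more concrete, here is a rough Python sketch of a crawler that captures a site page by page, stores each page compressed, and finds it again by URL. It is only an illustration under assumptions: it does not use the project's own software or container format, and the names crawl_snapshot, lookup, fake_fetch, and the example.org pages are invented. A pretend site held in memory stands in for the real web so the example runs without a network connection.

```python
import gzip
from collections import deque


def crawl_snapshot(fetch, start_url, max_pages=100):
    """Capture a site page by page, starting from start_url.

    `fetch(url)` must return (html_text, list_of_linked_urls).
    Returns a dict mapping each URL to its gzip-compressed page.
    """
    archive = {}
    queue = deque([start_url])
    while queue and len(archive) < max_pages:
        url = queue.popleft()
        if url in archive:
            continue  # already captured in this snapshot
        html, out_links = fetch(url)
        # Store the page compressed, keyed by its URL.
        archive[url] = gzip.compress(html.encode("utf-8"))
        for link in out_links:
            if link not in archive:
                queue.append(link)
    return archive


def lookup(archive, url):
    """Simple search by URL: return the archived page text."""
    return gzip.decompress(archive[url]).decode("utf-8")


# Hypothetical three-page site standing in for the live web.
FAKE_SITE = {
    "http://example.org/": ("<h1>Home</h1>", ["http://example.org/a", "http://example.org/b"]),
    "http://example.org/a": ("<h1>Page A</h1>", ["http://example.org/"]),
    "http://example.org/b": ("<h1>Page B</h1>", ["http://example.org/a"]),
}


def fake_fetch(url):
    return FAKE_SITE[url]


snapshot = crawl_snapshot(fake_fetch, "http://example.org/")
page_a = lookup(snapshot, "http://example.org/a")
```

The sketch also shows the limitation raised in the talk: the archive holds whatever each page looked like at the moment of capture, and nothing about how a visitor interacted with it.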

This brings us to the final presentation, which was a lot of fun and quite exciting.
Gordon Mohr from the Internet Archive addressed the challenges of future archiving of the web, looking at spam, malware, desktop Web 2.0, social networks, and virtual worlds. For a detailed analysis of the effects of malware, see the paper by Peter Gutmann, from the University of Auckland, who gloomily comments that as malware develops all you can do is “kiss your *** goodbye”. Google, however, does provide safe browsing lists, which are reliable.

Social networks are a challenge to archive because activities that were previously open are moving into private areas and friend networks. To gain access, Mohr suggests we could use an android harvester or android assistant to go in and ask to be a friend, and then archive the interaction with permission.

Archiving virtual worlds is a challenge, particularly since these activities are very popular and (in his opinion) are replacing other communications, including popular and children’s literature. You could use the trusty android to go “in-world” into Second Life.
Another quandary flagged was the access conditions that apply if you want to archive personal correspondence on interactive sites, eg Facebook or other social networking sites.
See the Internet Archive at: http://www.archive.org/index.php
To find out what some of this technobabble means, see the following books:

Masanes, Julien, Web archiving, 2006, 005.7 WEB
Jones, Dennis, How to do everything with the internet, 2001, 004.678 JON
Henninger, Maureen, The hidden web, 2003, 004.678 HEN

By Beth Rogers, NAA
