Advancing P2P technology - P2P Foundation<br />
Comment by Sepp Hasslberger (http://p2pfoundation.ning.com/profile/Sepp), 2010-06-03, at http://p2pfoundation.ning.com/forum/topics/advancing-p2p-technology?commentId=2003008%3AComment%3A17563<br />
<br />
In an email, Adam said:<br />
<br />
<i>How do we archive these things for ourselves and for future historians? We can design this into distributed systems to provide information search and retrieval for recent data, with distributed computation compressing that into medium-term memory (large compressed files available for segmented download), then those automatically get sent to archivists, potentially university history departments around the world.</i><br />
<br />
and<br />
<br />
<i>Specific to archiving of and by distributed systems, my initial thoughts on the implementation-related details are: multicore compression, distributed compression, segmented downloading of large archive files, search and retrieval of them and potentially of items in those files, and routing these over time to repositories such as those at history departments.</i><br />
<br />
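To make the "segmented download" idea a little more concrete, here is a minimal Python sketch of one way it could work. It is not Adam's design, only an illustration under stated assumptions: the archive is split into fixed-size segments, each segment is compressed independently (so the work can be spread over several cores or machines), and a manifest of SHA-256 hashes lets a downloader verify each segment on its own. The names build_archive, restore_archive and SEGMENT_SIZE are hypothetical.<br />
<br />
```python
import hashlib
import zlib
from concurrent.futures import ThreadPoolExecutor

SEGMENT_SIZE = 64 * 1024  # hypothetical segment size in bytes


def _compress_segment(chunk: bytes) -> bytes:
    # Each segment is compressed on its own, so it can be fetched and
    # decompressed without needing any other segment.
    return zlib.compress(chunk, 9)


def build_archive(data: bytes, segment_size: int = SEGMENT_SIZE):
    """Split data into segments, compress them in parallel, build a manifest."""
    chunks = [data[i:i + segment_size] for i in range(0, len(data), segment_size)]
    # In CPython, zlib releases the GIL while compressing, so a thread pool
    # can keep several cores busy here.
    with ThreadPoolExecutor() as pool:
        segments = list(pool.map(_compress_segment, chunks))
    manifest = [
        {"index": i, "size": len(seg), "sha256": hashlib.sha256(seg).hexdigest()}
        for i, seg in enumerate(segments)
    ]
    return segments, manifest


def restore_archive(segments, manifest) -> bytes:
    """Verify every segment against the manifest, then decompress and join."""
    parts = []
    for entry in manifest:
        seg = segments[entry["index"]]
        if hashlib.sha256(seg).hexdigest() != entry["sha256"]:
            raise ValueError(f"segment {entry['index']} failed verification")
        parts.append(zlib.decompress(seg))
    return b"".join(parts)
```
<br />
In a real p2p setting the segments would presumably come from different peers, and the manifest would be the small piece of data that an archive or history department needs to keep in order to check what it receives.<br />
<br />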
My view is that archiving a conversation that normally travels over short-distance links, within p2p-type software, will not be as straightforward as archiving web pages in the Internet Archive's Wayback Machine.<br />
<br />
Here is what I said:<br />
<br />
I believe we should distinguish between the general archiving of data that have been made available on the net and the archiving that will eventually have to be thought out for all the conversations generated on a future p2p-type parallel net or subnet.<br />
<br />
For the www itself, yes, it would be great to have, for instance, all newspaper and magazine morgue files available, not just to historians but as a generally searchable database that provides a record of all print publications as far back as we can possibly manage. In a wide sense, this would include book initiatives such as Google's scanning of orphan and out-of-print works and making those available. This alone would be a huge and very worthwhile project.<br />
<br />
The Internet Archive seems to be doing a good job of archiving the conversations that happen on the net today: articles on websites and discussions. Social networking sites have a lot of valuable conversation going on, and I am not so sure this is being included in what is currently archived. After all, here we have a huge number of communications, many of them frivolous but also many serious ones, the totality of which is not visible anywhere except on Facebook's own servers. Each user only gets what pertains to their friends, and only a selection of what is considered more interesting, and that user-specific rendering then vanishes into a long tail of conversations that are no longer easily accessible, much less searchable.<br />
<br />
On the p2p side, a similar problem will present itself, with the added difficulty that we won't even have a central server to refer back to, or to spider for the record of what is happening. As you say, distributed spidering will probably be needed, but what about the location of storage? History departments will be interested in having access to that data, but they will not necessarily be the best custodians for the raw material.<br />
<br />
Tentatively, I see a way of handling this by leveraging the computing and storage power our own computers have. Take email conversations, for instance. People have a lot of data in those conversations, including attachments, which are being kept by email clients on some of our computers. Then each one of us has some sort of storage system for interesting files on our own computers. With a p2p direct net developing, there is going to be more and more of that type of data sitting on our computers, partially replicated on other computers, but nowhere collected into a coherent story or centrally archived for later retrieval and historical study.<br />
<br />
I think people should be made aware of this and a discussion started, with the aim of preserving the data on people's hard disks for historical evaluation. I would never have said such a thing a decade ago, but ... our notion of privacy is changing and, after some time has passed, all that data does seem much less personal. For instance, we are happy to have the correspondence archives of the scientists of the past, which at the time they might not have opened for public access. Only with their passing have the archives become accessible to researchers. Something very similar should probably be envisioned for people's hard drives. Perhaps a "donate your old drive" or "donate your backup to historical archives" type of campaign could help in this, or it could be done by spidering and archiving.<br />
<br />
My point: We are moving away from a server-based model (where archiving was fairly straightforward) towards a more user-based model, where everyone keeps their own data of interactions (conversations) with others, along with the associated media files. We will therefore need to rethink the way we archive those conversations for future generations. Technological advances will help in unexpected ways, but perhaps the basic model should be in place before p2p really starts taking off.