Why archiving the internet is vital and how a decentralized web can help | Ep. 6

Why archiving the internet is vital for the future, how a decentralized web can help, and why looking at history is a good way to predict the future

Powered by the Filecoin Foundation, The Future Rules is hosted by Forkast.News Editor-in-Chief Angie Lau, alongside top legal mind in blockchain and Filecoin Foundation Board Chair, Marta Belcher. Together with some of the most renowned names in the industry as their special guests, they dive into the future and the ethical issues that technology will raise, and how to address them today before they determine our tomorrow. From NFTs to CBDCs and beyond, the team explores issues of civil liberties, law, compliance, human rights, and regulation that will shape the world to come.

Find more episodes in the podcast series: The Future Rules

In this episode, Brewster Kahle, the Founder and Digital Librarian of the Internet Archive, takes a look at how we can preserve the internet for future users by making use of the decentralized web. He takes a look at how past experiences can help guide a better future, goes on to explain how he envisions the world being changed by not only Web 3.0, but Web 4.0, 5.0 and 6.0 to follow, and talks about the work of the Internet Archive.

Highlights

The vision behind the Internet Archive: “The Internet Archive started by trying to archive the internet, but really the goal was to be an archive on the internet and build towards being the library of everything. How can we go and build the digital library of Alexandria? Can we make it so that there's universal access to all knowledge? That's the vision of the internet that I signed on to decades ago, to try to help build that. And with the technologies we've got now, we can actually make that come true.” (Brewster Kahle)

The most desirable properties of the decentralized web: “Well, we want a web that's private. We want a web that's robust. We want a web where you can actually make some money by publishing on the net, not just by having advertising systems or going to some third party. And we want it to still be fun and frolicking and build something new and different. So we put out a call in 2016 to build a decentralized web.” (Brewster Kahle)

The Internet Archive and Filecoin are preserving humanity’s most important information: “The thing that is so cool about the decentralized web is the ability to preserve humanity's most important information. I think how you preserve humanity's most important information is a really interesting question, and the Internet Archive has been doing such amazing work on preserving the internet and thinking about not just the ways that you can preserve the internet today, but also the ways that you can preserve humanity's most important information in the future. And for me and for Filecoin, what that means is creating a decentralized storage network where you don't have to rely on just one or two big players to store effectively all of the web. And instead, creating the ability to have a system where many or some nodes can fail and the whole system will still be robust against failure and where you can take information and spread it out among a lot of people's computers, instead of just computers and hardware owned by a few companies.” (Marta Belcher)

The issue with social media feeds: “But now we have the age of feeds, I love that term, it's just so gross - It's like eat your feed. All right, you're sucking on your feed. It's this sort of continuous dribble that sort of spews at us, which is interactive. It's exciting, it's mesmerizing, but it's very difficult to refer back to things. Pages you thought of as something that you could refer to and was going to last a long time. So you invested in them. Feeds, you are just babbling. And so we've had this major shift.” (Brewster Kahle)

The most pressing considerations for building Web 3.0: “Watch out for centralized points of control - monopoly - but monopoly in lots of different mechanisms and ways. Those are basically individual organizations or protocols or approaches that seem to make it so that other things aren't allowed. So how do we go and keep a system that is flexible enough to bend around these sorts of systems? And I wonder if these large-scale corporations that we've built are really going to serve us all that well. They're now starting to be larger than governments.

So the new big entities are, I think, corporations. Maybe it was the church 500 years ago, then it was governments. And are we really kind of going into an era when corporations and corporate-think are going to dominate? That's something that I'd like to bring to mind as we're building these systems, to try to make it so that there's room for new players to join in without having to get permission from the old guys. We know how that ends up, it's not good.” (Brewster Kahle)

Transcript:

Angie: Welcome to the Future Rules. I'm Angie Lau, Editor-in-Chief and Founder of Forkast.News.

Marta: And I'm Marta Belcher, Chair of the Filecoin Foundation. In this podcast, we dive into the future of technology and the legal and ethical issues that it raises. So today we're going to start talking about archiving the internet.

Angie: That's a lot to archive. But why is it so vital to do that and to make access free for all? Particularly in a world where it's all too easy to find misinformation. And can a decentralized web actually help with such projects? There's no one better to ask than Brewster Kahle.

Marta: That's right. Brewster is the Founder and Digital Librarian of the Internet Archive. I would love it, Brewster, if you could tell us about the Internet Archive and your incredible mission and the amazing work that you do.

Brewster: Marta, Angie, thank you very much for this. The Internet Archive started by trying to archive the internet, but really the goal was to be an archive on the internet, and build towards being the library of everything. How can we go and build the digital library of Alexandria? Can we make it so that there's universal access to all knowledge? That's the vision of the internet that I signed on to decades ago, to try to help build that. And with the technologies we've got now, we can actually make that come true.

Angie: And that technology is so critical, you know, the Filecoin Foundation, I got to give you guys a shout out, incredible donation to the Internet Archive as part of what you started so many years ago. When you talk about technology, Brewster, what do you envision as the future, as technology continues to evolve? 

Brewster: I remember walking into a library when I was young, and it just seemed infinite, it just goes on and on and on. But it actually isn't infinite. It only has a small fraction of even just the books, much less the periodicals and all the television and all the other things that you could want to have access to. So the idea of the internet, for me, was to make it so that anybody could be a publisher, that everybody's voice could be out there, and that there would be libraries that would go and play a role towards recording these, organizing them, making them enduringly valuable.

So the web was a good step in this direction, but it actually has a bunch of faults to it. For instance, things are really only available from the web server that they're coming from, and you might just have a URL. Even if that movie is available from a bunch of different places, it's only got one URL, so when that one stops being accessible, then it's gone. It's a broken link. It's a footnote that doesn't work. And that's not a way to run a culture.

You basically need to have many copies, and so some of the decentralized web ideas were to try to get it so that we could have multiple copies again, like you have multiple copies in multiple libraries of published materials. We now know we actually need to depend on these materials, so let's make a more robust infrastructure than what we've got with the current web.

Marta: You've always been so far ahead of this - I think only recently has there been more and more attention on the link rot problem, and the fact that if you're going and looking at a Supreme Court opinion, for example, and there's a URL there, more likely than not, that URL actually no longer points to anything. And there are these really important pieces of the Web that are just not there anymore. And it's so funny, because this is a problem that's getting attention now. And I would love to hear your perspective on how you started working on these issues of decentralizing the Web and where you see it going.

Brewster: Well, we've been trying to fix a kludge, which is the current web structure, by having the Wayback Machine. We collect almost a billion pages a day, which is completely a bogus way of doing it. I mean, what you really want is much more like a GitHub, so that you can actually roll time back on all the different websites and be able to go and roll them forward and have multiple copies and all these wonderful things. The wonderful thing about the web is it works and it's everywhere.

The bad thing is, it's really simple. So where the Wayback Machine and the Internet Archive’s collecting the web is kind of a patch, we thought, why don't we try to do something actually better? What can we do, now that we have technologies that Tim Berners-Lee didn't have back then? We've got encryption that's legal. Remember, it wasn't actually back then to go and distribute, it was thought of as kind of a national security weapon or whatever. So we have that. We have JavaScript, which allows you to go and distribute running code, or WebAssembly. We've got hash codes that work. We saw that all go with BitTorrent and the like.

So, what do we want to have be better? Well, we want a web that's private. We want a web that's robust. We want a web where you can actually make some money by publishing on the net, not just by having advertising systems or going to some third party. And we want it to still be fun and frolicking and build something new and different. So we put out a call in 2016 to build a decentralized web. Tim Berners-Lee and Vint Cerf, but also Juan Benet and others came together and said, yeah, this is towards the direction that a bunch of these decentralized technologies were going. And we said, how can we help? And now that we've got some of these pieces going, right, we've got some of the currencies going, but now we've got smart contracts.

That's cool. But can we get storage? Can we get kind of BitTorrent, but writ large, where you have hash codes to be able to go and address things as opposed to a location? These sorts of technologies are now coming about, and we'd like to see these guided into good and beneficial approaches. And that's where the Internet Archive tries to help give some sort of a North Star towards why are we trying to build these things.
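
The idea of addressing things by hash code rather than by location can be illustrated with a small sketch. This is only a toy model under stated assumptions, not how the Internet Archive, BitTorrent, or Filecoin actually store data: `ToyStore`, `address_of`, and `fetch` are hypothetical names, and a plain SHA-256 hex digest stands in for whatever identifier a real network would use.

```python
import hashlib


def address_of(data: bytes) -> str:
    """Derive the address from the content itself (SHA-256, as an assumption)."""
    return hashlib.sha256(data).hexdigest()


class ToyStore:
    """A hypothetical in-memory store standing in for one library or node."""

    def __init__(self, name: str):
        self.name = name
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = address_of(data)
        self._blobs[address] = data
        return address

    def get(self, address: str):
        # Returns None when this store has no copy.
        return self._blobs.get(address)


def fetch(address: str, stores) -> bytes:
    """Ask each store in turn; accept the first copy whose hash matches."""
    for store in stores:
        data = store.get(address)
        if data is not None and address_of(data) == address:
            return data
    raise LookupError(f"no reachable copy of {address}")


library_a = ToyStore("library-a")
library_b = ToyStore("library-b")

page = b"A page worth keeping."
address = library_a.put(page)
library_b.put(page)  # same bytes, therefore the same address

library_a._blobs.clear()  # simulate one location going away
recovered = fetch(address, [library_a, library_b])
```

Because the address depends only on the bytes, any store holding a copy can answer the request, and the reader can check the copy against the address instead of trusting the location it came from.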

Marta: The decentralized web gatherings that you've hosted have been, I think, some of the places where the most important thinking about the decentralized web has happened. And it's truly extraordinary how central a role you've played in building the next generation of the internet. I would love to hear about your early experiences and how that informs your future thinking about the internet and being such a pioneer in the space, both in archiving the internet and also in building the next generation of the web.

Brewster: So starting in 1980, we knew that a new generation of publishing was coming, it was going to be digital. It's already been talked about by Ted Nelson, by Vannevar Bush, in “As We May Think”, which is a great paper. It's still informing what's going on. So there was this vision out there that there is going to be a new publishing system. So why don't we go and build one well? And a bunch of us worked on this, but we wanted it to be open, it was the only way to have this - it's got to be open protocols, and it's got to be a permissionless system that anybody can join in on.

My contribution in that was to try to build WAIS, which was an open-protocol-based system that had distributed servers and distributed clients. And it really started in 1989, but it went public on the internet in 1992. And that predated the web, but it was around the time of Gopher. But it was a system that people could go and take their expertise and make it available to people that were asking natural language questions. In some sense, it's a little bit more like Siri, it actually was used to help blow some of the patents on Siri, where you basically asked the same question of a bunch of different servers and they come back with their answers and your client goes and pulls things together. And it had a simple URL system, but it also had a payment system built into it. All of that didn't make it, basically.

When the web came along, it was so simple and so easy, and people loved the web that the search systems of WAIS got incorporated, so we're probably now known best as being the first search engine on the internet, but it was a whole publishing infrastructure. The idea is to have games with many winners. How can we make it so that there's lots of people that can participate and be a winner, that you don't have to be on somebody else's platform, that you don't have to go and play by somebody else's rules. Yes, there can be laws, but they’re not contracts that go and regulate your activities. How can we make games like that? And the web was a big early step. But I think what we're seeing with some of the Web 3.0 are new approaches towards this. Let's keep the goal in mind to not have single platforms.

They are the wrong way to go. We end up with people abusing them and whether it’s the misinformation, disinformation, the algorithm crap that's going on in Facebook, but you can't go and fix the algorithms so that you're running something with better algorithms, it all is controlled by somebody else. And then when there's a big player like that, then the powers that be want them to control what information is going to be on it. And that's enormously dangerous. We do have to go and have, in Web 3.0, mechanisms of going and having filter lists that you might want to subscribe to and be able to guide yourself around, so you don't just get full of spam all the time.

But these shouldn't be centralized and controlled because it doesn't work very well. So how do we go and build a system that has lots of participants, lots of winners and lots of opportunity to build next generation things on top of it? Because no matter what we do, even this time, it's not going to be right. There's going to be a 4.0, 5.0 and 6.0.

Marta: That's a really interesting thought because thinking about Web 3.0 all day, every day, that's as far forward as I'm thinking. What do Web 4.0, 5.0 and 6.0 look like?

Brewster: I find looking at history is a good way to predict the future. I'm all about history. I like the time axis, like what went right and wrong during some of the print eras. Well, if you take the very, very early print stuff during the 1400s and 1500s, it really took until the early 1600s to get a royalty based system, so that you can get paid to go and write books, and have a very simple transaction system where you buy a book from somebody and most of the money stays with the bookseller, and then some of it goes back to the printer and some of it goes all the way back to the author. It took us a long time to evolve that.

So right now, I think we're in that early stage on the internet, where you're kind of owned by somebody else that owns your ad network. And so it's mostly work for hire. We don't have a good royalty based system. So how do we go and have that happen? Can we build that into Web 3.0? Let's do that. If not, then let's make sure we're going to get to 4.0, 5.0 and 6.0. How do you go and make it so that there are pathways that don't have centralized points of control? So Web 3.0 may sound great, but we're going to lose out on some of the things. I love the dreamers - by going and archiving the web from the early days - you get these people that are like, wow, this is going to help democratize everything, or this is going to make it so that my little business works, and almost always you disappoint.

You're going to end up disappointing almost everybody with your new technology, whether it was television or fax machines or the web. But the dreamers were right - they were the ones that had the good ideas. And then we just didn't have a technology to be able to help them realize their dreams. So 3.0 is going to disappoint us. Let's leave open the doors to the next generation. But, let's bring forward the good works, the good movies, videos, interactive games, whatever it is that was invented in the last generations, let's emulate those or bring those forward into the new generation.

Angie: And therein lies this premise of the Future Rules podcast. We're all participating in what the Future Rules actually are going to be. And so what do you think in terms of the 4.0, 5.0, 6.0, as Marta was saying? And Marta, I got to ask you both - do you think that Web 3.0 can fulfill its promise in democratizing online information? And then what is the philosophy that you think the next generation needs to really share and grow into 4.0, 5.0 and 6.0, as you said? 

Marta: I think for me the thing that is so cool about the decentralized web is the ability to preserve humanity's most important information. I think how you preserve humanity's most important information is a really interesting question, and the Internet Archive has been doing such amazing work on preserving the internet and thinking about not just the ways that you can preserve the internet today, but also the ways that you can preserve humanity's most important information in the future.

And for me, and for Filecoin, what that means is creating a decentralized storage network where you don't have to rely on just one or two big players to store effectively all of the web. And instead creating the ability to have a system where many, or some, nodes can fail and the whole system will still be robust against failure and where you can take information and spread it out among a lot of people's computers, instead of just computers and hardware owned by a few companies. 

Brewster: Decentralized storage is absolutely critical. And having no points of absolute control is the only way to make things last a long time. The web had a page metaphor. You know, it was web pages and OK, they're just screens - well, why did he use the word page? Sort of referring back to the old Gutenberg thing as if pages last a long time. Well, web pages don't, they only last 100 days before they're changed or deleted. And so we tried to build the Wayback Machine to go and give some longevity to it. But now we have the age of feeds, I love that term, it's just so gross - it's like eat your feed. 

All right, you're sucking on your feed. It's this sort of continuous dribble that sort of spews at us, which is interactive, it's exciting, it's mesmerizing, but it's very difficult to refer back to things. Pages you thought of as something that you could refer to, and were going to last a long time, so you invested in them. Feeds, you are just babbling. And so we've had this major shift. We also have another major shift, which is online games. So we took the old offline games that we all grew up with and we started to make them interactive. So now we have feeds, we have pages, and we have got game environments. How are these going to mesh and make something great? 

How are we going to go and build something great that comes from pages, from the interactivity of feeds, from the immersive environments of games, and end up with a world that is building something that you'd point to, whether it's the equivalent of the Taj Mahal, or the pyramids, or building genomic databases? How do we go and build technologies that encourage and incentivize our structures to basically build something great? Let's go and make things that end up with books or science or architecture or cities, that may be Web 3.0, but it might not be. So how do we go and leave the door open and keep the light on? For building something that is worthy of having millions of people spend their lives building.

Angie: How do you evolve that kind of thinking into the next system? It's just how do you curate? Like who gets to decide that this information is to be saved for posterity? And what is noise? Can't save everything.

Brewster: You can actually save everything written. I mean, people can only type 60 words a minute, 24 hours a day, and there are only, what, eight billion of us? And that actually turns out to be something you can save, whether you can make use of it later or not, or whether you should, because there's a lot of privacy issues on that, but it's when you start to get to video and the like, that it starts to get to be outside of the realm of current computers or at least current computers that are not at the AWS or maybe even Filecoin sort of sizes. So I think actually it's more a navigation system.

It's how do you go and save materials that are worth saving and bring it to light to the right people at the right time. Google - my hat's off to them, they are astonishing at being able to take just a few words and get me a good result - it’s pretty astonishing. How do we go and continue to have that in a world where we'll be more decentralized? One problem that comes up is when it's so incentivized to make bad information. And now we have state actors paid to go and poison the commons. This is really horrible. The web doesn't have many defensive structures.

I would say the answer to bad information or bad speech is more context. So in other words, don't try to go and shut it down, but go and make it really easy to annotate and to go and say, you know, that's just not right, or this is coming from an angle or an organization that is known to be completely biased. So I'm hoping that we get a much more woven together web next generation where there's a way for you to not only see things coming from someplace, but to see a lot of information about it.

So context, metadata, that will not only help us know what should be preserved, but what's worthy of being paid attention to in the first place. So a lot of this stuff is just crap. We've poisoned email. Email was a nice open system, but it's just become so poisoned, that people are going off into these other systems that are closed gardens, run by other organizations, because we weren't able to keep our garden clean. 
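
Brewster's estimate that everything typed could be saved can be checked with a rough upper bound. The sketch below uses his figures (60 words a minute, 24 hours a day, eight billion people); the six bytes per word and the uncompressed plain-text encoding are assumptions added here for illustration, and `typed_bytes_per_day` is a hypothetical helper name.

```python
WORDS_PER_MINUTE = 60          # figure from the transcript
MINUTES_PER_DAY = 60 * 24      # typing around the clock, as in the transcript
PEOPLE = 8_000_000_000         # figure from the transcript
BYTES_PER_WORD = 6             # assumption: about five letters plus a space


def typed_bytes_per_day(
    words_per_minute: int = WORDS_PER_MINUTE,
    people: int = PEOPLE,
    bytes_per_word: int = BYTES_PER_WORD,
) -> int:
    """Upper bound on uncompressed text produced if everyone typed nonstop."""
    return words_per_minute * MINUTES_PER_DAY * people * bytes_per_word


per_day = typed_bytes_per_day()
per_year = per_day * 365

print(f"{per_day / 1e15:.1f} PB per day")
print(f"{per_year / 1e18:.1f} EB per year")
```

Under these assumptions the ceiling comes to roughly 4.1 petabytes a day, or about 1.5 exabytes a year. Nobody types around the clock, so the real volume of written text would be far smaller, which is consistent with the point that text is manageable while video is not.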

Marta: That's really interesting, so, I mean, you're talking a bunch about things that are really important for the next generations of the internet. And just to boil it down a little more, what would be your advice to those of us who are building Web 3.0 technologies - what are the things we should be thinking about and paying attention to and trying to optimize for?

Brewster: Try not to just be ideological about almost anything. And I think we need to come up with balances and ways to having discussions around these types of materials. Let's take the ethical issues around just posting anything you could possibly want to out there. Is that the right thing? Isn't that kind of related to ransomware? And so there's things that we're going to need to go and trade off as we move into this. The web had one characteristic that isn't with Web 3.0, which is it had a location, it had somebody that was kind of responsible for going and making it available. 

That came apart when we had the platforms where you basically use these platforms as your publication platform, as opposed to having your own web server running on your own Macintosh or whatever, but there was somebody to point to. In Web 3.0 we're not going to have that. So what's the equivalent? How do we go and have these conversations about content moderation in very different cultural contexts? I don't know the answer to that, but those are going to be some of the interesting things to unfold. 

I'm not afraid of them, but it was the question that came up when we were doing the WAIS and Gopher and the Web and FTP and e-mail and all those old systems. It'll come up again and again. So anyway, I think we should just not be too sure of ourselves. And whenever we sort of stand around and go and shout and get angry at something, it's like maybe we should be a little bit more Canadian and look and say, I think I'm going to listen to the other side this time. I think that will make things work a whole lot better.

Angie: I love it. I mean, universal access to all knowledge, that is an ideology I think we can all stand behind. What do you think are the things that we aren't aware about that you think should be top of mind as part of our future conversations. What would those things be for us?

Brewster: Watch out for centralized points of control - monopoly - but monopoly in lots of different mechanisms and ways. Those are basically individual organizations, or protocols, or approaches that seem to make it so that other things aren't allowed. So how do we go and keep a system that is flexible enough to bend around these sorts of systems? And I wonder if these large scale corporations that we've built are really going to serve us all that well. They're now starting to be larger than governments. 

So the new big entities are, I think, corporations. Maybe it was the church 500 years ago, then it was governments. And are we really kind of going into an era when corporations and corporate-think are going to dominate? That's something that I'd like to bring to mind as we're building these systems, to try to make it so that there's room for new players to join in without having to get permission from the old guys. 

We know how that ends up, it's not good. So I guess monopoly and large corporations are some of the new things, and people talk about big tech, but I don't think we talk about big content as much, where you basically have whole media types dominated by just a few companies. And that's not going to work well for fostering new and different and creative people. It'll make for a people that will feel jailed.

Marta: I would just love to hear about the future of the Internet Archive. We've only touched on it and would love to give you an opportunity to talk a little bit about all the amazing things you're doing beyond the Wayback Machine.

Brewster: The Internet Archive is turning 25 years this year, which is kind of awesome. The whole idea was to build a digital library - OK, let's say we've now got it, it’s not complete, but we're in pretty good shape, so what do we want to do with this digital library? If I had a couple goals for the next 25 years, it would be let's make it so that misinformation can be found out and sidetracked and put in context of and put into the misinformation bin. 

But then on a more positive side, can we build that, I don't know, global wisdom, global brain, global intelligence - can we make ourselves smarter as a people because we've built this technology? Can we go and refer back to the great works of somebody else and build it into our new world? Is there some way that new ways of interacting with what has come before can make us a smarter, more productive, more interesting, more fun loving, more free world? That would be my goal out of the Internet Archive - is can we help use these collective resources to make ourselves smarter? It will take everybody participating together, to fight misinformation and build a global brain, if you will.

Angie: Brewster, it has been such a pleasure to talk with you today. Founder, Digital Librarian of the Internet Archive, Advisor to the Filecoin Foundation. It was incredible having you on the show today.

Brewster: Thank you very much. 

Marta: Thank you so much, Brewster. And thanks again, Angie. 

Angie: Absolutely. You can listen and subscribe to The Future Rules anywhere you get your podcasts, and find the full series on the Forkast website. So we hope to meet you all here again in the future. Thanks everyone.