It can be hard to find sites that have disappeared from the Internet. But the Internet Archive’s Wayback Machine is on the verge of rolling out a feature that will make tracking down dead websites much easier, according to Internet Archive founder Brewster Kahle.
The Wayback Machine has let people view past versions of websites for the past 15 years, but searchers have always needed to know a site’s URL to find its archived copies. Soon, however, you’ll be able to use keyword searches to find old websites – in fact, you can already test it out through a public beta.
The new search feature is not quite like Google, where all the text on every page of a website is indexed to help with searches. The Wayback Machine feature lets you search for an archived website’s main page, but it cannot search for specific pages within that site. Once you reach the main page, though, you can navigate around the old website from there.
“There’s a billion” websites, Kahle explained, “and that’s all we can get to work.”
But the feature is still a big step in terms of usability. Kahle said that he, too, has struggled with the lack of keyword search in the Wayback Machine. “I use the Wayback Machine mostly by going to Google, finding a URL and then going to the Wayback to find previous versions of it,” he said. “But what if the site is completely gone?”
The ever-changing online ecosystem is partly to blame for all of that hoop jumping: The Internet’s history is filled with the digital corpses of entire sites that are hard, if not impossible, to find through Google.
Take Geocities. It was once a hugely popular hosting service that was acquired by Yahoo but is now defunct everywhere except Japan. Today, the first Google search result for “geocities” is a Wikipedia page about the service, followed by stories about its demise, but not the actual geocities.com URL, which currently redirects to a page about Yahoo services for small businesses.
But if you search for “geocities” in the test version of the Wayback Machine, the first result is geocities.com. That result shows the tool captured over 200 million copies of some 37 million pages that were once hosted on the domain.
And that’s the real power of the Wayback Machine’s new search: It makes it easier for us to see how the Web used to look.