Web Archiving in SEOnaut

SEOnaut now has web archiving capabilities and can store crawled data in a WACZ file. When archiving is enabled, the raw body and headers of each response are available for review or download.

You’ll also be able to browse the archived websites with the built-in proxy. This allows you to review your site as it was at the time of the crawl, helping you identify changes that affect your SEO efforts.

The WACZ file is available in the export section and you can use it with any other tool that supports the format.
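Because a WACZ file is a ZIP container, you can take a first look at an export with nothing more than a standard library. The following is a minimal Python sketch, not part of SEOnaut: the function name `summarize_wacz` is hypothetical, and it assumes the archive follows the WACZ layout with a `datapackage.json` manifest at the root of the ZIP.

```python
import json
import zipfile


def summarize_wacz(source):
    """Return (member_names, manifest) for a WACZ file.

    `source` may be a filesystem path or a binary file-like object.
    `manifest` is the parsed datapackage.json, or None if it is missing.
    """
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
        manifest = None
        # Assumption: the manifest sits at the ZIP root, as in the WACZ layout.
        if "datapackage.json" in names:
            manifest = json.loads(archive.read("datapackage.json"))
    return names, manifest
```

Calling `summarize_wacz("crawl.wacz")` (the filename here is only an example) would return the list of files inside the export together with its manifest, which is a quick way to confirm that the archive is intact before loading it into another tool.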

What is Web Archival?

Web archival captures and preserves the contents of a website at a specific point in time. Saving a web page may capture only its visible content, whereas web archival includes everything: HTML, CSS, JavaScript, images, and even response headers. The resulting snapshot preserves the site's data as it was served at the time of the crawl.

The web archival community plays a vital role in preserving digital history. Organizations like the Internet Archive capture and store web snapshots. Their work keeps valuable information available for future generations.

How to Use SEOnaut’s Web Archival Capabilities

Enable WACZ archival in the crawler’s options before starting your site crawl. SEOnaut will create an archive snapshot as it crawls and collects your site’s data. The web archive file will be available in your project’s export options.

You can find links to the raw data and the archive browser for each resource in your project’s resource view.

Regular Archiving

Make web archival a regular part of your routine. Archive your website at key milestones, such as after significant updates or before major changes. This will give you a historical record to refer back to.

Analyze Archived Data

Use archived data to conduct historical analysis. Compare past versions of your site to identify which changes affected your performance. This can help you make more informed decisions moving forward.
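As a rough illustration of comparing two snapshots, the Python sketch below classifies URLs as added, removed, or changed between an older and a newer crawl. It is not a SEOnaut feature: the function name `diff_snapshots` is hypothetical, and it assumes you have already extracted the response bodies from each archive into a dictionary that maps URL to raw body bytes.

```python
def diff_snapshots(old, new):
    """Compare two {url: body_bytes} mappings from different crawls.

    Returns a dict with sorted lists of URLs that were added, removed,
    or whose body differs between the two snapshots.
    """
    old_urls, new_urls = set(old), set(new)
    # A URL counts as changed when it exists in both crawls with different bodies.
    changed = sorted(url for url in old_urls & new_urls if old[url] != new[url])
    return {
        "added": sorted(new_urls - old_urls),
        "removed": sorted(old_urls - new_urls),
        "changed": changed,
    }
```

A byte-for-byte comparison like this will also flag cosmetic differences such as timestamps or session tokens embedded in the page, so in practice you may want to normalize the bodies before comparing them.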

Leverage Archived Content

If your live site experiences issues, archived content gives you a reference copy of each page as it was at crawl time, which can help you restore it quickly. This helps keep your content available to users and search engines.