Selecting a content archival approach

Storing a large number of pages in the content tree can significantly reduce website performance. To keep the content tree small, you can set up archival of outdated pages.

We recommend creating an Archival section in the content tree for most projects, including projects with large numbers of pages (hundreds of thousands). For extremely content-heavy projects, consider using module classes or custom storage for archived content.

Archival section in the content tree

You can create a special section in the content tree and move outdated pages into it. An even better approach is to store pages in a structured way from the start, keep the archived pages in their original location, and create new sections for new content. For example, you can structure your site’s content tree by years.

When using this approach, the listings you use to display published pages need to be configured to only load data from specific sub-sections of the content tree. For example, in the following structure:

  • Articles
    • 2020
      • Months
        • Article 1
        • Article 2
    • 2019
    • 2018

Instead of displaying all the years in a single listing, you can set up your listings to only load pages stored under a specific year.
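The following minimal Python sketch illustrates the idea of limiting a listing to one sub-section of the content tree. It is not the system's API: the `Page` class, the `pages_under` function, and the sample paths are hypothetical stand-ins that only model filtering by a path prefix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    # Hypothetical stand-in for a page in the content tree.
    path: str   # for example "/Articles/2020/January/Article-1"
    title: str


def pages_under(pages, section_path):
    """Return only the pages stored below the given section path."""
    # The trailing slash prevents "/Articles/20" from matching "/Articles/2020".
    prefix = section_path.rstrip("/") + "/"
    return [p for p in pages if p.path.startswith(prefix)]


content_tree = [
    Page("/Articles/2020/January/Article-1", "Article 1"),
    Page("/Articles/2020/January/Article-2", "Article 2"),
    Page("/Articles/2019/March/Old-article", "Old article"),
    Page("/Articles/2018/May/Older-article", "Older article"),
]

# The listing for current content only covers the 2020 section.
current = pages_under(content_tree, "/Articles/2020")
print([p.title for p in current])  # ['Article 1', 'Article 2']
```

In a real project, the same restriction would typically be expressed as a path condition in the listing's configuration or query, so that pages from archived years are never loaded.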

Advantages

  • You can easily display the archived pages on the live site.
  • The archived pages can retain all their data, including the editing history.
  • You can easily restore archived pages.

Disadvantages

  • The archived pages are still part of the content tree. This approach can still have a performance impact.

Note: Moving very large sections of pages in the content tree is a performance-demanding task. Do not perform such operations during your website’s peak hours.

Module classes

When pages become outdated, you can move their data into module classes, and then delete the pages from the content tree.
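As a rough illustration of this copy-then-delete flow, the following Python sketch uses an in-memory SQLite database. The `content_tree` and `article_archive` tables, their columns, and the `archive_pages` function are assumptions made for the example and do not reflect the system's actual database schema or module class API. The copy and the delete run in one transaction, so a failure leaves the pages in place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: one stands in for the content tree,
    -- the other for a module class that holds archived articles.
    CREATE TABLE content_tree (path TEXT PRIMARY KEY, title TEXT, body TEXT);
    CREATE TABLE article_archive (original_path TEXT PRIMARY KEY, title TEXT, body TEXT);
    INSERT INTO content_tree VALUES
        ('/Articles/2018/Old-article', 'Old article', 'Text from 2018'),
        ('/Articles/2020/New-article', 'New article', 'Text from 2020');
""")


def archive_pages(conn, path_prefix):
    """Copy pages under path_prefix to the archive table, then delete them."""
    pattern = path_prefix.rstrip("/") + "/%"
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO article_archive (original_path, title, body) "
            "SELECT path, title, body FROM content_tree WHERE path LIKE ?",
            (pattern,),
        )
        deleted = conn.execute(
            "DELETE FROM content_tree WHERE path LIKE ?", (pattern,)
        ).rowcount
    return deleted


archived_count = archive_pages(conn, "/Articles/2018")
print(archived_count)
```

Only the fields copied into the archive table survive, which is why related objects need their own archive structures.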

Advantages

  • The archived pages are completely separated from the content tree.
  • You can view the archived data via the administration interface.

Disadvantages

  • To be able to store all the related objects, you have to replicate their structure in module classes as well.
  • Editing history is lost.
  • The archived pages cannot be easily restored to the content tree.

Custom storage

You can use the API to retrieve outdated pages and their related objects and store their data in some type of custom storage. For example, you can serialize the data into XML files.
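The following Python sketch shows one possible way to serialize a page and its editing history into XML and read it back. The page is modeled as a plain dictionary, and the element names (`page`, `fields`, `history`, `version`) and the functions `page_to_xml` and `page_from_xml` are assumptions made for illustration. In a real project, you would first load the page data through the system's API.

```python
import xml.etree.ElementTree as ET


def page_to_xml(page):
    """Serialize a page dict (fields and version history) to an XML string."""
    root = ET.Element("page", path=page["path"])
    fields = ET.SubElement(root, "fields")
    for name, value in page["fields"].items():
        ET.SubElement(fields, "field", name=name).text = value
    history = ET.SubElement(root, "history")
    for version in page["history"]:
        node = ET.SubElement(history, "version", number=version["number"])
        node.text = version["comment"]
    return ET.tostring(root, encoding="unicode")


def page_from_xml(xml_text):
    """Rebuild the page dict from XML produced by page_to_xml."""
    root = ET.fromstring(xml_text)
    return {
        "path": root.get("path"),
        "fields": {f.get("name"): f.text for f in root.find("fields")},
        "history": [
            {"number": v.get("number"), "comment": v.text}
            for v in root.find("history")
        ],
    }


# Sample data; all values are non-empty strings, because empty element
# text is read back as None rather than "".
page = {
    "path": "/Articles/2018/Old-article",
    "fields": {"Title": "Old article", "Summary": "News from 2018 & earlier"},
    "history": [
        {"number": "1", "comment": "Initial version"},
        {"number": "2", "comment": "Fixed typos"},
    ],
}

xml_text = page_to_xml(page)
restored = page_from_xml(xml_text)
print(restored == page)
```

Because the archive keeps the fields and the history together in one file, restoring a page means reading the file and recreating the page from the recovered data.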

Advantages

  • The archived pages are completely separated from the content tree.
  • The archived pages can retain all their data, including the editing history.
  • You can restore the archived pages.

Disadvantages

  • You cannot, by default, view the archived data via the administration interface.