UI form component reference extractors
The system tracks usage of content items to help editors determine the impact of modifying content items on the published data. The system tracks all content item references created using the default system form components (combined content selector, rich text editor). For all custom UI form components that contain links to content item assets or references to content items (e.g., a custom URL selector, a custom rich text inline editor), you need to implement a custom reference extractor.
Implement reference extractors
Implement a custom class based on where you want to extract references from:
Page Builder component properties extractors
To extract references from Page Builder component properties (widget, section, and page template properties), implement the IContentItemReferenceExtractor
interface in a custom class. Within the Extract
method, process the content of the property and return a collection of ContentItemReference
objects, which contain GUID Identifier
of content items that are linked in the property.
You need to register the extractor and assign the extractor to all properties where references to content items were created using custom form components.
Content type fields extractors
To extract references from content type fields, implement the IFormFieldContentItemReferenceExtractor
interface in a custom class. The interface requires you to implement the following methods:
Extract
– processes the content of the field and returns a collection ofContentItemReference
objects, which contain GUIDIdentifier
of content items that are linked in the field.CanExtractReferences
– returns a bool value that specifies whether the extractor supports the field type specified in theFormFieldInfo
parameter. To determine whether a field contains references that can be extracted, use the field’s data type (fieldInfo.DataType
property) and the form control used by the field (fieldInfo.GetComponentName()
method).
You need to register the extractor. After any content item is modified, content type fields of the item are checked against all registered IFormFieldContentItemReferenceExtractor
implementations and their CanExtractReferences
methods to determine if references need to be extracted from said fields and the extraction is performed using the Extract
method of the suitable extractor.
The following example shows the implementation of the default reference extractor used by the system to extract references from rich text fields:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;
using CMS;
using CMS.ContentEngine;
using CMS.Core;
using CMS.FormEngine;
...
public class ContentItemReferenceExtractor : IFormFieldContentItemReferenceExtractor
{
private const string CONTENT_ITEM_GROUP_NAME = "ContentItemGuid";
private const string GUID_REGEX = @"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}";
private readonly HtmlParser htmlParser;
private readonly Regex contentItemLinkRegex = new Regex(@$"getContentAsset\/(?<{CONTENT_ITEM_GROUP_NAME}>{GUID_REGEX})\/{GUID_REGEX}\/", RegexOptions.Compiled | RegexOptions.IgnoreCase);
public ContentItemReferenceExtractor()
{
htmlParser = new HtmlParser();
}
// Returns whether the field in the parameter is supported by this extractor
public bool CanExtractReferences(FormFieldInfo field)
{
return string.Equals(field.DataType, FieldDataType.RichTextHTML, StringComparison.InvariantCultureIgnoreCase) ||
(string.Equals(field.GetComponentName(), RichTextEditorConstants.IDENTIFIER, StringComparison.OrdinalIgnoreCase) &&
(field.DataType.Equals(FieldDataType.Text, StringComparison.Ordinal) || field.DataType.Equals(FieldDataType.LongText, StringComparison.Ordinal)));
}
// Returns a collection of reference object found in the field
public IEnumerable<ContentItemReference> Extract(object fieldValue)
{
if (fieldValue is not string html)
{
return Enumerable.Empty<ContentItemReference>();
}
var referencesGuids = new HashSet<Guid>();
var document = htmlParser.ParseDocument(html);
referencesGuids.UnionWith(document.Images.Where(image => contentItemLinkRegex.IsMatch(image.Source))
.Select(image => GetContentItemReferenceGuidFromLink(image.Source)));
var anchors = document.QuerySelectorAll("a").Select(anchors => anchors as IHtmlAnchorElement);
referencesGuids.UnionWith(anchors.Where(anchor => contentItemLinkRegex.IsMatch(anchor.Href))
.Select(anchor => GetContentItemReferenceGuidFromLink(anchor.Href)));
return referencesGuids.Select(guid => new ContentItemReference { Identifier = guid });
}
// Converts plaintext links to GUIDs
private Guid GetContentItemReferenceGuidFromLink(string source)
{
return Guid.Parse(contentItemLinkRegex.Match(source).Groups[CONTENT_ITEM_GROUP_NAME].Value);
}
}
Universal extractors
It is generally recommended to use individual extractors for distinct fields or properties. However, as the IFormFieldContentItemReferenceExtractor
inherits from the IContentItemReferenceExtractor
, you can create a universal extractor for all fields and properties where the references are stored in the same format.
Register reference extractors
To register a reference extractor, use the RegisterImplementation
assembly attribute. This attribute ensures the reference extractor is recognized by the system.
[assembly: RegisterImplementation(typeof(IFormFieldContentItemReferenceExtractor),
typeof(ContentItemReferenceExtractor),
Lifestyle = Lifestyle.Singleton,
Priority = RegistrationPriority.Default)]
Assign reference extractors to Page Builder component properties
After you implement and register an instance of the IContentItemReferenceExtractor
, you need to assign it to all Page Builder component properties (widget, section, and page template properties) that may contain references to content items. Use the TrackContentItemReference
attribute and specify the type of your extractor as a parameter:
public class MyCustomWidgetProperties : IWidgetProperties
{
[MyCustomEditingComponent]
[TrackContentItemReference(typeof(MyCustomExtractor))]
public string PropertyName { get; set; }
}