Serving static site content from inside a MediaData ZIP-file

An editor needed to regularly upload static site output to their CMS 12 media system. Here's one idea I explored.

The output was delivered as a ZIP-file, so the plan was to only have to upload that single file.

The structure inside the archive looked like this, with most files removed for brevity:

  • ./content
    • assets
      • yBWn2-
        • yBWn2_small.png
    • lib
      • player-0.0.11.min.js
      • site-e34Ade.min.css
    • index.html

Once that file is uploaded to a media system folder, the editor references it on a page of a new page type.

First I added a separate media data type for ZIP-files.

[ContentType(GUID = "3ca219e6-10ed-4014-88d1-a884f009a571")]
[MediaDescriptor(ExtensionString = "zip")]
public class ZipFile : MediaData
{
}

Then the new page type that I named HostPageType.

[AllowedTypes(new[] { typeof(ZipFile) })]
public virtual ContentReference ZipFileReference { get; set; }

Next, a partial router that "ignores" the remaining path, so that all trailing path variations are handled by the HostPageController's Index() method.

public class HostPagePartialRouter : IPartialRouter<HostPageType, HostPageType>
{
    public object RoutePartial(HostPageType content, UrlResolverContext segmentContext)
    {
        segmentContext.RemainingSegments = ReadOnlyMemory<char>.Empty;
        return content;
    }

    public PartialRouteData GetPartialVirtualPath(HostPageType content, UrlGeneratorContext urlGeneratorContext)
    {
        return null;
    }
}
..
// Don't forget to register it in your startup
services.AddSingleton<IPartialRouter, HostPagePartialRouter>();

Then the controller does the work. It reads the requested path, matches it against the entries in the archive, and streams the content from inside the ZIP-file.

Having written my own static site generator, I had already dealt with things like looking up MIME types.

using System;
using System.IO;
using System.IO.Compression;
using EPiServer;
using EPiServer.Core;
using EPiServer.Web.Mvc;
using EPiServer.Web.Routing;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.StaticFiles;
using Microsoft.Net.Http.Headers;
..
[HttpGet]
public ActionResult Index(HostPageType currentPage)
{
    if(ContentReference.IsNullOrEmpty(currentPage.ZipFileReference))
    {
        return this.Content("ZIP-file is missing.", "text/plain");
    }

    var url = this.urlResolver.GetUrl(currentPage.ContentLink);
    var pageUrl = url.Contains("://", StringComparison.Ordinal) ? new Uri(url).AbsolutePath : url;
    var currentUrl = this.Request.Path.Value ?? string.Empty;

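    // Redirect to the canonical page URL so the request path always ends with /
    if(currentUrl.Equals(pageUrl.TrimEnd('/'), StringComparison.OrdinalIgnoreCase))
    {
        return this.RedirectPermanent(pageUrl);
    }

    // Strip only the leading page URL to get the path inside the archive
    var remainingPath = currentUrl.StartsWith(pageUrl, StringComparison.OrdinalIgnoreCase)
        ? currentUrl.Substring(pageUrl.Length)
        : currentUrl;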
    const string ContentFolderPath = "content/";

    var zipFileContent = this.contentLoader.Get<Shared.Media.ZipFile>(currentPage.ZipFileReference);
    var blob = zipFileContent.BinaryData;
    using var stream = blob.OpenRead();
    using var archive = new ZipArchive(stream, ZipArchiveMode.Read);

    foreach (var entry in archive.Entries)
    {
        var nameWithOutContentPrefix = entry.FullName.StartsWith(ContentFolderPath, StringComparison.OrdinalIgnoreCase)
            ? entry.FullName.Substring(ContentFolderPath.Length)
            : entry.FullName;

        // Handle HTML files separately; an empty path maps to index.html
        if((string.IsNullOrEmpty(remainingPath)
           && nameWithOutContentPrefix.Equals("index.html", StringComparison.Ordinal))
           || (nameWithOutContentPrefix.EndsWith(".html", StringComparison.OrdinalIgnoreCase)
                && nameWithOutContentPrefix.Equals(remainingPath, StringComparison.Ordinal)))
        {
            using var sr = new StreamReader(entry.Open());
            var html = sr.ReadToEnd();
            
            // Strip the obsolete html5shiv script reference; it is no longer needed
            html = html
                .Replace(
                    """<script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>""",
                    string.Empty,
                    StringComparison.OrdinalIgnoreCase);

            return this.Content(html, "text/html");
        }

        // File types other than .html
        if (nameWithOutContentPrefix.Equals(remainingPath, StringComparison.Ordinal))
        {
            var contentTypeProvider = new FileExtensionContentTypeProvider();
            contentTypeProvider.TryGetContentType(nameWithOutContentPrefix, out var contentType);

            using var entryStream = entry.Open();
            var ms = new MemoryStream();
            entryStream.CopyTo(ms);
            ms.Position = 0;

            this.Response.Headers[HeaderNames.CacheControl] = "public, max-age=3600";

            return new FileStreamResult(ms, contentType ?? "text/plain");
        }
    }

    // No entry in the archive matched the requested path
    return this.NotFound();
}
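
As a variation on the loop above, an entry can also be looked up directly by name with ZipArchive.GetEntry. This is only a minimal sketch, assuming every file lives under the content/ prefix and that an exact, case-sensitive name match is acceptable. ZipEntryLookup, FindEntry and BuildSampleArchive are hypothetical names made up for illustration; they are not part of the controller above.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZipEntryLookup
{
    private const string ContentFolderPath = "content/";

    // Hypothetical helper: maps the remaining URL path to an archive entry.
    // An empty path falls back to index.html, like the controller does.
    public static ZipArchiveEntry? FindEntry(ZipArchive archive, string remainingPath)
    {
        var relativePath = string.IsNullOrEmpty(remainingPath) ? "index.html" : remainingPath;

        // GetEntry returns null when no entry has that exact name.
        return archive.GetEntry(ContentFolderPath + relativePath);
    }

    // Builds a tiny in-memory archive so the sketch can be tried without a CMS.
    public static MemoryStream BuildSampleArchive()
    {
        var ms = new MemoryStream();
        using (var archive = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
        {
            WriteEntry(archive, "content/index.html", "<h1>Hello</h1>");
            WriteEntry(archive, "content/lib/site-e34Ade.min.css", "body{}");
        }

        ms.Position = 0;
        return ms;
    }

    private static void WriteEntry(ZipArchive archive, string name, string text)
    {
        // Only one entry can be open for writing at a time in Create mode,
        // so the writer is disposed before the next entry is created.
        using var writer = new StreamWriter(archive.CreateEntry(name).Open());
        writer.Write(text);
    }
}
```

Calling FindEntry(archive, "lib/site-e34Ade.min.css") on the sample archive returns that entry, while a path that is not in the archive returns null, which maps naturally to a 404 response.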

Now the index.html from the ZIP's content folder is returned as the host page's HTML response, and since all the paths in the static site are relative, every resource loads.
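
The relative paths only resolve below the host page because the controller redirects to the URL that ends with a slash. The sketch below illustrates that with System.Uri, which resolves relative references according to the same RFC 3986 rules browsers follow. The example.com URLs and the RelativeUrlDemo class are made up for illustration.

```csharp
using System;

public static class RelativeUrlDemo
{
    // Resolves a relative reference against the page URL, as a browser would.
    public static string Resolve(string pageUrl, string relativePath)
        => new Uri(new Uri(pageUrl), relativePath).AbsoluteUri;
}
```

With the trailing slash, Resolve("https://example.com/host-page/", "lib/site-e34Ade.min.css") gives https://example.com/host-page/lib/site-e34Ade.min.css, which the partial router sends to the controller. Without it, the last segment is replaced and the result is https://example.com/lib/site-e34Ade.min.css, which never reaches the host page.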

For now this is just a proof of concept. Taking it further probably means adding some caching or another kind of protection against opening too many handles on the ZIP-file.
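
One possible shape for such caching is sketched below, under stated assumptions: extracted entries are kept in memory as byte arrays, so the ZIP-file is only opened on a cache miss. ZipEntryCache, GetOrRead and the cacheKey parameter are hypothetical names. The cache is unbounded and never invalidated, so in a real site something like IMemoryCache with size limits and eviction when the media item changes would probably be a better fit.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.IO.Compression;

public sealed class ZipEntryCache
{
    private readonly ConcurrentDictionary<string, byte[]> entries = new();

    // cacheKey is assumed to identify one version of the ZIP-file,
    // for example the content link combined with its saved date.
    public byte[]? GetOrRead(string cacheKey, string entryName, Func<Stream> openZip)
    {
        var key = cacheKey + "|" + entryName;
        if (this.entries.TryGetValue(key, out var cached))
        {
            return cached;
        }

        // Cache miss: open the archive once and copy the entry into memory.
        using var stream = openZip();
        using var archive = new ZipArchive(stream, ZipArchiveMode.Read);
        var entry = archive.GetEntry(entryName);
        if (entry is null)
        {
            // Misses are not cached, so unknown URLs cannot grow the cache.
            return null;
        }

        using var entryStream = entry.Open();
        using var buffer = new MemoryStream();
        entryStream.CopyTo(buffer);

        var bytes = buffer.ToArray();
        this.entries[key] = bytes;
        return bytes;
    }
}
```

In the controller, openZip could be a lambda around blob.OpenRead(), and the returned bytes could be wrapped in a FileContentResult. Two concurrent requests for the same uncached entry may both open the archive; for a sketch that trade-off keeps the code free of locks.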

You could also add something like HtmlAgilityPack and process the HTML in a better way.

Update 16 November 2023

Someone left a comment suggesting I look at Deane Barker's Content Cloud Response Providers repository.

I had missed that project, but it looks well planned and interesting.

It is something to evaluate and compare with my more modest setup if you face the same use case.


Published and tagged with these categories: Optimizely, CMS, ASP.NET