J. R. Boynton

Content Management Systems

Abstract: Content management systems are key to running an efficient website. Keep the development group out of the loop on updating content, and you will move ever so much faster. “Content” doesn’t need the same kind of source control that scripts and templates need. Specific design suggestions follow.

First: you must have a content management system (CMS). Second: use your own technology. If you use ASP or JSP or Cold Fusion, build your CMS in that. Same if you use Java, Python, whatever.

Why use your own site technology? Because you want to minimize the number of skillsets (teams) required to operate your business. For each technology, you probably need three people: what if one quits, and one gets sick? You can live in fear, of course.... A CMS in your site technology also gives you a place to train new people: new features in a CMS are probably allowed to take longer than new features on the website, so there’s time for a new employee to learn, and then get it right before letting people use it. On your external website, you probably have to get it right the first time.

What is “Content”?

For this essay, I’m really considering content that flows through the site independently of the development group... that doesn’t affect templates, scripts, global navigation, etc. Think of a newspaper site. The articles and article-related images are “content” by this definition, while the logo and images related to navigation are different: if the newspaper site gets a new “look and feel” the content will stay the same, but many images and even some text will change. This essay is about the content that wouldn’t be affected by look and feel changes.

Approaches to Content Management “Solutions”

Old-fashioned content management software companies that sold huge systems to giant corporations are still in the game. Documentum comes to mind. I’m not entirely sure Arbor Text falls into that category. My general impression is that these systems are very powerful, but far more than you need for a website, and probably too slow to develop for if you want to keep up with the internet pace.

Source control systems are more than you need for content management. Visual SourceSafe, CVS, TeamSite, ClearCase are all very fine when you have lots of programmers working on code with interdependencies, but the content creators won’t appreciate the overhead, and I don’t think the overhead buys you anything, relative to a much simpler system. Overhead: every writer/editor has to learn to use and understand the source control system. Most people won’t like to use a command line interface. There’s also the “per seat” cost of the big systems. Maybe everyone in your company needs a “seat”. The price can be very high. Or you exclude as many people as possible from the system, which is also not so good.

There are some very expensive, web-oriented content/commerce site development tools: Vignette, BroadVision, OpenMarket. As the Forrester Group said, these are “immature” products. A mature product does what it does really well, and does it reliably. CVS, the free source control system, is like that. It works. Vignette kind of does what you want it to do, and it usually works. To me, “usually works” means just about the same as “doesn’t work”. I want software for which “works” means that it always works.

The September, 2000 Seybold Seminar in San Francisco was littered with content management system software companies with software that’s cheaper than Vignette and that category. As far as I know, any of them would be fine, but they cost money and require resources to customize.

I, of course, am arguing for the build-your-own approach. Content management systems don’t need to be very complicated. It helps if you can evolve them quickly. They actually are the right scale of project for all these server-side scripting environments that mix code and html.

The thing is, you must have a CMS, and you must customize it before you start, and maintain it over time.

CMS design

On micromanagement.... “Workflow” systems promise to send the user an email whenever he/she needs to process some item in the system. That’s fine in a demo, but how much email do you want people to get? You get to choose, of course, but I would encourage people to start with less structure, rather than more.

Once assigned a task, a user needs to access the right piece of content, whether it’s a file or stored in the database. A CMS web page either lists all available content, allows a search, or lets the user see which items are already assigned to him/her (“checked out”).

Another CMS page shows the details for that content. Also use the CMS web forms to collect “metadata” such as keywords, index tokens, a short description, etc.

Once an item is checked out, the CMS can simply copy a file to the user’s directory. When the user is done with it, remove it from that directory, and put it in a “waiting for approval” space (or just flag it in the database). Before check-in, it would be convenient if the user could save versions of the file, in order to revert to an older version if there’s a mistake or a change in the assignment.
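Here is a minimal sketch of that check-out cycle in Python, just to show how little machinery it needs. The directory layout and the function names (check_out, save_version, revert, check_in) are my own assumptions for illustration, not a prescribed design; a real CMS would also record each step in the database.

```python
import shutil
from pathlib import Path


def check_out(repo: Path, workspace: Path, user: str, name: str) -> Path:
    """Copy an item from the content store into the user's own directory."""
    user_dir = workspace / user
    user_dir.mkdir(parents=True, exist_ok=True)
    working_copy = user_dir / name
    shutil.copy2(repo / name, working_copy)
    return working_copy


def save_version(working_copy: Path) -> Path:
    """Keep a numbered snapshot so the user can revert before check-in."""
    # Hypothetical layout: <user dir>/.versions/<file name>/0001, 0002, ...
    versions = working_copy.parent / ".versions" / working_copy.name
    versions.mkdir(parents=True, exist_ok=True)
    number = len(list(versions.iterdir())) + 1
    snapshot = versions / f"{number:04d}"
    shutil.copy2(working_copy, snapshot)
    return snapshot


def revert(working_copy: Path, snapshot: Path) -> None:
    """Replace the working copy with an earlier snapshot."""
    shutil.copy2(snapshot, working_copy)


def check_in(working_copy: Path, pending: Path) -> Path:
    """Move the file out of the user's directory into the approval queue."""
    pending.mkdir(parents=True, exist_ok=True)
    target = pending / working_copy.name
    shutil.move(str(working_copy), str(target))
    return target
```

A writer’s session would call check_out once, save_version as often as they like, and check_in when the piece is ready for the editor.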

Content could include images. If so, you should really store the source file (from Photoshop, for example) as well as the generated image that might be a GIF or JPEG.

The CMS can do significant document processing. For example, if the source document is a Word file, the CMS could convert the Word document to XML markup, validate the XML, convert the XML to HTML, and display the HTML to preview.
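The last three steps of that chain can be sketched with nothing but the Python standard library. I am assuming a made-up XML vocabulary here (an article element holding one title and any number of para elements), and the Word-to-XML step is left out because it depends on whatever converter you choose. The “validation” is only a well-formedness check plus one structural test, not validation against a DTD or schema.

```python
import html
import xml.etree.ElementTree as ET


def parse_article(xml_text: str) -> ET.Element:
    """Check the XML is well formed and has the shape we expect."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if malformed
    # Hypothetical vocabulary: <article><title>..</title><para>..</para></article>
    if root.tag != "article" or root.find("title") is None:
        raise ValueError("expected an <article> element containing a <title>")
    return root


def article_to_html(root: ET.Element) -> str:
    """Convert the parsed article to a simple HTML fragment."""
    title = html.escape(root.findtext("title", default=""))
    paragraphs = [
        f"<p>{html.escape(para.text or '')}</p>" for para in root.findall("para")
    ]
    return "\n".join([f"<h1>{title}</h1>", *paragraphs])


def preview(xml_text: str) -> str:
    """Validate, then convert, so a bad document never reaches the preview."""
    return article_to_html(parse_article(xml_text))
```

The point of running the check first is that the writer sees the error at preview time, not after the article is live.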

Disk space is now cheap. Save all approved versions. There’s no reason to do CVS’s trick to keep many versions of a document in the same source file. The CMS can also use the database to keep track of the history of changes: who did what, when, and why.
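A history table like that is small. The sketch below uses SQLite through Python’s sqlite3 module; the table name, the column names, and the record and history_for helpers are assumptions made up for this example, and your own database will do just as well.

```python
import sqlite3
from datetime import datetime, timezone


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the change-history table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS history (
               id INTEGER PRIMARY KEY,
               item TEXT NOT NULL,         -- what was changed
               seq INTEGER NOT NULL,       -- per-item sequence number
               who TEXT NOT NULL,          -- who did it
               action TEXT NOT NULL,       -- what they did
               reason TEXT NOT NULL,       -- why
               happened_at TEXT NOT NULL   -- when (UTC, ISO 8601)
           )"""
    )
    return conn


def record(conn: sqlite3.Connection, item: str, who: str,
           action: str, reason: str) -> int:
    """Append one history row and return its per-item sequence number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) FROM history WHERE item = ?", (item,)
    ).fetchone()
    seq = row[0] + 1
    conn.execute(
        "INSERT INTO history (item, seq, who, action, reason, happened_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (item, seq, who, action, reason,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return seq


def history_for(conn: sqlite3.Connection, item: str) -> list:
    """Return (seq, who, action, reason) rows for one item, oldest first."""
    return conn.execute(
        "SELECT seq, who, action, reason FROM history "
        "WHERE item = ? ORDER BY seq",
        (item,),
    ).fetchall()
```

If each approved version is saved as its own file, the seq value is a natural suffix for the file name.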

Typically there is an approval process. The writer is done. The editor reviews it. Perhaps the writer works on it some more.

If someone else needs to make a change to the file before it has been approved, they should get the waiting-for-approval version. You don’t want a situation where one user’s changes have to be abandoned or approved before another user can work on the file. It’s not at all uncommon to have a team working on the same content. One way to allow this, though it is rather dangerous, is just to allow anyone to work on a file in someone else’s directory. As long as that is rare, and they check with the “owner” before doing it, it’s a fairly simple approach to a complicated problem. The danger comes when they do it frequently, or when they don’t check with the owner first.

Another aspect of the CMS may be to control where the content appears on the website. A CMS page could allow you to control the order of articles that are listed in your news section, for example. It requires more attention, but it may be worth it. Alternatively, you could just have rules that the most recent article is listed at the top, and the oldest gets moved to an archive.
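The rule-based alternative is only a few lines. In this sketch the Article class, the split_front_page function, and the limit parameter are names I made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Article:
    title: str
    published: date


def split_front_page(
    articles: List[Article], limit: int
) -> Tuple[List[Article], List[Article]]:
    """Newest first; anything past the limit goes to the archive."""
    ordered = sorted(articles, key=lambda a: a.published, reverse=True)
    return ordered[:limit], ordered[limit:]
```

A hand-ordered news section would replace the sort key with a position stored by the CMS page, and everything else stays the same.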

Integrating with Source Control

There are several possibilities here. If you use ClearCase or some other very expensive source control package, you probably couldn’t separate the content, even if you wanted to. With TeamSite, it may be that you could create a slightly separate interface to the content, and then you might want to use TeamSite, rather than building your own. If you already hired someone to maintain the TeamSite system, why not let that person maintain the content management system, as well as the source control system?

The real issue is when there are important dependencies between the content and the templates, scripts, etc. Say you have a whole new look and feel that is ready to go, and when it launches, you want several new articles to launch at the same time.

The first answer is to make this kind of dependency rare. After all, that is this essay’s definition of content.

The second answer is that it would be very convenient if the same launch process moves content and everything else to the live servers. For example, if you have a staging server and an internally accessible server that matches the live servers, a simple script can identify the files that are different between the two, and move any differences all at once.
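A sketch of such a launch script follows. It assumes both servers are visible as plain directory trees, and the function names changed_files and push are mine. It compares file contents byte for byte, and it deliberately ignores files that exist only on the destination, so deletions would need separate handling.

```python
import filecmp
import shutil
from pathlib import Path
from typing import List


def changed_files(staging: Path, mirror: Path) -> List[Path]:
    """Relative paths of files that are new or different on staging."""
    changed = []
    for source in sorted(staging.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(staging)
        target = mirror / relative
        # shallow=False compares contents, not just size and timestamp.
        if not target.is_file() or not filecmp.cmp(source, target, shallow=False):
            changed.append(relative)
    return changed


def push(staging: Path, mirror: Path) -> List[Path]:
    """Copy every new or changed file across in one pass."""
    changed = changed_files(staging, mirror)
    for relative in changed:
        target = mirror / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(staging / relative, target)
    return changed
```

Because the script does not care whether a file is an article or a template, content and everything else move together.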

The third answer is even more basic. Coordinate. Launch the new content, then launch the other files.

If you are concerned about the time between the two launches, you need a pretty fancy launch system. (I’d say move the entire directory structure of the updated site to the server, parallel to the existing site. Then change the directories the web server points to and restart the server. Instant update.)
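One common way to get that kind of switch, which is my substitution and not the only option, is to point the web server at a symbolic link and repoint the link when the new tree is in place. The sketch assumes a POSIX filesystem, where renaming one link over another is atomic; switch_release and the “.tmp” suffix are hypothetical names. Whether the web server still needs a restart depends on the server.

```python
import os
from pathlib import Path


def switch_release(current_link: Path, new_release: Path) -> None:
    """Repoint the 'current' symlink at a new release directory."""
    temp_link = current_link.with_name(current_link.name + ".tmp")
    # Clear out a leftover temporary link from an interrupted run.
    if temp_link.is_symlink() or temp_link.exists():
        temp_link.unlink()
    os.symlink(new_release, temp_link, target_is_directory=True)
    # Assumption: POSIX rename semantics, so the old link is replaced
    # in one step and a request never sees a missing directory.
    os.replace(temp_link, current_link)
```

The old directory is left untouched, so going back is the same call with the previous release path.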

Caveat

A CMS needs user management and authentication. Who can do what, to what, and to whom.... I contend that this is the most difficult aspect of building and integrating web applications. It’s non-trivial to develop and maintain, though there are plenty of options. You have to decide, of course, how secure you need the CMS to be. Depending on security and how easy you want maintaining users to be, this could be a small or large project. For many sites, the external website has no equivalent user management for its internet users that the CMS could borrow.

Thus, for many sites, developing user management and authentication will be extra overhead on top of creating their own CMS.

But even if it is non-trivial, it is not a huge project, and it isn’t terribly difficult. And we all know pretty much what it should do. So I would say go ahead and build it.
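To give a sense of scale, the “who can do what, to what” part can start as a lookup table. The roles, actions, and section names below are invented for the example, and authentication (proving who the user is) is a separate problem that this sketch does not touch.

```python
from typing import Dict, Set

# Hypothetical roles and the actions each one allows.
ROLE_ACTIONS: Dict[str, Set[str]] = {
    "writer": {"check_out", "check_in"},
    "editor": {"check_out", "check_in", "approve"},
}

# grants[user][section] = role the user holds in that section.
Grants = Dict[str, Dict[str, str]]


def can(grants: Grants, user: str, action: str, section: str) -> bool:
    """True if the user's role in this section allows the action."""
    role = grants.get(user, {}).get(section)
    return action in ROLE_ACTIONS.get(role, set())
```

For example, with grants = {"pat": {"news": "writer"}}, can(grants, "pat", "approve", "news") is False, while the same call for “check_in” is True.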




Copyright © 1998-2011 J. R. Boynton