Sunday, July 29, 2007
I had a requirement to import about 1800 pages from our existing intranet into a new MOSS-based site. Here are a few resources I used, which might be useful if you're doing something similar.
- Programmatically Adding Pages to a MOSS Publishing Site
I decided to put together a tool that walks an XML file, exported from the existing site, that describes the structure of the content. As it goes, the tool creates MOSS sites (from custom site definitions) and imports pages. At its core is code based on Andrew Connell's post, "Programmatically adding pages to a MOSS Publishing site".
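For illustration, here is a minimal sketch of the page-creation step using the MOSS publishing object model. It is not the tool's actual code: the `PageImporter` class, the `AddPage` method and its parameters are hypothetical names, and it assumes the first available page layout has a Page Content field. It has to run on a server in the farm.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Publishing;

public static class PageImporter
{
    // Hypothetical helper: creates one publishing page in the web at webUrl.
    public static void AddPage(string webUrl, string pageName, string title, string bodyHtml)
    {
        using (SPSite site = new SPSite(webUrl))
        using (SPWeb web = site.OpenWeb())
        {
            if (!PublishingWeb.IsPublishingWeb(web))
                throw new InvalidOperationException("Not a publishing web: " + webUrl);

            PublishingWeb publishingWeb = PublishingWeb.GetPublishingWeb(web);

            // Assumption: the first available layout is suitable. A real import
            // would pick a layout by name or content type.
            PageLayout[] layouts = publishingWeb.GetAvailablePageLayouts();
            if (layouts.Length == 0)
                throw new InvalidOperationException("No page layouts available.");

            // pageName must end in .aspx, e.g. "about-us.aspx".
            PublishingPage page = publishingWeb.GetPublishingPages().Add(pageName, layouts[0]);
            page.Title = title;
            page.ListItem[FieldId.PublishingPageContent] = bodyHtml;
            page.Update();

            // New pages are created checked out; check in so others can see them.
            page.CheckIn("Imported from the existing intranet");
        }
    }
}
```

A tool like the one described would call something like `AddPage` once per page entry while walking the exported XML.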
- Html Agility Pack - A .NET Html Parser
To ensure that links within our content continued to work, it was necessary to parse each HTML page and pull out the href attribute from each A tag so that the URL could be rewritten. Html Agility Pack is a wonderful .NET library that makes this simple: you load HTML from files, streams or strings and query for nodes using XPath. A simple href rewrite takes only a few lines.
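A minimal sketch of such a rewrite with Html Agility Pack follows. The `LinkRewriter` class and the `mapUrl` callback are hypothetical names used for illustration; how old URLs map to new ones depends entirely on your own site structure.

```csharp
using System;
using HtmlAgilityPack;

public static class LinkRewriter
{
    // Rewrites the href of every <a> element that has one and returns the
    // resulting HTML. mapUrl is a hypothetical callback that turns an old
    // URL into its new location.
    public static string RewriteLinks(string html, Func<string, string> mapUrl)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                string oldUrl = link.GetAttributeValue("href", string.Empty);
                link.SetAttributeValue("href", mapUrl(oldUrl));
            }
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

It could be called as `LinkRewriter.RewriteLinks(html, url => url.Replace("/old/", "/new/"))`, where both path prefixes are placeholders. To work on files instead of strings, `doc.Load(path)` and `doc.Save(path)` can replace `LoadHtml` and `OuterHtml`.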
I ended up using it for a lot more than rewriting links. The import proved the perfect opportunity to change some ids that are better applied as classes, set certain links to open in new windows, and replace tokens used by the existing CMS.
- Automatically Publishing All Items in a Publishing Site
If you're doing anything like this, you'll invariably find yourself with a few hundred pages to check in, publish or approve at some point. Mr. Connell has saved us some work again: his extensions to stsadm.exe make publishing all of your pages as simple as running one command.
Note: There's a bug in the version I downloaded which results in an infinite loop when using the 'includesubsites' option; the solution is documented in the comments of Andrew's post.
- 3rd Party Tools
Depending on your specific situation, it might be appropriate to look at 3rd party tools. Although I've never used it, Metalogix Migration Manager looks to be a comprehensive solution. (Their site is down at the time of posting. In the meantime, you can read more in this post by Stefan Gossner.) If there are any other tools I should mention, leave a comment and I'll add them here.