Hello world!

January 29, 2009

OK

so i parked here.

i don’t update this thing at all. it’s just here to make sure someone else doesn’t ‘park’ on top of my username.

the real fun starts here.

that is all.

Emulating Enumerators in Javascript

July 23, 2006

Download this code from my Yahoo! Groups web site.

I’ve been doing a lot of work with JavaScript recently. My current work project has me thinking a lot about ‘advanced’ JavaScript, ‘Object-Oriented’ JavaScript, and things of that nature. And I’ve learned quite a bit! When I shared some of this with my friend Bob of Intensity Software, he encouraged me to blog more about what I’m finding. So I’m starting a set of short pieces on JavaScript that I hope prove interesting.

Enumerators Are Handy

Now that I’m writing more code in JavaScript, I’m finding that I use enumerators to make my code more readable and portable. While MSIE has a built-in Enumerator object, the Mozilla-based browsers do not. So I needed to come up with something that works across browsers. My solution takes advantage of a very powerful feature of JavaScript – associative arrays.

Associative Arrays

Associative arrays are collections that link "keys" to "values." Nothing big there. But the way JavaScript deals with associative arrays is very powerful. You can access members of associative arrays either using the familiar key-index pattern (MyOptions["Yes"]=1) or you can use a property name pattern (MyOptions.Yes=1). It is this second access method that I decided to use for an enumerator implementation.

Defining an Enumerator Object

So, using associative arrays, it’s easy to create an enumerator pattern in JavaScript. Below is a simple code example:

    // define enumerator
    var Role = 
    {
        Visitor     : 0,
        User        : 10,
        Author      : 20,
        Editor      : 30,
        Publisher   : 40,
        Admin       : 100
    }

That’s all there is, really. Now I can write familiar JavaScript to access the members as needed. For example, I can use Role.User to get the value of 10. Of course, I could also use Role["User"] and get the same value. This second version points to a handy way to iterate through an enumeration collection.
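
To make that concrete, here’s a quick snippet showing both access patterns against the Role enumerator defined above:

    // both patterns reach the same member value
    var a = Role.User;      // property-name pattern, returns 10
    var b = Role["User"];   // key-index pattern, also returns 10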

Iterating an Enumerator

Sometimes I want to step through a list of enumerator members. I might want to create a dropdown list to allow users to select a member (there’s a sketch of that below). I might want to use this enumerator pattern to express a list of menu options and then iterate through those options to create a menu of links. There are lots of possibilities. Below is some code to ‘walk’ through an enumerator and create an unordered list to display on a web page.

        var elm = document.getElementById("output");
        var str = "";
        var cnt = 0;

        str = "<ul>";

        for (var r in Role)
            str += "<li>" + r + ": " + Role[r] + "</li>";

        str += "</ul>";

        elm.innerHTML = str;
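
The same approach handles the dropdown idea mentioned above. Here’s a minimal sketch that fills a select list from the enumerator – it assumes a <select> element with the id "roles" already exists on the page:

        // populate a dropdown from the enumerator
        // (assumes <select id="roles"></select> is in the page markup)
        var sel = document.getElementById("roles");

        for (var r in Role)
            sel.options[sel.options.length] = new Option(r, Role[r]);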

Summary

So that’s it. I used JavaScript’s associative arrays pattern to emulate enumerators that can work across all browsers that support JavaScript. I can use typical enumerator access patterns as well as iteration patterns to access the contents as needed. There are lots of other possible ways to use associative arrays within JavaScript. I’ll leave that to you to explore until my next JavaScript post[grin].




XML Namespaces and SQL2005

June 24, 2006

On my last INETA trip (to Plano, TX) I was asked a question about generating XML from SQL2005 that included support for XML namespaces. Unfortunately, I didn’t have my act together at that moment and was not able to show off this cool feature of SQL2005. But it is actually very easy to generate namespace-enabled XML using a new feature of SQL2005 – the XMLNAMESPACES keyword.

The XMLNAMESPACES Keyword

There are a number of powerful new keywords in SQL2005 that make it easier to support XML. One of them is XMLNAMESPACES. As you would expect, this keyword is used to output XML from SQL Server that includes the proper namespace designations.

For example, let’s assume you want to create the following output from the AUTHORS data table from the PUBS database:

<rdf:RDF xmlns:xmlp="http://www.amundsen.com/rdf/xmlp/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <xmlp:id>123-45-6789</xmlp:id>
    <xmlp:firstname>mike</xmlp:firstname>
    <xmlp:lastname>amundsen</xmlp:lastname>
    <xmlp:phone>123-456-7890</xmlp:phone>
    <xmlp:address>123 main st</xmlp:address>
    <xmlp:city>byteville</xmlp:city>
    <xmlp:state>md</xmlp:state>
    <xmlp:zip>94609</xmlp:zip>
  </rdf:Description>
</rdf:RDF>

Creating the query that results in the above output is actually pretty straightforward. I’ll go through the steps below and toss in a few other nice features of SQL2005 along the way.

Step-By-Step XML Namespace Output from SQL2005

All you need to do is to build your query as you normally would. First, the simple SELECT statement to get the data you need:

select * from pubs.dbo.authors

Next you need to tell SQL Server to return an XML version of the data:

select * from pubs.dbo.authors
for xml raw, elements
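
For reference, RAW mode with the ELEMENTS option wraps each row in an element named "row" with one child element per column. Using the sample values from the target output above, the result looks roughly like this (trimmed for space):

<row>
  <au_id>123-45-6789</au_id>
  <au_lname>amundsen</au_lname>
  <au_fname>mike</au_fname>
  <phone>123-456-7890</phone>
  <!-- ...remaining columns... -->
</row>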

The above query is nice, but is lacking a very important item – the root element. This was a classic problem with SQL2000 – so much so that when Microsoft released the SqlXml assembly for .NET, they included a workaround method that allowed programmers to set the root on the client side. However, the release of SQL2005 gave the SQL team a chance to fix this omission. Now, all you need to do is add the ROOT keyword to your query like this:

select * from pubs.dbo.authors
for xml raw, elements, root('RDF')

One more thing. With the above format, each collection of fields is enclosed in an element called "row." Not too creative, and not what we need. You can control the enclosing element name for each row by decorating the RAW keyword with a string name like this:

select * from pubs.dbo.authors
for xml raw('Description'), elements, root('RDF')

So far, so good. We have solid (valid) XML output, but no namespaces yet. Here’s the secret sauce built into SQL2005: you preface the query with a list of namespaces to include as part of the root element. It works like this:

with xmlnamespaces 
(
'http://www.w3.org/1999/02/22-rdf-syntax-ns#' as rdf,
'http://www.amundsen.com/rdf/xmlp/1.0/' as xmlp
)
select * from pubs.dbo.authors
for xml raw('Description'), elements, root('RDF')

Now the output includes xmlns attributes in the root tag. That’s good – we have namespaces now! However, we also need to decorate the various elements in the output with the proper namespace prefixes. That works like this:

with xmlnamespaces 
(
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#' as rdf,
    'http://www.amundsen.com/rdf/xmlp/1.0/' as xmlp
)
select
    au_id as 'xmlp:id', 
    au_fname as 'xmlp:firstname',
    au_lname as 'xmlp:lastname',
    phone as 'xmlp:phone',
    address as 'xmlp:address',
    city as 'xmlp:city',
    state as 'xmlp:state',
    zip as 'xmlp:zip'
    from pubs.dbo.authors
for xml raw('rdf:Description'), elements, root('rdf:RDF')

Note that I also cleaned up the element names and added namespace designations
to the row and root elements. Now the final output will include XML namespace
designations for each element as needed.




Basic WS-Auth Example Using ASP.NET 2.0

April 25, 2006

During one of my recent INETA talks, several people asked for examples of Web Service authentication. Unfortunately, my talk did not include any. It’s been a while, but I finally found some time to put together a simple example built using ASP.NET 2.0 and Visual Studio 2005.

NOTE: I’ve posted the ASP.NET 2.0/VS2005 project for this article in the Files section of the MikeAmundsen Yahoo! Group.

The SOAP Header

First, the way SOAP authentication works is by adding a special header to the SOAP message. This is a fundamental part of the SOAP message model: the ability to add an arbitrary number of custom headers to the message payload without changing the actual body of the message itself. In effect, we are adding ‘out-of-band’ content to the SOAP message. As long as both the sender and receiver understand the header items, everything works fine.

For this example, I’ve created a simple SOAP header with two elements: username and password. Using VS2005 and ASP.NET 2.0, I can create a class that holds this data. This will be the authentication header for my secure SOAP messages.

Below is the entire class definition for the custom authentication header:


using System;
using System.Web.Services.Protocols;

public class AuthSoapHeader : SoapHeader
{
	public AuthSoapHeader(){}

    public string Username = string.Empty;
    public string Password = string.Empty;
}

Note that my custom class "AuthSoapHeader" inherits from the SoapHeader class in the System.Web.Services.Protocols namespace. That’s all I need for now. The magic of authentication comes later.
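
For reference, here’s roughly what a message carrying this header looks like on the wire. This sketch assumes the default tempuri.org namespace that ASP.NET assigns to new services (yours may differ):

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <AuthSoapHeader xmlns="http://tempuri.org/">
      <Username>hello</Username>
      <Password>world</Password>
    </AuthSoapHeader>
  </soap:Header>
  <soap:Body>
    <HelloWorld xmlns="http://tempuri.org/" />
  </soap:Body>
</soap:Envelope>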

The Secure HelloWorld method

Now, I’m ready to create a secure Web Service method. Using VS2005, I add a new Web Service class to my Web project. Since it already contains a sample HelloWorld method, I’ll just modify that method to require a successful authentication before returning requested data.

Here’s my service *before* adding the authentication header support:


public class StandardService : System.Web.Services.WebService {

    public StandardService () {}

    [WebMethod]
    public string HelloWorld() 
    {
        return "Hello World";
    }
    
}

And here’s my service *after* adding the authentication header support.


using System;                         // Exception
using System.Xml;                     // XmlQualifiedName
using System.Web.Services;            // WebService, WebMethod
using System.Web.Services.Protocols;  // SoapHeader, SoapException

public class SecureService : WebService 
{
    public AuthSoapHeader AuthHeader;
    
    public SecureService (){}

    [WebMethod]
    [SoapHeader("AuthHeader")]
    public string HelloWorld() 
    {
        try
        {
            if (AuthHeader.Username == "hello" && AuthHeader.Password == "world")
                return "Hello World";   // it's all good
            else
                return "unauthorized access"; // auth failed
        }
        catch (Exception ex)
        {
            // build and return soap exception
            XmlQualifiedName code = new XmlQualifiedName("hello-world");
            SoapException soapEx = new SoapException(ex.Message, code);
            throw soapEx;
        }
    }
    
}

Note that all I needed to do was add a public instance of the header class, then add an attribute reference that tells the ASP.NET runtime to expect a header. Finally, I added some code to check the contents of the header and act accordingly. If the header is missing, I throw a SOAP exception back to the caller.

That’s all there is to it. I now have a service that requires authentication. Next I need to create an ASP.NET WebForm that calls this service using the header.

Calling the Secure Service from a WebForm

Creating an ASP.NET WebForm that calls the secure Web service is not much different from creating a WebForm that calls a standard, unsecured, Web service. After using VS2005 to add a Web Reference, I’m ready to get an instance of the remote SOAP header and Web service class.

I created a simple form with Username and Password input controls, a button, and a label to hold the results of the WS call. Below is the code that runs behind the button click event:


protected void login_Click(object sender, EventArgs e)
{
    // define locals
    string uname = string.Empty;
    string pword = string.Empty;
    string results = string.Empty;

    // get user inputs
    uname = username.Text;
    pword = password.Text;

    // get refs to remote objects
    wsSecureService.SecureService wsService = new wsSecureService.SecureService();
    wsSecureService.AuthSoapHeader wsHeader = new wsSecureService.AuthSoapHeader();

    // populate header and attach to service object
    wsHeader.Password = pword;
    wsHeader.Username = uname;
    wsService.AuthSoapHeaderValue = wsHeader;

    // get results
    results = wsService.HelloWorld();
    showresults.Text = results;

}

Notice that the way to call a secured Web service is to first get an instance of any required headers, fill them out as needed, and then attach these headers to the usual SOAP proxy object before making the service call. As long as the headers are populated correctly, the call will complete as usual.

There’s nothing more to it. Now you know how to add custom authentication headers to any Web Service.




Speaking in Chattanooga April 11, 2006

April 6, 2006

I’m looking forward to my trip to Chattanooga, TN on April 11th to deliver another INETA-sponsored talk. The Chattanooga Area .NET User Group (CHADNUG) has selected my Message-Oriented Architecture (MOA) topic for the event. This is currently the most-requested talk in my list and I really enjoy delivering it.
I have been giving some variation on this talk for close to two years. I update it often and
this year is no exception. This time, I’ll be adding a section on supporting Ajax-enabled web clients.

If you are close to Chattanooga, check out their web site and come on out to the event. I look forward to seeing everyone there!




Wanted: URI Designer

April 6, 2006

Some regular readers know that, as part of my ongoing personal project to
improve my ‘web-tech’ knowledge, I have been re-reading Tim Berners-Lee’s
"Style Guide for online hypertext" and other related materials. One of the common
messages from documents on this topic is the importance of well-composed and
maintained web addresses or URIs. This got me thinking about (and paying more
attention to) the common web addresses that I see in my browser. I must say, I
don’t care much for what I see.

What’s bad about common URIs today?

Too often the URI I see in my browser address line is gibberish. Just try
visiting any of the top news sites (news.yahoo.com, http://www.msn.com, news.google.com,
etc.) and click on any of the links on the page. Usually the URI contains
additional state information (?x=13&y=29&_docid=DUsor93FH). Almost always, the
URI contains company or technology-specific information (page.aspx, document.jsp,
article.cfm, product.php). And almost never could I share the web address with a
friend by merely speaking it. This is all bad.

So what is the definition of a good URI?

The W3C has an excellent document called "Common HTTP Implementation
Problems". In it there is a section devoted to "Understanding URIs." This
section reads like a ‘best practices’ list for creating solid
URIs. I urge everyone to take a few minutes to read through it and to bookmark it
for future reference. I’ll lift two quotes from that document to clarify the
need for good URI design when deploying web applications.

Here’s the first quote: "A URI is a reference to a resource, with fixed and
independent semantics."
This sentence has quite a bit packed into it. For
example:

  • "A URI is a reference…" In other words while a URI points to something,
    that URI is not a serial number.
  • "…with fixed [semantics]…" This means that the URI does not change
    over time. Changing a URI breaks other people’s links to that resource – bad
    stuff.
  • "…with independent semantics." This means that the URI stands alone. It
    does not depend on state information such as cookies or session state.

So, a URI is a pointer to something. That pointer never goes bad, and that
pointer stands alone (or, put another way, is easily shared).

Here’s the second quote: "A common mistake … is to think [that a URI] is
equivalent to a filename within a computer system. This is wrong. URIs have,
conceptually, nothing to do with a file system."
This might come as a shock
to some web programmers. It is so easy to expose physical folders and files via
a web server that, by default, most web sites simply reflect the file structure
behind a web domain root. This, too, is bad stuff. Move a file, and the URI
breaks.

Ok, so URIs are non-changing, stand-alone pointers and *not* reflections of
folder and file structures on disk. Maybe we do need a URI design!

URIs are web queries

Once you get over the idea that URIs are not physical files in folders, you are
free to start thinking about what URIs really represent. In my mind, a URI is a
‘web query.’ By typing a URI, users are ‘looking for something’ out on the
Internet. By now, most web users understand that there are up to three parts to
a URI query:

  • the server name or domain (www.someserver.com)
  • the folder name or location (/articles/2006/)
  • the document name (learning_to_program.html)

I suspect that most users do not think very much about the above details, but
most intuit them as they surf. I am often especially surprised by the
sophistication of young web surfers. I have observed children who are quite
happy to ‘hack’ away at a URI in order to find a document. They truly use the
browser address line as a search tool!

Anyway, if you accept the idea that a URI is (in some fashion) a web query, then
you are free to actively *design* the URIs for your web application to support
this kind of use. To paraphrase the words of Tim Berners-Lee, you can make your
URIs ‘hack-able.’

Creating hackable URIs

What’s a hackable URI? In its simplest form, it’s a URI that can be easily
modified by a user in order to get a valid result from the same server. The most
common way to think about hackable URIs is to make sure that all sub-parts of
the URI return a valid document. As an example, the URI "www.myserver.com/content/programming/tutorials/hackable_uris.html"
has several sub-parts. Users who ‘land’ at this location should be able to ‘lop
off’ parts of the URI and get helpful results. That means that the URI "www.myserver.com/content/programming/tutorials/"
should return something – maybe a page that lists all tutorials. And "www.myserver.com/content/programming/"
might return a list of programming article classes such as "tutorials," "reference,"
"bookreviews," etc. And so on.

But creating hackable URIs doesn’t mean just supporting sub-parts. It could also
mean using a user-friendly URI scheme that actually *invites* URI hacking. For
example, what can you assume if you land at a URI that looked like this?

www.contentserver.com/archives/2005/11/03/dailyupdate.html

Not only can you assume that you can get valid documents at each sub part. You
can also assume that you can change the value of some sub-parts to discover new
documents, right?

So how do you implement a URI design?

Once you start thinking about a URI design pattern that works for your site, you
need to come up with a way to implement it. In the past, web programmers would
start creating folders and files to match the stated design. This is not the way
to go about it. Instead, web programmers should design server-side scripts
that can scan the incoming URI, treat it as a request query and assemble a
response accordingly.

For example, given the following query:

www.authorserver.com/fiction/poetry/

A server might return a list of poets. Users might also assume that they can get
lists of other authors by changing the URI like this:

www.authorserver.com/fiction/shortstories/
www.authorserver.com/fiction/novels/

The point is that web servers should be able to do more than just serve up
documents from a physical folder tree.

One way to do this (using ASP.NET, for example) is to use the Uri.Segments
collection to inspect and parse the URI. Here’s a trivial example.

Given the URI http://www.server.com/archives/2005/12/ a server could create
a query against a database table called "archives" for a list of documents added
to the system in December of 2005.

Here’s some code to parse the URI:

<%@ page %>
<script runat="server" language="c#">

    void Page_Load(object sender, EventArgs args)
    {
        string webaddress = "http://www.server.com/archives/2005/12/";
        Uri thisUri = new Uri(webaddress);
        int segcount = thisUri.Segments.Length;
        string output = string.Empty;

        output = string.Format("<p>webaddress:<br />{0}</p>",webaddress);

        // get segments
        output +=string.Format("<p>segments:<br/>");
        for(int i=0;i<segcount;i++)
            output+=string.Format("{0}: {1}<br />",i,thisUri.Segments[i]);
        output +="</p>";

        // format data query
        string table = thisUri.Segments[1].Replace("/","");
        string yr = thisUri.Segments[2].Replace("/","");
        string mo = thisUri.Segments[3].Replace("/","");
        string query = "select * from {0} where yr={1} and mo={2}";

        output+=string.Format("<p>query:<br />"+query+"</p>",table,yr,mo);

        // show results
        Response.Write(output);
    }

</script>

And here’s the output created by the above code:

webaddress:
http://www.server.com/archives/2005/12/
segments:
0: /
1: archives/
2: 2005/
3: 12/
query:
select * from archives where yr=2005 and mo=12

Submitting the above query might return a data set that could be formatted into
an HTML page containing a series of links for the user to explore.

OK, I get the idea, but there’s more to it, right?

Well, yes. Knowing that URIs are static, independent resource pointers that
should be ‘hackable’ by users and that ASP.NET has features that allow you to
parse URIs into parts that can be used to create data queries is just the
beginning. But you can use this information to create a more flexible and long-
lived URI design for web apps. And with a URI design in place, you are no longer
dependent on the existence (or lack thereof) of physical documents within your
web.

There are also a number of other operations needed to support a good URI design.
While good URIs don’t change, content does. Well-implemented URI responders will
need to handle moved documents (HTTP 301 and 302 events) through a lookup table
or some other means. Also, once you start to train users to ‘hack’ URIs at your
server, you’ll need to add improved support for 4xx (not found) and possibly 5xx
(server error) events to tell users when their creative URIs fail.
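
A minimal sketch of the ‘moved document’ case might look like the following. The lookup table here is hypothetical – a real responder would load its old-to-new mappings from a database or config file:

<%@ page %>
<script runat="server" language="c#">

    void Page_Load(object sender, EventArgs args)
    {
        // hypothetical lookup table of moved documents (old path -> new path)
        System.Collections.Hashtable moved = new System.Collections.Hashtable();
        moved["/articles/2005/old-title/"] = "/articles/2005/new-title/";

        string path = Request.Url.AbsolutePath;
        if (moved.ContainsKey(path))
        {
            // the document has a new home - send HTTP 301 plus the new location
            Response.StatusCode = 301;
            Response.AddHeader("Location", (string)moved[path]);
            Response.End();
        }
    }

</script>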

In a future article, I’ll outline a URI design that I’ve been contemplating for
some time. I also plan to share my implementation for this new URI design
sometime soon. But don’t wait for me. Start designing and implementing your own
hackable URIs!




CINNUG Event Re-Scheduled for March 28th

March 26, 2006

I’m happy to announce that my ‘Formula for Web 2.0’ talk for the Cincinnati .NET User Group (CINNUG) that was postponed due to unseasonable snow last Tuesday has been rescheduled for this coming Tuesday, March 28th.

I’ll be presenting at 6PM at the MaxTrain offices in Cincinnati, Ohio. Check out the CINNUG.ORG web site for details.




Forcing XHTML-compliance with ASP.NET Response Filters

March 20, 2006

As part of my ongoing effort to build Web solutions that, by default, support open standards, I have committed to only emitting XHTML-compliant markup on all pages. While some of this will require rewriting static pages, the bigger task will be to ensure XHTML output for generated pages – usually the ones generated by ASP.NET from database queries. This means I need a process for scanning the markup before final output to the client. If anything is non-XHTML, I need to either correct it automatically or refuse to output it. While the last option is a bit harsh, it’s worth considering. If the output is wrong, don’t allow it.

NOTE: You can download the stand-alone C# source code for the TidyFilter class from http://groups.yahoo.com/group/mikeamundsen. You need to register to access the downloads.

Response.Filter or HTTPModule?

The real work is to hook into ASP.NET somewhere and scan the markup before final output to the client. There are a couple possibilities: HTTPModules and Response Filters. I recently started experimenting with using ASP.NET response filters to control output to the client. They are easy to install (much easier than setting up an HTTPModule) and provide quite a bit of flexibility.

There are a number of resources on the Internet covering the pros and cons of HTTPModules versus Response.Filters. The biggest tipping point IMHO is that HTTPModules can be implemented as stand-alone filters that can be easily plugged into any existing ASP.NET application. Of course, to do this, you need to set up some config items and, in some rare cases, need to be aware of other HTTPModules in the pipeline and how your module will be affected.

Response.Filters, on the other hand, are very simple. Basically, you implement a Stream object, write some rules for inspecting and modifying the stream as it goes by, and then hook this stream into the Response object when needed. It’s more of an inline solution, IMHO – one that works well when you want to integrate the filter right into the compiled solution instead of making it a pluggable component like HTTPModules.

Writing a Response.Filter Stream

Since I want to make XHTML-compliance a fundamental part of my Web solutions, I’ve decided to implement my filter as a Response.Filter stream instead of an HTTPModule. That means I need to write a stream object that can scan the outgoing markup and, if needed, modify the output or refuse to deliver it to the client. It’s actually a pretty simple operation.

As mentioned above, Response.Filters are really just stream objects with a bit of smarts. Implementing a stream object requires just a small bit of code. Below is a basic stream object that does nothing (yet).

using System;
using System.Web;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace amundsen.xmlp
{
	public class TidyFilter : Stream
	{
		private Stream _sink;
		private long _position;
		StringBuilder sb;

		public TidyFilter()
		{
		}

		public TidyFilter(Stream sink)
		{
			_sink = sink;
			sb = new StringBuilder();
		}

		public override bool CanRead
		{
			get {return true;}
		}

		public override bool CanSeek
		{
			get {return true;}
		}

		public override bool CanWrite
		{
			get {return true;}
		}

		public override void Close()
		{
			_sink.Close();
		}

		public override void Flush()
		{
			_sink.Flush();
		}

		public override long Length
		{
			get {return 0;}
		}

		public override long Position
		{
			get {return _position;}
			set {_position = value;}
		}

		public override void SetLength(long length)
		{
			_sink.SetLength(length);
		}

		public override long Seek(long offset, System.IO.SeekOrigin direction)
		{
			return _sink.Seek(offset, direction);
		}

		public override int Read(byte[] buffer, int offset, int count)
		{
			return _sink.Read(buffer, offset, count);
		}

		public override void Write(byte[] buffer, int offset, int count)
		{
			_sink.Write(buffer, offset, count);
		}
	}
}

You’ll notice that the real work is done in the Write method. Currently, this class only passes the contents of the incoming buffer out to the sink without change. It’s in the Write method that I’ll add code to look for any markup being sent out and do some magic to make sure it’s XHTML-compliant.

Before getting to the XHTML filtering part, it’s worth noting how I’ll hook up my stream object to the ASP.NET Response stream. The process is very easy. Below is a bit of code I have in my HTTPHandler that processes outgoing requests:


context.Response.Filter = new TidyFilter(context.Response.Filter);

Notice that I get an instance of my stream object, pass in the current Response.Filter pointer, and add that to any existing Response.Filters currently running. Nothing complex here. My stream object is now part of the ASP.NET response process. All I need now is the ability to force XHTML-compliance on any HTML in the outgoing stream.
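
By the way, if you aren’t using a custom HTTPHandler, the same one-line hookup can live in Global.asax instead. Here’s a minimal sketch – just an illustration, adjust to your own pipeline:

void Application_BeginRequest(object sender, EventArgs e)
{
    // attach the tidy filter to every outgoing response in the application
    Response.Filter = new amundsen.xmlp.TidyFilter(Response.Filter);
}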

Using HTMLTidy to filter outgoing markup

Rather than slave over some complicated regexp or other routines to try to inspect and alter markup as it goes by, I decided to use a very powerful existing utility that already does all that – HTMLTidy. You can download the open source HTMLTidy project for free. Although it was originally built for the Unix/Linux platform, there is a very solid Win32 implementation available at the site. There is also a nice .NET binding for the HTMLTidy implementation that makes it easy to use HTMLTidy as part of any ASP.NET application. The details on this binding set include instructions on registering the DLL and assemblies to make them easily available within a .NET project.

Once I have the HTMLTidy library and .NET bindings installed and registered, there’s the small matter of accessing HTMLTidy within my Filter stream at runtime. This is all done in the Write method of my stream. Below is the complete code block I added to the Write method.

public override void Write(byte[] buffer, int offset, int count)
{
	if (HttpContext.Current.Response.ContentType.ToLower().IndexOf("html") == -1)
		_sink.Write(buffer, offset, count);
	else
	{
		try
		{
			string inbuf = System.Text.Encoding.UTF8.GetString(buffer, offset, count);

			Regex eof = new Regex("</html>", RegexOptions.IgnoreCase);

			if (!eof.IsMatch(inbuf))
			{
				sb.Append(inbuf);
			}
			else
			{
				sb.Append(inbuf);
				string work = sb.ToString();
				Regex notidy = new Regex("<!--tidyfilter='false'-->", RegexOptions.IgnoreCase);
				if (!notidy.IsMatch(work))
				{
					string tidyconfig = "";
					string tidyerrors = "";

					// check the config before mapping the path; bail out if tidy is not configured
					tidyconfig = HttpUtilities.GetConfigValue("tidyconfig");
					if (tidyconfig.Length == 0)
						return;
					tidyconfig = HttpContext.Current.Server.MapPath(tidyconfig);

					tidyerrors = HttpUtilities.GetConfigValue("tidyerrors");
					tidyerrors = HttpContext.Current.Server.MapPath(tidyerrors);

					Tidy.DocumentClass tdoc = new Tidy.DocumentClass();

					tdoc.LoadConfig(tidyconfig);
					tdoc.SetErrorFile(tidyerrors);
					tdoc.ParseString(work);
					tdoc.CleanAndRepair();
					tdoc.RunDiagnostics();
					work = tdoc.SaveString();
				}

				byte[] outbuf = System.Text.Encoding.UTF8.GetBytes(work);
				_sink.Write(outbuf, 0, outbuf.GetLength(0));
			}
		}
		catch (Exception ex)
		{
			// tidy failed - return a simple error page instead of broken markup
			byte[] errbuf = System.Text.Encoding.UTF8.GetBytes(
				"<html><body><h2>xmlp</h2><h3>TidyFilter</h3>" +
				ex.Message +
				"</body></html>");
			_sink.Write(errbuf, 0, errbuf.GetLength(0));
		}
	}
}

You can see in the code above that I first check the content-type header to see if the stream passing through is part of an HTML document sent from the server. If it is an HTML document, I add the string contents of the in-bound buffer to a string object for later review. Once I’ve reached the end of the html document (</html>), I am ready to inspect the document for XHTML-compliance using the HTMLTidy library.

I have a little html-comment trick that lets a page tell the filter to skip the tidy pass, but if that’s not in the document, the code will check a couple of configuration settings (I use an internal routine to get the config value, but you can use the standard AppSettings[key] collection if you like) and then load the HTMLTidy library and scan the completed document. Once the work is done, HTMLTidy returns the resulting markup as a string and I write that string to the out-bound buffer.
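
For reference, the two settings the code reads could live in web.config like this (the key names come from the code above; the file locations are just examples):

<appSettings>
  <add key="tidyconfig" value="~/App_Data/tidy.cfg" />
  <add key="tidyerrors" value="~/App_Data/tidyerrors.txt" />
</appSettings>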

That’s all there is to it. Now I have a way to ensure XHTML-compliant markup for all my pages.

Some caveats

Of course, there are some downsides to all this. First, you need to install and use HTMLTidy (or some other code base) to inspect all outgoing markup. Second, most utilities for scanning markup need access to the entire document. That means you need to load the document into memory, scan it, and output it. This is fine for relatively small documents, but large ones could chew up memory and slow the response time of your application.

Finally, while HTMLTidy is good, it’s not perfect. Once you start using a tool like this you start to see the challenge in validating markup documents – especially ones that include HTML, XML, client-side script, embedded CSS, and more. I’ve used HTMLTidy enough to know that it’s best to keep CSS and Javascript to a minimum in the page and to use external references to other documents when possible. Your mileage may vary[grin].

Summary

XHTML-compliant markup is the first step in building solid, standards-compliant Web solutions. ASP.NET has options for building and installing output filters that can inspect outgoing markup and modify the output as needed. HTMLTidy is a very solid open source library that handles the details of XHTML-compliance, and a .NET binding makes it easy to use within ASP.NET.




Added Cincinnati .NET User Group Talk for March

March 11, 2006

I’m happy to announce that I will be doing a talk for the Cincinnati .NET User Group (CINNUG) on March 21st. The topic is a new one I just added to my list – "The Web 2.0 Formula." It’s based on an article I posted here at my MSN Spaces blog as well as a number of project initiatives I’m involved with this year.

If you’re in the Cincinnati area, check out the CINNUG web site and stop on in!




The Web 2.0 Formula

March 11, 2006

(XHTML+CSS2) * JS
------------------ = Web 2.0
XML+XSLT+RDBMS

In February of 2005 Jesse James Garrett of Adaptive Path wrote an interesting article on the convergence of a number of existing technologies and how they could affect the future of programming for the Web. Now, a year later, this article is recognized as the first in a series of definitive works on what has come to be known as "Web 2.0." Based on concepts covered in that article, this talk presents a set of "principles and practices" that make up a "Web 2.0 formula" for building successful leading-edge Web-based solutions like the ones featured at Google, Yahoo, and Microsoft Live.

Topics covered include the use of compliant XHTML for page markup; Cascading Stylesheets for layout and design; and Javascript to power the client-side experience. In addition, the use of Relational databases as repositories; XML as a data transport format; and XSL technologies such as XSLT, XSL-FO, XPath, XInclude, and XQuery to transform and modify XML data is explored. The talk also includes several live code examples and references to valuable libraries and resources available to jumpstart your Web 2.0 applications.

Whether you are just exploring the idea of Web 2.0 or are already committed to rolling out Web 2.0-compliant solutions, this talk will help you learn more about the theory, practice, and effects of Web 2.0 on Internet-based applications.

