Simple JavaScript HTML parser

Once I needed a very simple and fast HTML parser written in ActionScript. Unlike John Resig, I didn’t want to pretty-print or fix broken markup; all I needed was the structure and the attributes. Since it had to run many times per second, I tried not to use regular expressions. I know, it’s silly, but it works.
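To give a rough idea of what regexp-free scanning looks like, here is a minimal sketch. It is not the actual parser: the function names (`isSpace`, `parseTagBody`, `scanTags`) and the shape of the returned objects are assumptions made for illustration. It walks the string with `indexOf` and character comparisons only, and it assumes attribute values never contain a `>`.

```javascript
// Hypothetical helper: true for characters that separate tokens inside a tag.
function isSpace(ch) {
  return ch === " " || ch === "\t" || ch === "\n" || ch === "\r";
}

// Parse the text between "<" and ">" into a tag name and an attribute map.
function parseTagBody(body) {
  var i = 0, n = body.length;
  var closing = body.charAt(0) === "/";
  if (closing) i = 1;
  var nameStart = i;
  while (i < n && !isSpace(body.charAt(i)) && body.charAt(i) !== "/") i++;
  var name = body.substring(nameStart, i).toLowerCase();
  var attrs = {};
  while (i < n) {
    // Skip whitespace and the "/" of a self-closing tag.
    while (i < n && (isSpace(body.charAt(i)) || body.charAt(i) === "/")) i++;
    if (i >= n) break;
    var keyStart = i;
    while (i < n && !isSpace(body.charAt(i)) &&
           body.charAt(i) !== "=" && body.charAt(i) !== "/") i++;
    var key = body.substring(keyStart, i);
    var value = "";
    if (body.charAt(i) === "=") {
      i++;
      var quote = body.charAt(i);
      if (quote === '"' || quote === "'") {
        // Quoted value: read up to the matching quote.
        var close = body.indexOf(quote, i + 1);
        if (close === -1) close = n;
        value = body.substring(i + 1, close);
        i = close + 1;
      } else {
        // Unquoted value: read up to the next whitespace.
        var valStart = i;
        while (i < n && !isSpace(body.charAt(i))) i++;
        value = body.substring(valStart, i);
      }
    }
    attrs[key] = value;
  }
  return { name: name, closing: closing, attrs: attrs };
}

// Scan the whole string once and return a flat list of tags.
function scanTags(html) {
  var tags = [];
  var pos = 0;
  while ((pos = html.indexOf("<", pos)) !== -1) {
    var end;
    if (html.substring(pos, pos + 4) === "<!--") {
      // Skip comments entirely.
      end = html.indexOf("-->", pos + 4);
      pos = end === -1 ? html.length : end + 3;
      continue;
    }
    if (html.charAt(pos + 1) === "?") {
      // Skip <? ... ?> blocks (PHP, XML declarations) entirely.
      end = html.indexOf("?>", pos + 2);
      pos = end === -1 ? html.length : end + 2;
      continue;
    }
    end = html.indexOf(">", pos);
    if (end === -1) break; // unterminated tag: stop scanning
    var tag = parseTagBody(html.substring(pos + 1, end));
    if (tag.name !== "" && tag.name.charAt(0) !== "!") {
      tag.start = pos;     // offset of "<"
      tag.end = end + 1;   // offset just past ">"
      tags.push(tag);
    }
    pos = end + 1;
  }
  return tags;
}
```

Each entry carries the tag name, whether it is a closing tag, its attributes, and its offsets in the source, which is all the "structure and attributes" use case needs.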

Now I have “ported” that to JavaScript and eventually to a jEdit macro. Why is this good? Because unlike a full-blown SAX parser, this tool will not choke on PHP tags, so I use it to select the innerHTML or outerHTML in a text/HTML/PHP file, or just to find a matching tag. Unfortunately, right now it parses the full file, so it’s a bit slow, but I may fix that sometime in the future.
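The matching-tag lookup can be sketched roughly as follows. This is an illustration under assumptions, not the macro's code: `skipOpaque`, `readName` and `findMatch` are hypothetical names, and the sketch assumes the opening tag itself contains no PHP block or `>` inside its attributes. It keeps a depth counter for tags of the same name and jumps over `<? ... ?>` blocks and comments, so markup printed from inside PHP does not confuse the count.

```javascript
// If a comment or a <? ... ?> block starts at `pos`, return the offset just
// past it; otherwise return -1.
function skipOpaque(src, pos) {
  if (src.substring(pos, pos + 4) === "<!--") {
    var c = src.indexOf("-->", pos + 4);
    return c === -1 ? src.length : c + 3;
  }
  if (src.substring(pos, pos + 2) === "<?") {
    var p = src.indexOf("?>", pos + 2);
    return p === -1 ? src.length : p + 2;
  }
  return -1;
}

// Read the tag name starting at `pos` (just after "<" or "</").
function readName(src, pos) {
  var i = pos;
  while (i < src.length) {
    var ch = src.charAt(i);
    if (ch === " " || ch === "\t" || ch === "\n" || ch === "\r" ||
        ch === ">" || ch === "/") break;
    i++;
  }
  return src.substring(pos, i).toLowerCase();
}

// Given the offset of an opening tag's "<", find its matching closing tag.
// Returns offsets for the outer and inner ranges, or null if none is found.
function findMatch(src, openPos) {
  var name = readName(src, openPos + 1);
  var openEnd = src.indexOf(">", openPos);
  if (name === "" || openEnd === -1) return null;
  var depth = 1;
  var pos = openEnd + 1;
  while ((pos = src.indexOf("<", pos)) !== -1) {
    var skipTo = skipOpaque(src, pos);
    if (skipTo !== -1) { pos = skipTo; continue; }
    var closing = src.charAt(pos + 1) === "/";
    var tagEnd = src.indexOf(">", pos);
    if (tagEnd === -1) return null;
    if (readName(src, pos + (closing ? 2 : 1)) === name) {
      var selfClosing = src.charAt(tagEnd - 1) === "/";
      if (closing) depth--;
      else if (!selfClosing) depth++;
      if (depth === 0) {
        return {
          outerStart: openPos, outerEnd: tagEnd + 1, // outerHTML range
          innerStart: openEnd + 1, innerEnd: pos     // innerHTML range
        };
      }
    }
    pos = tagEnd + 1;
  }
  return null;
}

// Usage: select the innerHTML of the first <ul>, ignoring the PHP block.
var sample = '<ul><li>one</li><?php if ($x) { echo "</ul>"; } ?><li>two</li></ul>';
var match = findMatch(sample, 0);
var inner = sample.substring(match.innerStart, match.innerEnd);
```

In an editor macro, the two offset pairs would be handed to the selection API to highlight either the inner or the outer range.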

Furthermore, since this is JavaScript, it would be very easy to port it to editors with a JavaScript macro engine (Notepad++ has a rough one, and we also have Aptana, but I find Aptana’s pretty much discontinued scriptmonkey buggy and unreliable).

Download the jEdit javascript macro from here
