Flash Text Editor 1.2.1

Flash Text Editor 1.2.1 is now up. It fixes a bug in the export routine that caused sibling nodes to inherit the properties of their previous sibling.

Also, please email me if you are experiencing problems highlighting text. I've had a few reports mentioning this and I need to fix it. It's most likely caused by the need to update the toolbar and then explicitly place the cursor selection back into the text field after the update. If anyone has a suggestion for a better algorithm, please let me know. Stuart Schoneveld's editor doesn't seem to be affected by this problem, but since he hasn't released the source I can't determine how he's accomplished this.

I'm also trying to find some time to add comments to my weblog so that people can discuss in a forum rather than simply emailing me.

JBoss Theory

I saw Marc Fleury speak at the latest Toronto Java Users Group meeting. Marc has great ideas about the next level of EJB, which are being realized in JBoss, the product he brought into reality. Most of the room, myself included, was trying to grasp the concepts while Marc kept pushing more ideas out at us.

I'm now busily working with the new JBoss 3.2 beta and learning about the 3.x interceptor concepts.

Read Marc's Blue paper "Why I Love EJB" for more information on JBoss concepts and the future of JBoss and EJB.

TheServerSide.com also has a video interview with Marc.

Flash Text Editor 1.2

Version 1.2 of my Flash Text Editor is available. Numerous additions have been made, including backend support for browsers that don't support JavaScript communication with Flash and a totally redone export routine. I'll be the first to admit that the import and export code is getting out of hand and is not a great example of my typical work. Hopefully over the next two versions I'll be able to produce something much more readable. Flash is totally pass-by-value, so my parsing needs to get a bit more creative.

Some may notice the text selection to be a bit wonky. I'm still wrestling with Flash to get the interface behaviour down pat. If anyone can discern a better algorithm please let me know. I've had a few suggestions and submissions but each implementation had certain flaws that I wasn't happy with. This version has the best behaviour yet and I will continue to try and improve it.

Please play with it and let me know if something is amiss. If you have a solution to the issues listed, by all means fire it off to me.

A special thanks on this one to Anthony Hunt who provided assistance and example code for backend communications.

Forget the Rules

Matt Haughey mentions that SpamAssassin no longer appears to be working effectively against spam. Why? Because SpamAssassin is rule-based. Not only that, but they display their rules list for spammers to analyze. Just like in any game, if you show the other team your playbook or run the same plays over and over, they'll know your game.

Statistical probability filtering, or Bayes filtering, is the only way to effectively block spam. When working at SonicBoomerang filtering news and opinion postings, we quickly found that rules have inherent error due to human bias. At first we thought that classifying news and opinion was a matter of identifying simple rules. Try as you might, you can't come up with a rule-based system that is as effective as statistical analysis based on a large training set.

Based on my experience, 2000 spam messages are an effective training set to start with. Did your filter get a false negative or false positive? No problem. Add the misclassified message to the training set and your filter just got smarter without the introduction of bias.
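To make the idea concrete, here is a minimal sketch of a two-bucket Naive Bayes filter in Python. It is not the code of any particular product; the class name, the tokenizer, and the add-one smoothing are my own assumptions, and a real filter would be trained on thousands of messages rather than four.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Toy two-bucket Naive Bayes classifier (hypothetical, for illustration)."""

    LABELS = ("spam", "mail")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        # Crude tokenizer: lowercase runs of letters, digits, $ and '.
        return re.findall(r"[a-z0-9$']+", text.lower())

    def train(self, text, label):
        # Training is just counting words per bucket.
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def score(self, text, label):
        # Log prior plus summed log likelihoods, both with add-one smoothing
        # so unseen words and empty buckets never produce log(0).
        total_messages = sum(self.message_counts.values())
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_words = sum(self.word_counts[label].values())
        denominator = total_words + len(vocabulary) + 1
        log_prob = math.log(
            (self.message_counts[label] + 1) / (total_messages + len(self.LABELS))
        )
        for word in self.tokenize(text):
            log_prob += math.log((self.word_counts[label][word] + 1) / denominator)
        return log_prob

    def classify(self, text):
        return max(self.LABELS, key=lambda label: self.score(text, label))


spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("win money now", "spam")
spam_filter.train("meeting notes for tuesday", "mail")
spam_filter.train("lunch on tuesday", "mail")

print(spam_filter.classify("buy cheap pills"))

# Correcting a mistake is just more training: feed the misclassified
# message back in under the label it should have had.
spam_filter.train("tuesday only: cheap money", "spam")
```

Notice that there are no hand-written rules anywhere. The only thing a human supplies is the label on each training message, which is where the lack of bias comes from.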

I attempted to find a server-side Bayes filter but couldn't install the one I found without numerous errors. I'll try again shortly and hopefully come up with a web interface so all the users on my server can have this at their disposal.

When good interfaces go crufty

I forgot to post this before, but Slashdot is driving a lot of attention to it so I figured why not now: "When good interfaces go crufty." Matthew makes many great points about interface components that we have come to accept and find comfortable but which are unnecessary. It's hard to imagine an application without a Save command or a File dialog, but they're really unnecessary, and confusing for the novice user.

POPFile

I've been using POPFile as my new spam filter. It uses Bayes' theorem to build buckets for classifying email types. I basically set up two buckets, mail and spam, and used 2000 legitimate and 2000 spam messages to train POPFile.

POPFile is a Perl script that works as a POP3 proxy. It uses statistical probability based on the training set to determine whether new mail is classified as mail or spam and tags messages with an altered subject or with the header X-Text-Classification. I use the latter method since Mozilla (my mail client) can filter based on mail headers.
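For anyone whose mail client can't filter on headers, the same check is easy to script. The sketch below uses Python's standard email module to read the X-Text-Classification header from a raw message; the sample message, the function names, and the "Junk" and "Inbox" folder names are made up for illustration.

```python
from email import message_from_string

# A made-up message as it might look after passing through the proxy.
RAW_MESSAGE = """\
From: someone@example.com
To: me@example.com
Subject: Hello
X-Text-Classification: spam

Body text here.
"""


def bucket_for(raw_message, default="unclassified"):
    """Return the bucket named in the X-Text-Classification header."""
    message = message_from_string(raw_message)
    return message.get("X-Text-Classification", default).strip().lower()


def route(raw_message):
    """Pick a (hypothetical) folder name based on the bucket."""
    return "Junk" if bucket_for(raw_message) == "spam" else "Inbox"


print(route(RAW_MESSAGE))
```

Messages without the header fall through to the default bucket and stay in the inbox, which is the safer failure mode.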

So far, out of about 250 emails I've had four false positives and one false negative. I'd rather have it the other way around, but I collect each misclassified message and re-insert it into the proper training set.
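Putting those numbers in perspective takes only a little arithmetic. The snippet below treats "about 250" as exactly 250, which is an assumption, and the function name is my own.

```python
def summarize(total, false_positives, false_negatives):
    """Return (errors, error_rate, accuracy) for a batch of classified mail."""
    errors = false_positives + false_negatives
    error_rate = errors / total
    return errors, error_rate, 1 - error_rate


errors, error_rate, accuracy = summarize(
    total=250, false_positives=4, false_negatives=1
)
print(f"{errors} errors, error rate {error_rate:.1%}, accuracy {accuracy:.1%}")
```

That works out to 5 errors in 250 messages, or a 2% error rate.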

Making the Case for PHP at Yahoo!

Via Slashdot - Making the Case for PHP at Yahoo!

This article, like many on Slashdot, sparked quite a religious debate over programming languages. Personally I like Java/J2EE with Cocoon up front, or JSP if more people need to be familiar with the language. PHP appears to work just as well, but the case can be made for almost any language available. Nothing is perfect for every situation, kid. There are always classic computer science tradeoffs.

It's too bad Java has threading problems on FreeBSD, according to Radwin's presentation. It seems as if that was their only reason not to use Java.