Okay, so I decided to get off my duff and go through the ARS source code like a good little geek instead of just whining randomly, so I sort of know what I'm talking about. :)
So basically the problem with HTML entities and the posting system is this:
- When previewing, the message text is spewed directly into the <textarea> code as typed, without modification (ie, &eacute; is &eacute;)
- But the browser will read any HTML entities (and certain tags) as HTML entities or tags, and turn entities into characters in the actual edit box (ie, &eacute; becomes é)
- (Tags seem to be ignored in a <textarea>, except of course for </textarea>)
- Most of the time, this is fine! Latin 1 character entities (&eacute; etc) become their Latin 1 characters and work correctly. This is why most people have stopped whining - it works for the Germans.
- But in the rare case of a &amp;, a &lt; or &gt;, or a character that's outside Latin 1 (which mostly means my Esperanto translation), this can have disastrous results when the new text is submitted:
- &amp; becomes just &, making it hard to write whines like this which are full of entity names :)
- &lt; and &gt; become < and >, potentially adding new tags
- Non-Latin 1 characters such as &#265; (ĉ), though perfectly valid Unicode character entities displayable by all standards-compliant browsers, can't be converted back into Latin 1 for form submission and are lost entirely when subsequently previewed or submitted.
- This is only a problem with previewing; submitting directly after typing avoids the slow breakdown of law and order. But, you can't do it on the first try - you have to preview, then fix the text up because there's no initial submit button.
The fix for this should be very simple: copy a variation of this line from ARS_FixHTML.pm::convert_html():
$self->{out} =~ s#\&#&amp;#g;
into the function that spews the message body into the <textarea>. Entities will correctly appear as their names instead of their characters in the edit box, and could be re-previewed successfully.
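Something along these lines is what I have in mind. It's only a sketch: escape_for_textarea and browser_decode are names I made up for illustration, not anything in the ARS source, and browser_decode is just a crude stand-in for what the browser does when it parses the textarea contents. Escaping < and > as well isn't strictly needed for the entity problem, but it would keep a typed </textarea> from closing the box early.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the ARS source): escape the message body
# before it is written between <textarea> and </textarea>.
sub escape_for_textarea {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must run first, or it would re-escape the entities added below
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Crude stand-in for the browser parsing the textarea contents.
# It only knows the three entities produced above.
sub browser_decode {
    my ($html) = @_;
    $html =~ s/&lt;/</g;
    $html =~ s/&gt;/>/g;
    $html =~ s/&amp;/&/g;   # must run last
    return $html;
}

my $typed = 'Use &eacute; or &#265;, and </textarea> is just text';
my $html  = escape_for_textarea($typed);

print "$html\n";
print browser_decode($html) eq $typed ? "round trip ok\n" : "round trip broken\n";
```

With the escaping in place, what comes back out of the edit box should be exactly what was typed, so previewing twice shouldn't chew up the entity names any more.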
I think this should be somewhere in ARSHandler/Comment.pm, but I'll be darned if I know where - it's late and perl is eating my brain...
It's getting close though!