The Daily Static

Final solution to HTML entities problem? by brion 2001-05-19 03:15:02

Okay, so I decided to get off my duff and go through the ARS source code like a good little geek instead of just whining randomly, so I sort of know what I'm talking about. :)

So basically the problem with HTML entities and the posting system is this:

  • When previewing, the message text is spewed directly into the <textarea> code as typed, without modification (i.e., &eacute; is sent as &eacute;)
  • But the browser reads any HTML entities (and certain tags) in there as entities or tags, and turns entities into characters in the actual edit box (i.e., &eacute; becomes é)
  • (Tags seem to be ignored in a <textarea>, except of course for </textarea>)
  • Most of the time, this is fine! Latin 1 character entities (é etc) become their Latin 1 characters and work correctly. This is why most people have stopped whining - it works for the Germans.
  • But in the rare case of a &, a < or >, or a character that's outside Latin 1 (which mostly means my Esperanto translation), this can have disastrous results when the new text is submitted:
      • &amp; becomes just &, making it hard to write whines like this which are full of entity names :)
      • &lt; and &gt; become < and >, potentially adding new tags
      • Non-Latin 1 characters such as ĉ (&#x109;), though perfectly valid Unicode character entities displayable by all standards-compliant browsers, can't be converted back into Latin 1 for form submission and are lost entirely when subsequently previewed or submitted.
  • This is only a problem with previewing; submitting directly after typing avoids the slow breakdown of law and order. But you can't do that on the first try: there's no initial submit button, so you have to preview and then fix the text up.
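
For anyone who wants to poke at the breakage described in the list above, here is a rough model of one preview round trip. It is a sketch in Python, not ARS code: the function names are made up for illustration, `html.unescape` stands in for the browser decoding entities inside a <textarea>, and dropping unencodable characters on a Latin 1 submit is an assumption chosen to match the "lost entirely" behaviour (real browsers may differ).

```python
import html


def browser_textarea_value(markup_inside_textarea: str) -> str:
    """Model the browser: character references inside <textarea> are decoded."""
    return html.unescape(markup_inside_textarea)


def submit_as_latin1(edit_box_text: str) -> str:
    """Model a Latin 1 form submission.

    Assumption: characters that do not fit in Latin 1 are silently dropped.
    """
    return edit_box_text.encode("latin-1", errors="ignore").decode("latin-1")


def preview_round_trip(typed: str) -> str:
    """One preview cycle with the text written into the textarea unescaped."""
    return submit_as_latin1(browser_textarea_value(typed))


if __name__ == "__main__":
    for sample in ["&eacute;", "&amp;eacute;", "&lt;b&gt;", "e&#x109;o"]:
        print(repr(sample), "->", repr(preview_round_trip(sample)))
```

Each extra preview runs the text through the same cycle again, which is why the damage compounds the more you preview.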

The fix for this should be very simple: copy a variation of this line from ARS_FixHTML.pm::convert_html():

$self->{out} =~ s#\&#&amp;#g;

into the function that spews the message body into the <textarea>. Entities will then correctly appear as their names instead of their characters in the edit box, and the text can be re-previewed successfully. I think this should be somewhere in ARSHandler/Comment.pm, but I'll be darned if I know where - it's late and Perl is eating my brain...
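
To sanity-check the idea, here is a small model of the proposed fix, again as a Python sketch rather than actual ARS code. The function names are hypothetical; `escape_for_textarea` mirrors the quoted Perl substitution, `html.unescape` stands in for the browser, and dropping characters outside Latin 1 on submit is an assumption.

```python
import html


def escape_for_textarea(message: str) -> str:
    """Mirror the Perl line s#\\&#&amp;#g: escape every ampersand."""
    return message.replace("&", "&amp;")


def browser_textarea_value(markup_inside_textarea: str) -> str:
    """Model the browser: character references inside <textarea> are decoded."""
    return html.unescape(markup_inside_textarea)


def submit_as_latin1(edit_box_text: str) -> str:
    """Model a Latin 1 form submission (assumption: unencodable chars are dropped)."""
    return edit_box_text.encode("latin-1", errors="ignore").decode("latin-1")


def preview_round_trip_fixed(typed: str) -> str:
    """One preview cycle with ampersands escaped before the textarea is written."""
    markup = escape_for_textarea(typed)
    return submit_as_latin1(browser_textarea_value(markup))


if __name__ == "__main__":
    for sample in ["&eacute;", "&amp;", "&lt;b&gt;", "e&#x109;o", "AT&T"]:
        print(repr(sample), "->", repr(preview_round_trip_fixed(sample)))
```

With the ampersands escaped, the browser's decoding just undoes the escaping, so in this model the text comes back exactly as typed no matter how many times it is previewed. Escaping < as &lt; in the same place would presumably also cover the </textarea> case, but that goes beyond the one-line change suggested here.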

It's getting close though!
