[FX.php List] Odd browser bug or?

Dale Bengston dbengston at preservationstudio.com
Tue May 1 08:30:45 MDT 2007


I guess the value of $_SERVER['PHP_SELF'] is the culprit here.
Depending on the user agent, it probably arrives either URL-encoded
or not. So why not run the value of $_SERVER['PHP_SELF'] through
something like this:

$php_self = htmlspecialchars(urldecode($_SERVER['PHP_SELF']));

...and be sure your error code is evaluating what you think it's  
evaluating?
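
One way to check what the script actually receives is to parse the logged query string by hand. This is only a minimal sketch: the query string is the one from the error log in this thread, the function name inspect_query() is made up for illustration, and it assumes parse_str() splits on "&" the same way PHP does when it fills $_GET (the default arg_separator.input).

```php
<?php
// Hypothetical helper: show how PHP splits a query string, first
// exactly as it was logged, then with HTML entities decoded.
function inspect_query(string $query): array
{
    // Parsed as received: PHP splits on "&", so the entity text
    // "amp;" becomes part of the second key.
    parse_str($query, $raw);

    // Parsed after turning "&amp;" back into "&".
    parse_str(html_entity_decode($query), $decoded);

    return ['raw' => $raw, 'decoded' => $decoded];
}

$result = inspect_query('p=assoc&amp;assoc=58');
print_r($result);
```

With the raw string, the second key comes out as 'amp;assoc' rather than 'assoc', so isset($_GET['assoc']) would be false for such a request even though 'p' is recognized.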

Dale

On May 1, 2007, at 12:08 AM, Erik Andreas Cayré wrote:

> It seems I may not have provided enough background for my question...
>
> First: I'm fully aware the ampersand has special meaning in both  
> HTML and URLs.
> Most browsers tolerate unencoded ampersands in URLs in HTML and  
> recover gracefully when they stumble on one, unless (as Kevin  
> pointed out) it begins a valid entity.
>
> After getting the site working in July 2006 (this is the first  
> PHP/HTML project I've done by myself), I started debugging all the  
> beginner errors I had made. One method was running pages through  
> validator.w3.org, which helped me catch many bugs, including  
> unencoded ampersands in URLs in my HTML code.
>
> I also set up a custom error.php file for Apache, mostly to catch  
> users' bookmarks to the old site (which had been done in Lasso 3)  
> and give them a better experience than stumbling on a  
> page-not-found error.
> In addition, I call a homegrown error function every time I  
> discover something went wrong, so I can analyze and possibly  
> fix/enhance the site.
>
> The error I'm asking about is reported by my custom error reporting  
> at a point in my code where part of the URL has been recognized but  
> the rest has not.
>
> From my index.php:
>
> // select page to show
> 	if (isset($_GET['p'])) {
> 		$p = $_GET['p'];
> 	} else {
> 		$p = '';
> 		$link = '/?p=';
> 	}
> 	if (isset ($p) &&  $p != 'login') {
> 		$link = '/?p=' . $p;
> 	} else {
> 		$link = '/';
> 	}
>
> 	switch ($p) {
>
> 		case 'assoc':
> 			$associd = (isset($_GET['assoc'])) ? intval($_GET['assoc']) : '';
> 			if ($associd == '') {
> 				header('Location: http://' . $_SERVER['HTTP_HOST'] . dirname($_SERVER['PHP_SELF']));
> 				reporterror('undefined url');
> 				exit;
> 			} elseif ($associd < 1) {
> 				header('Location: http://' . $_SERVER['HTTP_HOST'] . dirname($_SERVER['PHP_SELF']));
> 				reporterror('undefined associd: "' . $_GET['assoc'] . '"');
> 				exit;
> 			} else {
> 				$page = 'pages/assoc.php';
>
> The error being reported is an attempt to access
> /?p=assoc&amp;assoc=58 (or another number)
>
> This should only ever happen in case some HTML contains
> /?p=assoc&amp;amp;assoc=58
>
> I have been unable to find any of my HTML looking like the above.
>
> So my question remains: has anyone ever heard of a bug elsewhere  
> which might create this?
> (I'm not saying it's not my problem. I just can't think of where it  
> might be...)
>
>
> On 01/05/2007, at 7:06, Kevin Futter wrote:
>
>> On 30/4/07 6:31 PM, "Erik Andreas Cayré" <erik at cayre.dk> wrote:
>>
>>> I've spent some hours looking through my error log for  
>>> www.dagkort.dk
>>> to fix whatever may be left to fix.
>>>
>>> One recurring error which I don't understand is this:
>>>
>>> URL: /?p=assoc&amp;assoc=58
>>>
>>> To the best of my knowledge no one should be accessing a URL like
>>> this, instead accessing:
>>>
>>> URL: /?p=assoc&assoc=58 (which works fine)
>>>
>>> I've checked my site (though not completely exhaustively), and I
>>> couldn't find any misspelled links that would result in the above...
>>> I see the error generated by several different User-Agents, both
>>> browsers (MSIE 5.0 Win98) and crawlers (eg. nicebot)
>>>
>>> Does anyone on the list know of some bug or other plausible
>>> explanation for this?
>>> I'm guessing certain browsers/crawlers may erroneously attempt to
>>> access a URL like the above, but I'm not certain.
>>>
>>> Any suggestions?
>>
>> As Dale has already pointed out, &amp; is the HTML entity representing
>> the ampersand character. It's actually a requirement of the spec that
>> all ampersands in HTML, INCLUDING URLs*, be encoded (either by entity
>> or character reference). So, the URL causing the error is actually not
>> only legitimate, but matches the spec exactly, and shouldn't be
>> causing an error. I'd say that the user agents involved are choking
>> on it. However, if you're not doing any manual or automatic encoding
>> yourself, the real question becomes: how did it get there?
>>
>> * The reason for this is that compliant browsers treat the ampersand
>> as the beginning of an entity, and that's its only valid function in
>> HTML. So, query string joins using the ampersand risk being
>> interpreted as entities, and if the characters that follow the
>> ampersand actually make up a recognisable entity, they'll be parsed
>> as such and the URL will fail (I've seen it happen!). If you encode
>> the ampersand as an entity, it's parsed properly as an ampersand, not
>> the beginning of an entity. Sounds circular, I know, but that's how
>> it works.
>>
>>
>> -- 
>> Kevin Futter
>> Webmaster, St. Bernard's College
>> http://www.sbc.melb.catholic.edu.au/
>>
>>
>> _______________________________________________
>> FX.php_List mailing list
>> FX.php_List at mail.iviking.org
>> http://www.iviking.org/mailman/listinfo/fx.php_list
>
>
>
> ---
> Erik Andreas Cayré
> Spangsbjerg Møllevej 169
> 6705 Esbjerg Ø
>
> Private tel.: 75150512
> Mobile: 40161183
>
> ---
> »Interest can create learning on a scale, compared to fear, as a  
> nuclear explosion compared to a firecracker.«
> --Stanley Kubrick
>
> »Only p....d-off people can change the world. Innovation is not  
> created by 'market analysis', but by people who are insanely  
> irritated with the state of things.«
> --Tom Peters
>
> »If you can't explain it simply, you don't understand it well  
> enough.«
> -- Albert Einstein
>
> »If you don't have time to do it right, when will you have time  
> to do it over?«
> -- John Wooden, basketball coach
>
>


