[FX.php List] Odd browser bug or?

Erik Andreas Cayré erik at cayre.dk
Tue May 1 10:59:30 MDT 2007


On 01/05/2007 at 22:30, Dale Bengston wrote:

> I guess the value of $_SERVER['PHP_SELF'] is the culprit here.  
> Depending on the user agent/user's browser, this probably gets url  
> encoded or not. So, why not run the value of $_SERVER['PHP_SELF']  
> through something like this:
>
> $php_self = htmlspecialchars(urldecode($_SERVER['PHP_SELF']));

Sorry, I don't understand your point.

$_SERVER['PHP_SELF'] lives inside PHP, so it is unaffected by User  
Agents...
I use it only for redirect purposes, so it is only seen by the HTTP  
layer (right?)


> ...and be sure your error code is evaluating what you think it's  
> evaluating?

Good point.


This is an example of what I'm trying to understand. The following  
was sent via email from my custom error reporting:

Wed, 25 Apr 2007 23:47:02 +0200
		IP: 222.46.18.34
		Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
		URL: /?p=region&region=4
		Referrer: http://www.dagkort.dk/?p=region&region=4
		Username: None
		Comment: undefined url

The Comment shows the trigger came here:

		case 'region':
			$regionid = (isset($_GET['region'])) ? intval($_GET['region']) : "";
			if ($regionid == "") {
				header("Location: http://" . $_SERVER['HTTP_HOST'] . dirname($_SERVER['PHP_SELF']));
				reporterror("undefined url"); // <-- the report was triggered here
				exit;
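
A minimal sketch of a stricter check, not taken from the site code: the snippet above compares the result of intval() with "", and how 0 == "" evaluates depends on the PHP version (true before PHP 8, false from PHP 8 on). The helper name read_positive_id() and its null return value are assumptions made for illustration, and it needs PHP 7.1 or later:

```php
<?php
// Hypothetical helper, not part of the original site code.
// Returns the parameter as a positive integer, or null when it is
// missing, empty, non-numeric, or zero.
function read_positive_id(array $query, string $key): ?int
{
    if (!isset($query[$key]) || !is_string($query[$key])) {
        return null;
    }
    // Accept 1 to 9 digits without a leading zero, so the cast cannot overflow.
    if (preg_match('/^[1-9][0-9]{0,8}$/D', $query[$key]) !== 1) {
        return null;
    }
    return (int) $query[$key];
}

var_dump(read_positive_id(['region' => '4'], 'region'));   // int(4)
var_dump(read_positive_id(['p' => 'region'], 'region'));   // NULL
var_dump(read_positive_id(['region' => 'abc'], 'region')); // NULL
```

With a helper like this, the "parameter missing" and "parameter invalid" cases could be tested with === null instead of relying on a loose comparison.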

This is my custom error reporting function:

	// Report a notice or error
	function reporterror($comment) {

		global $webmastermail;
		$agent = $_SERVER['HTTP_USER_AGENT'];
		$uri = $_SERVER['REQUEST_URI'];
		$user = (isset($_SERVER['PHP_AUTH_USER'])) ? $_SERVER['PHP_AUTH_USER'] : "None";
		$ip = $_SERVER['REMOTE_ADDR'];
		$ref = (isset($_SERVER['HTTP_REFERER'])) ? $_SERVER['HTTP_REFERER'] : "None";
		$dtime = date('r');

		$to = $webmastermail;
		$subject = 'From php: ' . html_entity_decode($comment);
		$message = $dtime . '
		IP: ' . $ip . '
		Agent: ' . $agent  . '
		URL: ' . $uri . '
		Referrer: ' . $ref . '
		Username: ' . $user . '
		Comment: ' . html_entity_decode($comment);
		$headers = 'From: www.dagkort.dk <www at dagkort.dk>' . "\r\n" . "Reply-to: " . $webmastermail . "\r\n" . 'X-Mailer: PHP/' . phpversion();

		mail($to, $subject, $message, $headers);
	}

I still can't see what's up...
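
For reference, a small sketch of how PHP itself splits a query string that still contains the entity. It uses only parse_str(); the sample query strings are taken from the URLs in this thread, and it assumes the default arg_separator.input setting of "&". The function name find_entity_keys() is hypothetical:

```php
<?php
// What index.php would see for the reported URL versus the working one.
parse_str('p=assoc&amp;assoc=58', $reported);
parse_str('p=assoc&assoc=58', $working);

var_dump(isset($reported['assoc']));     // bool(false)
var_dump(isset($reported['amp;assoc'])); // bool(true)
var_dump(isset($working['assoc']));      // bool(true)

// Hypothetical helper: list parameter names that start with "amp;",
// i.e. names left over from an "&amp;" that was never decoded.
function find_entity_keys(array $query): array
{
    $hits = [];
    foreach (array_keys($query) as $key) {
        if (strpos((string) $key, 'amp;') === 0) {
            $hits[] = (string) $key;
        }
    }
    return $hits;
}

var_dump(find_entity_keys($reported));
```

If that is what happens, $_GET['assoc'] is never set for the reported URL, which would be consistent with the 'undefined url' branch being taken. It does not explain where the undecoded URL comes from.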

> Dale
>
> On May 1, 2007, at 12:08 AM, Erik Andreas Cayré wrote:
>
>> It seems I may not have provided enough background for my question...
>>
>> First: I'm fully aware the ampersand has special meaning in both  
>> HTML and URLs.
>> Most browsers tolerate unencoded ampersands in URLs in HTML and  
>> handle them gracefully, unless (as Kevin pointed out) the text after  
>> the ampersand forms a valid entity.
>>
>> After getting the site working in July 2006 – this is the first  
>> PHP/HTML project I've done by myself – I started debugging all the  
>> beginner errors I had made. One method was running pages through  
>> validator.w3.org, and that helped me catch many bugs, including  
>> unencoded ampersands in URLs in my HTML code.
>>
>> I also set up a custom error.php file for Apache, mostly to catch  
>> users' bookmarks to the old site (which had been done in Lasso 3)  
>> and give them a better experience than stumbling on a page not  
>> found error.
>> In addition I call a homegrown error function every time I  
>> discover something went wrong, so I can analyze and possibly  
>> fix/enhance the site.
>>
>> The error I'm asking about is reported by my custom error  
>> reporting at a point in my code where part of the URL was  
>> recognized but the rest was not.
>>
>> From my index.php:
>>
>> // select page to show
>> 	if (isset($_GET['p'])) {
>> 		$p = $_GET['p'];
>> 	} else {
>> 		$p = '';
>> 		$link = '/?p=';
>> 	}
>> 	if (isset ($p) &&  $p != 'login') {
>> 		$link = '/?p=' . $p;
>> 	} else {
>> 		$link = '/';
>> 	}
>>
>> 	switch ($p) {
>>
>> 		case 'assoc':
>> 			$associd = (isset($_GET['assoc'])) ? intval($_GET['assoc']) : '';
>> 			if ($associd == '') {
>> 				header('Location: http://' . $_SERVER['HTTP_HOST'] . dirname($_SERVER['PHP_SELF']));
>> 				reporterror('undefined url');
>> 				exit;
>> 			} elseif ($associd < 1) {
>> 				header('Location: http://' . $_SERVER['HTTP_HOST'] . dirname($_SERVER['PHP_SELF']));
>> 				reporterror('undefined associd: "' . $_GET['assoc'] . '"');
>> 				exit;
>> 			} else {
>> 				$page = 'pages/assoc.php';
>>
>> The error being reported is an attempt to access
>> /?p=assoc&amp;assoc=58 (or another number)
>>
>> This should only ever happen in case some HTML contains
>> /?p=assoc&amp;amp;assoc=58
>>
>> I have been unable to find any of my HTML looking like the above.
>>
>> So my question remains: has anyone ever heard of a bug elsewhere  
>> which might create this?
>> (I'm not saying it's not my problem. I just can't think of where  
>> it might be...)
>>
>>
>> On 01/05/2007 at 7:06, Kevin Futter wrote:
>>
>>> On 30/4/07 6:31 PM, "Erik Andreas Cayré" <erik at cayre.dk> wrote:
>>>
>>>> I've spent some hours looking through my error log for  
>>>> www.dagkort.dk
>>>> to fix whatever may be left to fix.
>>>>
>>>> One recurring error which I don't understand is this:
>>>>
>>>> URL: /?p=assoc&amp;assoc=58
>>>>
>>>> To the best of my knowledge no one should be accessing a URL like
>>>> this, instead accessing:
>>>>
>>>> URL: /?p=assoc&assoc=58 (which works fine)
>>>>
>>>> I've checked my site (though not completely exhaustively), and I
>>>> couldn't find any misspelled links that would result in the above...
>>>> I see the error generated by several different User-Agents, both
>>>> browsers (MSIE 5.0 Win98) and crawlers (e.g. nicebot)
>>>>
>>>> Does anyone on the list know of some bug or other plausible
>>>> explanation for this?
>>>> I'm guessing certain browsers/crawlers may erroneously attempt to
>>>> access a URL like the above, but I'm not certain.
>>>>
>>>> Any suggestions?
>>>
>>> As Dale has already pointed out, &amp; is the HTML entity representing the
>>> ampersand character. It's actually a requirement of the spec that all
>>> ampersands in HTML, INCLUDING URLs*, be encoded (either by entity or
>>> character reference). So, the URL causing the error is actually not only
>>> legitimate, but matching the spec exactly, and shouldn't be causing an
>>> error. I'd say that the user agents involved are choking on it. However, if
>>> you're not doing any manual or automatic encoding yourself, the real
>>> question becomes how did it get there?
>>>
>>> * The reason for this is that compliant browsers treat the ampersand as the
>>> beginning of an entity, and that's its only valid function in HTML. So,
>>> query string joins using the ampersand risk being interpreted as entities,
>>> and if the characters that follow the ampersand actually make up a
>>> recognisable entity, they'll be parsed as such and the URL will fail (I've
>>> seen it happen!). If you encode the ampersand as an entity, it's parsed
>>> properly as an ampersand, not the beginning of an entity. Sounds circular I
>>> know, but that's how it works.
>>>
>>>
>>> -- 
>>> Kevin Futter
>>> Webmaster, St. Bernard's College
>>> http://www.sbc.melb.catholic.edu.au/
>>>
>>>
>>> _______________________________________________
>>> FX.php_List mailing list
>>> FX.php_List at mail.iviking.org
>>> http://www.iviking.org/mailman/listinfo/fx.php_list
>>
>>
>>



---
Erik Andreas Cayré
Spangsbjerg Møllevej 169
6705 Esbjerg Ø

Home tel: 75150512
Mobile: 40161183

---
»Interest can produce learning on a scale compared to fear as a  
nuclear explosion to a firecracker.«
--Stanley Kubrick

»Only p....ed-off people can change the world. Innovation is not  
created by 'market analysis', but by people who are insanely irritated  
by the state of things.«
--Tom Peters

»If you can't explain it simply, you don't understand it well enough.«
-- Albert Einstein

»If you don't have time to do it right, when will you have time to do  
it over?«
-- John Wooden, basketball coach
