[FX.php List] Security Concerns with FileMaker Website

Jason LEWIS jasonlewis at weber.edu
Wed Jan 24 01:41:26 MST 2007


Ed,

There are a few things you can do:

1. Sue to get them to stop (I don't like this option because it makes
enemies and takes a VERY long time.)
2. Use a robots.txt file (This only helps with bots that actually honor
it.  Google honors robots.txt.)
3. Detect the type of browser before you give them anything and ship
out bad information for unwanted browser types.  (This is only good if
the bot owner does not imitate browser variables.)
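
For option 2, here is a minimal sketch of a robots.txt that admits only
named crawlers and asks everyone else to stay out.  The crawler names
(Googlebot, msnbot, Slurp) and the function name build_robots_txt are
assumptions for illustration; check each search engine's documentation
for its current user-agent token.  A static file with the same contents
works just as well, and badly behaved bots will ignore it either way.

```php
<?php
// Assumed crawler tokens: Googlebot (Google), msnbot (Live), Slurp (Yahoo).
$allowedCrawlers = array('Googlebot', 'msnbot', 'Slurp');

// Hypothetical helper: returns robots.txt text as a string.
function build_robots_txt($allowed)
{
    $lines = array();
    foreach ($allowed as $name) {
        // An empty Disallow line means this crawler may fetch everything.
        $lines[] = 'User-agent: ' . $name;
        $lines[] = 'Disallow:';
        $lines[] = '';
    }
    // Every other crawler is asked to stay out of the whole site.
    $lines[] = 'User-agent: *';
    $lines[] = 'Disallow: /';
    return implode("\n", $lines) . "\n";
}

// To serve it from a script:
//   header('Content-Type: text/plain');
//   echo build_robots_txt($allowedCrawlers);
```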

Of these options, I prefer #3 because I can give the bots something to
harvest: bad information.  In Perl, I use $ENV{HTTP_USER_AGENT}, but I
am not sure how to read this in PHP.  Anyone else know?  A quick search
returned nothing on this.  Could this involve $HTTP_ENV_VARS?
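
On the PHP side, the User-Agent header is normally exposed as
$_SERVER['HTTP_USER_AGENT'].  A minimal sketch of option 3 follows.  The
function name is_unwanted_agent and the pattern list are hypothetical,
and treating a missing User-Agent as unwanted is an assumption.  As
noted above, this only catches bots that do not imitate a browser.

```php
<?php
// Hypothetical list of user-agent substrings to treat as unwanted bots.
$unwantedAgents = array('EmailSiphon', 'EmailCollector', 'WebCopier');

// Returns true when the user-agent string matches any unwanted pattern.
function is_unwanted_agent($userAgent, $patterns)
{
    // Assumption: a real browser always sends a User-Agent header.
    if ($userAgent === '') {
        return true;
    }
    foreach ($patterns as $pattern) {
        // stripos() is a case-insensitive substring search.
        if (stripos($userAgent, $pattern) !== false) {
            return true;
        }
    }
    return false;
}

// The header may be absent, so check before reading it.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// The page can branch on this flag to print decoy data or the real listing.
$serveDecoy = is_unwanted_agent($userAgent, $unwantedAgents);
```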

Jason

>>> elford at cs.bu.edu 01/23/2007 10:18 PM >>>
Hello everyone,
In the past hour, I've done some analysis of various logs and emails,
and I've come to a chilling realization that I've never had before
about bots harvesting information from websites -- I knew it
happened, but I never knew the scope of the problem until tonight --
and this is a low traffic website!

So, I have a website which contains a public listing of email
addresses and websites from a FileMaker database.  I want to stop
unknown bots from crawling the site.  All of the data comes out of
FileMaker, nicely formatted as links for the end user's clicking
convenience.  I have a solution to keep email addresses from being
harvested, but I was wondering if anyone knows of a way to prevent
website addresses from being harvested while keeping them clickable
as hyperlinks.

I thought maybe a PHP redirect link, like redirect.php?id=16 where
redirect puts a user at the website listed in record 16, but once the
PHP is all said and done, we're still at the linked website, so that
doesn't really prevent anything from being harvested.

Is there a way to maybe detect if a link was actually clicked by a  
person, and not just passed through by an automated bot?  PHP is  
preferable for such a solution -- JavaScript is too easy to turn  
off.  Or, is there a way to specify that only bots from places like  
Google, Live, and Yahoo are allowed to crawl the site?

Hopefully my predicament is clear.  I need to solve this ASAP...

--Ed
---------------------
http://www.edwardford.net 



