[FX.php List] Security Concerns with FileMaker Website

Gjermund Gusland Thorsen ggt667 at gmail.com
Thu Jan 25 11:56:28 MST 2007


Rule number one: never put an email address or mailto link on a public site.

ggt667

On 1/25/07, Joel Shapiro <jsfmp at earthlink.net> wrote:
> Can anyone briefly explain the risks of these bots and how "nasty" (a
> term used in this thread) they can really be?
>
> I've looked at the link Gjermund posted, and it looks like the
> biggest real problems are email harvesting and use of bandwidth.
> Ed's original post was because he has many emails and websites on his
> site, but if there are just one or two contact email addresses on a
> site, what's the big risk?
>
> Thanks for enlightening me.
>
> -Joel
>
>
> On Jan 25, 2007, at 10:11 AM, Edward L. Ford wrote:
>
> > I didn't think about checking user agents.  I just did a cursory
> > investigation of this, and I may personally implement this system.
> > For the bots that say they're a real browser, I'll set up other
> > roadblocks to keep the info on my site (relatively) protected from
> > harvesting.
> >
> > For others interested in checking user agents, I found the
> > following to investigate:
> > o) Look in the $_SERVER['HTTP_USER_AGENT'] string.  More info @
> > http://us3.php.net/reserved.variables   There's also a built-in
> > get_browser() function in PHP, which I never knew about.
> >
> > o) PEAR also has some tools.  Check out
> > http://pear.php.net/package/Net_UserAgent_Detect
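
A minimal sketch of the user-agent check Ed describes, assuming PHP 7 or later. The helper name looksLikeKnownBot() and the substring list are hypothetical and purely illustrative, not a vetted blocklist. A bot that sends a real browser's user-agent string will pass straight through, so this is a first filter only.

```php
<?php
// Hypothetical helper: returns true when the user-agent string is empty
// or contains one of the given substrings (case-insensitive).
// The default substrings are illustrative assumptions only.
function looksLikeKnownBot(string $userAgent, array $needles = ['bot', 'crawler', 'spider', 'harvest']): bool
{
    if (trim($userAgent) === '') {
        return true; // an empty user agent is treated as suspicious here
    }
    foreach ($needles as $needle) {
        if (stripos($userAgent, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// In a page, the header is read from $_SERVER; it may be missing entirely.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$isBot = looksLikeKnownBot($userAgent);
```

Note that the substring 'bot' also matches crawlers you may want to allow, such as Googlebot, so an allow list would have to be checked before this test.
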
> >
> > --Ed
> >
> > -----------------------------------
> > http://www.edwardford.net
> >
> > On Jan 25, 2007, at 11:41 AM, Gjermund Gusland Thorsen wrote:
> >
> >> Alternative 3 works in theory but still leaves some bots crawling,
> >> as the worst bots tell your web server that they are the most popular
> >> browser.
> >>
> >> ggt667
> >>
> >> On 1/24/07, Jason LEWIS <jasonlewis at weber.edu> wrote:
> >>> Ed,
> >>>
> >>> There are a few things you can do:
> >>>
> >>> 1. Sue to get them to stop.  (I don't like this option because it
> >>> makes enemies and takes a VERY long time.)
> >>> 2. Use a robots.txt.  (This option is good for bots that actually
> >>> honor it.  Google honors the robots.txt.)
> >>> 3. Detect the type of browser before you give them anything and ship
> >>> out bad information for unwanted browser types.  (This is only good
> >>> if the bot owner does not imitate browser variables.)
> >>>
> >>> As for my options, I prefer #3, as I can give the bots something that
> >>> they are looking for: bad information.  In Perl, I use
> >>> $ENV{HTTP_USER_AGENT}, but I am not sure how to call this in PHP.
> >>> Anyone else know?  A quick search returned nothing on this.  Could
> >>> this involve $HTTP_ENV_VARS?
> >>>
> >>> Jason
> >>>
> >>> >>> elford at cs.bu.edu 01/23/2007 10:18 PM >>>
> >>> Hello everyone,
> >>> In the past hour, I've done some analysis of various logs and emails,
> >>> and I've come to a chilling realization that I've never had before
> >>> about bots harvesting information from websites -- I knew it
> >>> happened, but I never knew the scope of the problem until tonight --
> >>> and this is a low-traffic website!
> >>>
> >>> So, I have a website which contains a public listing of email
> >>> addresses and websites from a FileMaker database.  I want to stop
> >>> unknown bots from crawling the site.  All of the data comes out of
> >>> FileMaker, nicely formatted as links for the end user's clicking
> >>> convenience.  I have a solution to keep email addresses from being
> >>> harvested, but I was wondering if anyone knows of a way to prevent
> >>> website addresses from being harvested while keeping them clickable
> >>> as hyperlinks.
> >>>
> >>> I thought maybe a PHP redirect link, like redirect.php?id=16 where
> >>> redirect puts a user at the website listed in record 16, but once
> >>> the PHP is all said and done, we're still at the linked website, so
> >>> that doesn't really prevent anything from being harvested.
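
The redirect.php?id=16 idea might look roughly like this, assuming PHP 7.1 or later. The $sites array is a hypothetical stand-in for the FileMaker lookup (no FX.php calls are shown), and redirectTargetFor() is a hypothetical helper. As Ed says, this only keeps the address out of the page source; any client that follows the link still learns the destination.

```php
<?php
// Hypothetical stand-in for the FileMaker lookup: record id => website URL.
$sites = [
    16 => 'http://www.example.com/',
];

// Returns the URL stored for $id, or null if the id is not a plain
// non-negative integer, is unknown, or the stored value is not an
// http(s) URL.
function redirectTargetFor($id, array $sites): ?string
{
    if (!is_string($id) && !is_int($id)) {
        return null; // e.g. redirect.php?id[]=16 arrives as an array
    }
    if (!ctype_digit((string) $id)) {
        return null;
    }
    $url = $sites[(int) $id] ?? null;
    if ($url === null || !preg_match('#^https?://#i', $url)) {
        return null;
    }
    return $url;
}

// redirect.php?id=16
$target = redirectTargetFor($_GET['id'] ?? '', $sites);

// Only send headers when running under a web server.
if (PHP_SAPI !== 'cli') {
    if ($target === null) {
        http_response_code(404);
    } else {
        header('Location: ' . $target, true, 302);
    }
    exit;
}
```

Validating the id and redirecting only to stored URLs keeps the script from becoming an open redirect.
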
> >>>
> >>> Is there a way to maybe detect if a link was actually clicked by a
> >>> person, and not just passed through by an automated bot?  PHP is
> >>> preferable for such a solution -- JavaScript is too easy to turn
> >>> off.  Or, is there a way to specify that only bots from places like
> >>> Google, Live, and Yahoo are allowed to crawl the site?
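
For the last question, one approach is to confirm a crawler by a reverse DNS lookup followed by a forward lookup, rather than trusting the user-agent string. The sketch below assumes PHP 7.1 or later and IPv4 addresses. The helper name isVerifiedCrawler() is hypothetical, and the hostname suffixes are illustrative assumptions; check each search engine's own documentation for the hostnames it actually uses. The resolver callbacks default to PHP's gethostbyaddr() and gethostbynamel().

```php
<?php
// Hypothetical helper. The default suffixes are illustrative assumptions,
// not an authoritative list.
function isVerifiedCrawler(
    string $ip,
    array $allowedSuffixes = ['.googlebot.com', '.crawl.yahoo.net', '.search.msn.com'],
    ?callable $reverse = null,
    ?callable $forward = null
): bool {
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    // Step 1: reverse lookup. gethostbyaddr() returns the IP unchanged on
    // failure and false on malformed input.
    $host = $reverse($ip);
    if (!is_string($host) || $host === $ip) {
        return false;
    }

    // Step 2: the hostname must end with one of the allowed suffixes.
    $host = strtolower($host);
    $matches = false;
    foreach ($allowedSuffixes as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            $matches = true;
            break;
        }
    }
    if (!$matches) {
        return false;
    }

    // Step 3: the forward lookup must map the hostname back to the same
    // IP; otherwise anyone who controls their own reverse DNS could
    // claim a search-engine hostname.
    $ips = $forward($host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

In a page this would be called with $_SERVER['REMOTE_ADDR']. DNS lookups are slow, so the result is worth caching per IP address.
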
> >>>
> >>> Hopefully my predicament is clear.  I need to solve this ASAP...
> >>>
> >>> --Ed
> >>> ---------------------
> >>> http://www.edwardford.net
> >>>
> >>>
> >>> _______________________________________________
> >>> FX.php_List mailing list
> >>> FX.php_List at mail.iviking.org
> >>> http://www.iviking.org/mailman/listinfo/fx.php_list

