Free Script: Multi Domain Backlink Finder (Google, Yahoo, Altavista, AllTheWeb)
Hey everybody out there
I have a new script for you all. I’m not going to walk everyone through the whole thing, but I will post a demo function, and a link to a zip file with the source.
This program takes any number of domains (run time increases with each one, so bulk checking is not the name of the game here) and displays a table listing each domain and how many backlinks each search engine found for it. It automatically checks both domain.com and www.domain.com.
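To make the "checks both domain.com and www.domain.com" part concrete, here is a rough sketch of how the two variants could be built from whatever the user typed. `hostVariants()` is a hypothetical helper written for this post, not a function from the zip, and the normalization rules (lowercasing, dropping the scheme and path) are my assumptions.

```php
<?php
// Hypothetical helper (not from the zip): build the two host variants to check.
function hostVariants(string $domain): array
{
    // Normalize: trim whitespace and lowercase.
    $host = strtolower(trim($domain));
    // Drop a leading http:// or https:// if the user pasted a full URL.
    $host = preg_replace('#^https?://#', '', $host);
    // Keep only the host part, discarding any path.
    $host = explode('/', $host)[0];
    // Strip a leading "www." so both variants derive from the bare domain.
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }
    return [$host, 'www.' . $host];
}
```

Each entered domain then costs two lookups per search engine, which is why run time grows quickly with the size of the list.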
How to Use:
- Put the script on your server. God willing, it works on both Linux and Windows (confirmed on Windows).
- MAKE SURE YOU HAVE CURL ENABLED IN YOUR PHP.INI FILE
- Go to the script in your web browser.
- Enter the domains, one per line.
- Hit Submit and enjoy.
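For the curious, the first checks a script like this needs can be sketched as follows. This is a minimal illustration, not the code from the zip: `parseDomainList()` is a hypothetical name, and the idea of dropping duplicates is my assumption.

```php
<?php
// Hypothetical helper: split textarea input into a clean list of domains.
function parseDomainList(string $raw): array
{
    // Split on any line ending (\r\n from Windows browsers, \r, or \n).
    $lines = preg_split('/\r\n|\r|\n/', $raw);
    // Trim each line and discard the empty ones.
    $lines = array_filter(array_map('trim', $lines), 'strlen');
    // Remove duplicates and reindex from 0.
    return array_values(array_unique($lines));
}

// Say so up front if the cURL extension is missing.
if (!extension_loaded('curl')) {
    echo "cURL is not enabled; enable the curl extension in php.ini.\n";
}
```

Checking `extension_loaded('curl')` before anything else gives a readable message instead of a fatal "undefined function" error on the first `curl_init()` call.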
My only request is that if you use it and enjoy it, you give me a nice backlink for it.
Oh yeah, and if you use it publicly, or semi-publicly, it IS mandatory that you give me a backlink. And yes, that includes if you use snippets and modify it. Obviously not enforceable, but come on now. I’ve given you guys a look at a nice gold mine on this blog.
Screen Shots: [two screenshot images]
Download The Script: XMCP’s Multi-Domain Backlink Finder (click the link)
Looking at the Code:
I’m only going to examine one function here, because really, that’s all you need to understand to get the idea. It’s basically just an implementation of the cURL script tutorial we did in a previous entry. Feel free to examine the rest of the code, although I didn’t comment it (oops!). Maybe I will later.
function getYahooLinks($sURL)
{
    $url="https://siteexplorer.search.yahoo.com/advsearch?p=http%3A%2F%2F".urlencode($sURL)."&bwm=i&bwmo=d&bwmf=u";//this is the url we're going to be checking
    $useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";//pretend we're firefox
    $ch = curl_init($url);//initialize curl
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);//set the user-agent to firefox
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);//if this is true, nothing ever verifies
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);//we want the response data
    curl_setopt($ch, CURLOPT_COOKIEJAR, "./cookie-jar.txt");//write cookies here
    curl_setopt($ch, CURLOPT_COOKIEFILE, "./cookie-jar.txt");//read cookies from here
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);//follow redirects
    curl_setopt($ch, CURLOPT_HEADER, 1);//I really don't care about the header, but why not
    curl_setopt($ch, CURLOPT_AUTOREFERER, 0);//referers can only hurt us
    $data=curl_exec($ch);//execute!
    curl_close($ch);
    if($data===false)//the request itself failed
    {
        return(0);
    }
    $spl=explode("</strong> of about <strong>",$data);//this is in front of the data that we want
    if(!isset($spl[1]))//the marker isn't on the page, so there is no count
    {
        return(0);
    }
    $spl2=explode("</strong>",$spl[1]);//this is behind it
    $ret=trim($spl2[0]);//trim the pretty data
    if(strlen($ret)==0)//if there was nothing there
    {
        return(0);//we want it to return 0 if the spot was empty
    }
    else
    {
        return($ret);//return our answer
    }
}
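
The interesting part of that function is the double explode: grab whatever sits between a known "before" string and a known "after" string. Here is the same idea as a small standalone sketch you can run without hitting any search engine. `textBetween()` is a hypothetical helper name, and the sample markup is made up to match the shape the function looks for; it is not a captured response.

```php
<?php
// Hypothetical helper: return the trimmed text between two markers, or null if absent.
function textBetween(string $haystack, string $before, string $after): ?string
{
    $start = strpos($haystack, $before);
    if ($start === false) {
        return null; // opening marker not found
    }
    $start += strlen($before);
    $end = strpos($haystack, $after, $start);
    if ($end === false) {
        return null; // closing marker not found
    }
    return trim(substr($haystack, $start, $end - $start));
}

// Made-up sample markup in the shape the function above searches for.
$sample = 'Results <strong>1 - 10</strong> of about <strong>1,234</strong> for example.com';
$count = textBetween($sample, '</strong> of about <strong>', '</strong>');
// Strip thousands separators before converting to an integer.
$number = $count === null ? 0 : (int) str_replace(',', '', $count);
echo $number, "\n";
```

Scraping by string markers like this is fragile: the moment the search engine changes its markup, the lookup quietly falls back to 0, so treat a sudden run of zeros as a sign the markers need updating.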

November 8th, 2007 at 3:40 am
How long will it take for each domain? Is it possible to check big lists, like over 2000 names? Can it be used on shared hosts like HostGator or Dreamhost without the account being banned? Thanks for your answers.
November 8th, 2007 at 10:23 am
It goes relatively fast.
The trick is that it checks both the normal domain and the domain with http://www. prepended to it. For 5 entered domains (10 to check in total), it took 30 seconds.
I’ve tested it with up to 400, which I entered, then just kind of walked away from the computer.
Not sure about HostGator/Dreamhost; I’d ask their support. The only issue I can see is PHP’s max_execution_time being exceeded.
I recommend running it on your home computer with XAMPP or PHPerl; curl is easier to install for the first one.
November 11th, 2007 at 12:16 am
I tried your script, but I’m running into an issue with Google. It looks like they can detect the pattern, and after a few URLs it returns an error page saying that you are a bot, or infected, or have spyware, and asks you to enter a captcha to continue.
Have you figured out a way to fool it so it doesn’t think you are a bot?
November 11th, 2007 at 10:03 am
That, regrettably, is Google’s own limit on querying too fast. I broke it up as much as possible. How many domains were you trying to do?
November 11th, 2007 at 10:49 am
I was trying to do about 20. I was looking into throttling it. I modified the script to write its results out to a file. So now I’m trying to figure out how to call it on a scheduled basis so it does a few domains, sleeps, then does a few more. But I haven’t figured out a clean way to call it automatically.
I’m trying to do about 500 domains overall as I’m trying to do some analysis on a specific segment of sites. I don’t care if it takes a couple days to accomplish, just trying to figure out how to get them all.
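One way the "do a few, sleep, do a few more, write to a file" idea could look, as a rough sketch only. `runThrottled()` is a hypothetical function, the batch size and pause are arbitrary guesses rather than limits known to keep Google happy, and `$lookup` stands for any callable that takes a domain and returns a count (for example a wrapper around `getYahooLinks()`).

```php
<?php
// Hypothetical batch runner: looks up each domain, appends the result to a
// file, and pauses after every full batch. Returns how many domains it did.
function runThrottled(array $domains, callable $lookup, string $outFile,
                      int $batchSize = 5, int $pauseSeconds = 60): int
{
    // Collect domains already recorded, so an interrupted run can resume.
    $done = [];
    if (is_file($outFile)) {
        $lines = file($outFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
        foreach ($lines as $line) {
            $done[explode("\t", $line)[0]] = true;
        }
    }
    $processed = 0;
    foreach ($domains as $domain) {
        if (isset($done[$domain])) {
            continue; // already in the output file
        }
        // Append one tab-separated line per domain as soon as it is known.
        file_put_contents($outFile, $domain . "\t" . $lookup($domain) . "\n", FILE_APPEND);
        $processed++;
        // Pause after each full batch to keep the request rate low.
        if ($processed % $batchSize === 0 && $pauseSeconds > 0) {
            sleep($pauseSeconds);
        }
    }
    return $processed;
}
```

Because finished domains are skipped on the next call, the same script can simply be rerun (by hand or from a scheduler) until the list is exhausted. Run from a web request, a long job like this will likely hit max_execution_time, so calling `set_time_limit(0)` first or running it from the PHP command line is probably needed.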
November 11th, 2007 at 10:50 am
Would you mind commenting on the URL you chose for Google? I notice that the numbers that come back are different than if you just do a link:somedomain.com.
Thanks
November 11th, 2007 at 10:54 am
Whoa, I didn’t see how many comments this got.
2000 will result in a ban. I’ve done OK with 120. Haven’t tested higher.
Right now, Google seems to be throttling worse than normal. Hopefully all will be well soon. Maybe when I wake up, I’ll build in proxy support. I’ll also answer the Google question then. (I’ve been up for 5 minutes.)
November 11th, 2007 at 11:03 am
Do you know of a script or have you ever done one for MSN/Live search?
November 11th, 2007 at 11:53 am
I tested MSN/Live, but it wasn’t returning backlinks for any sites at all, so I left it out.
Example: http://search.msn.com/results.aspx?q=link%3Awww.google.com&go=Search&form=QBRE
October 5th, 2008 at 10:26 am
Very cool script. Thanks very much!!!