Google docsHow toScrapingTechnical SEOTools

Bulk ImportXml tool & source (Google docs spreadsheets)

By March 29, 2013 No Comments

There’s been a few of you requesting a way to bypass the 50 importxml limit in Google docs so I’ve decided to release something publicly.

Click here to view the spreadsheet

Just make sure to sign in, then make a copy, then press the run button once to authorize the script. If the script doesn’t run, or isn’t there, see the section below.

How does it work?

Update: July 2019 – I barely remember why this hack works, but the sheet went down and I had a few requests to recreate it, so that’s what I did.

The only way I could do this efficiently was to use a script to set up the ImportXml formula in the sheet. This means that I was never able to call importxml with the Sheet class, setFormula method and then replace the formula fast enough. Even if I did manage to copyvalues and clear the importxml formula from the cell, it would either timeout, result in errors or very rarely…work.

Another fun issue was that Google docs would store the results for importxml in cache, but would display N/A# when I ran through one loop.

The script isn’t authorizing, or it’s not there!

Yep, that can happen – here is the source code.

function bulkXml() {

  var sheet = SpreadsheetApp.getActiveSheet();
  var lastRow = sheet.getLastRow()
  var Num = sheet.getRange(2,1,lastRow,1).getValues();
  
  Num = Num.filter(function(u) { 
   return u[0].length > 2
  })

  for (var y=0;y<2;y++) {
    for (var x=2;x-2 < Num.length;x++)  {
      var url = sheet.getRange(x,1).getValue();
      sheet.getRange(2,6).setValue(url);
      var xpathResult = sheet.getRange(3,6).getValue();    
      var counter = x - 1;
      sheet.getRange("C4").setValue(" PLEASE WAIT...CURRENTLY FETCHING " + counter + " OUT OF " + Num.length);
      
      if (y===1){
        sheet.getRange(x,2).setValue(xpathResult);
        sheet.getRange("C4").setValue("PROCESSED " + counter + " OUT OF " + Num.length);
        SpreadsheetApp.flush();
      }   
    }    
  } 
}

Click on Tools > Script editor and copy paste into there. Make sure you save the script and then you should be good to go.

When I click on the button nothing happens!

I’ve assigned scripts to the buttons, but they sometimes get lost when you make a copy of the Google doc

Right click on the Run button, in the top right you’ll see a drop down arrow. Select assign script, then enter: bulkXml

Click here to view the spreadsheet

Dave

Author Dave

More posts by Dave