Working with Pagination, Infinite Scroll and JavaScript in Kimono

Shaumik Daityari

Back in July 2014, I published the Web Scraper’s guide to Kimono. Being a web scraper myself, I had concluded that YC-funded Kimono had definitely done a commendable job, but I was also critical of its shortcomings — most importantly, the limitations of pagination. I have been closely following the developments at Kimono Labs since.

Over the last month in particular, they have released quite a few new features, some of them focussing on the pagination issues I raised in my earlier post. Therefore, I decided to revisit Kimono and look at these new enhancements.

JavaScript modifications to API results

The first feature we’re going to look at is the addition of a JavaScript API. It’s recently been added as an experimental feature that lets you modify the results after extracting them through the API you created.

It has a lot of potential applications. Data extracted from web pages is often not in the format we want, so it needs post-processing before it can be used. By letting us write JavaScript functions that modify the extracted data, Kimono speeds up the process by removing the need for a separate clean-up step after the API call.

To demonstrate this feature, let’s select the API we created earlier from the list of APIs shown after you log in to your account. In the Modify Results tab, you can see a text box where you can enter the JavaScript function that will be applied to your results.

JS modify API

The mini editor that is provided is quite helpful, as it tells you in real time if there are errors in your code. Let us try to append the total number of results to the data:

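function transform(data) {
  // Sum the number of rows across every collection in the results.
  // for...in works whether results is an array or an object keyed by
  // collection name (an object has no length property to loop over).
  var total_count = 0;

  for (var key in data.results) {
    if (Object.prototype.hasOwnProperty.call(data.results, key)) {
      total_count += data.results[key].length;
    }
  }

  // Attach the total to the response alongside the original results.
  data.totalCount = total_count;

  return data;
}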
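The same hook can be used for the kind of clean-up mentioned earlier, not just for counting. Here is a minimal sketch, assuming the API has a collection named collection1 whose rows carry a price field holding strings such as "$1,299.00". Both names are hypothetical, so substitute the ones from your own API.

```javascript
// Sketch of a clean-up transform. The collection name "collection1" and the
// "price" field are hypothetical; replace them with names from your own API.
function transform(data) {
  var rows = (data.results && data.results.collection1) || [];

  for (var i = 0; i < rows.length; i++) {
    var raw = rows[i].price;
    if (typeof raw === 'string') {
      // Strip everything except digits, the decimal point and a minus sign,
      // then convert what is left to a number.
      var parsed = parseFloat(raw.replace(/[^0-9.\-]/g, ''));
      // Keep the original string if nothing numeric could be parsed.
      rows[i].price = isNaN(parsed) ? raw : parsed;
    }
  }

  return data;
}
```

Rows whose price cannot be parsed are left untouched, so a stray "n/a" does not turn into NaN in the output.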

The best part about this enhancement is that it is not automatically applied to your data every time you call the API. To apply the modify function to the endpoint, you need to change your API call. For instance, suppose your API call was the following:

https://kimonolabs.com/api/[API_URL]?apikey=[yourAPIkey]

You need to add a GET parameter named kimmodify and set it to 1. Your modified URL looks something like this:

https://kimonolabs.com/api/[API_URL]?apikey=[yourAPIkey]&kimmodify=1
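If you build these URLs in code, a small helper keeps the parameters in one place. The sketch below is only an illustration: the function name buildKimonoUrl and its arguments are my own, and it simply assembles the URL pattern shown above.

```javascript
// Build a Kimono API URL with the modify function switched on.
// apiId and apiKey stand in for the values from your own account;
// buildKimonoUrl is a hypothetical helper, not part of Kimono.
function buildKimonoUrl(apiId, apiKey, extraParams) {
  var params = { apikey: apiKey, kimmodify: 1 };
  var extras = extraParams || {};

  // Merge in any additional GET parameters supplied by the caller.
  for (var key in extras) {
    if (Object.prototype.hasOwnProperty.call(extras, key)) {
      params[key] = extras[key];
    }
  }

  // Encode each name/value pair so special characters survive the trip.
  var pairs = [];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      pairs.push(encodeURIComponent(name) + '=' + encodeURIComponent(params[name]));
    }
  }

  return 'https://kimonolabs.com/api/' + encodeURIComponent(apiId) + '?' + pairs.join('&');
}
```

The returned string can be passed to whichever HTTP client you normally use.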

Kimono also allows you to send custom variables to this function using GET parameters. For instance, if you need to send the parameter myVar with the value test, you should modify your URL as follows:

https://kimonolabs.com/api/[API_URL]?apikey=[yourAPIkey]&kimmodify=1&myVar=test

Inside your function, the variable is exposed as a property of the globally scoped query object:

query.myVar == 'test'; // true
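To show how such a variable might drive the modify function, here is a rough sketch that filters rows by the value of myVar. It assumes that results is an object keyed by collection name and that each row has a title field. The title field and the filterByKeyword helper are hypothetical.

```javascript
// Hypothetical helper: keep only rows whose "title" contains the keyword.
// An empty keyword matches every row.
function filterByKeyword(data, keyword) {
  var needle = String(keyword || '').toLowerCase();

  for (var name in data.results) {
    if (Object.prototype.hasOwnProperty.call(data.results, name) &&
        Array.isArray(data.results[name])) {
      data.results[name] = data.results[name].filter(function (row) {
        return String(row.title || '').toLowerCase().indexOf(needle) !== -1;
      });
    }
  }

  return data;
}

function transform(data) {
  // Read myVar from the global query object. The typeof guard lets the
  // function run in environments where query is not defined.
  var keyword = (typeof query !== 'undefined' && query.myVar) || '';
  return filterByKeyword(data, keyword);
}
```

Keeping the filtering logic in a separate helper makes it easy to try out with sample data before pasting it into the Modify Results editor.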

For a detailed post on the JavaScript API, you may refer to the blog post from Kimono.

Infinite Scroll

The last time I reviewed Kimono, there was no way it could scrape data off pages with infinite scroll. However, Kimono recently announced a new feature that does exactly that. Let’s see how we can utilize the feature by scraping the Twitter feed of Domino’s India.

When you are creating the API through the interactive Kimonofy bookmarklet, notice the button with the infinity symbol on the panel of buttons on the top right.

Infinite scroll button

If you click on the infinite scroll button, Kimono lets you specify how many items need to be encountered before completion. We’ll go with just 50 items.

Setting infinite scroll options

After saving the API, we run it to see the results.

Results of infinite scroll crawl

It looks like the data was extracted perfectly. The addition of the infinite scroll feature is a very significant one, since a large number of websites nowadays use the feature.

Enhanced Pagination

Another variant of infinite scroll is a “View More” button that adds more results to the page when clicked. Thankfully, Kimono has come up with a way to extract data from such websites too.

While creating an API, you usually highlight the next page link after selecting the pagination button of Kimono on the top. For enhanced pagination, you need to select the “View More” button that loads new posts.

Set up pagination

After saving the API and running it, the results look satisfactory: the desired data has been extracted from the web page.

Pagination results

Final Thoughts

Back in July, Kimono was relatively new and the only projects in the Kimono showcase were simple pages which reported the FIFA World Cup scores. It has definitely come a long way since: people now use Kimono to manage squads in their fantasy football league teams.

Kimono has also been used with MonkeyLearn to run sentiment analysis on hotel reviews. Furthermore, data generated through Kimono has been used to visualize a year of bike rides.

The bottom line is very clear — with the kind of advancement Kimono has made in the last few months and the way people have reacted to it, it’s definitely the next big thing. The question is, what will you use it for?

Have you given Kimono a go? How do you use it?