Today most webpages load and display tens of records at a time. I believe loading many thousands of records at once will become increasingly common, with aggregation and filtering then performed in the browser.
The Current Situation
Many websites consist of simple drilldowns. For example, on my online banking site I see a page with a summary of my recent transactions and links to load a new page with more detailed information, e.g. summary page -> individual statement page -> individual transaction page
More recently there has been a growing trend towards single-page applications - however, the underlying services often still reflect the old model. For example, a RESTful API for the banking example could consist of linked summary, statement and transaction resources, with each drilldown causing another round trip to the server.
This simple drilldown approach often makes sense and will never be fully replaced. However, I believe an alternative approach is beginning to make more sense: load all the data and perform the necessary aggregation and display filtering on the client. For example, this crossfilter demo loads 250,000 rows of data and allows interactive filtering and drilldown with instantaneous results once the data has loaded.
The simple answer is that in-browser analytics over large client-side datasets can produce a better user experience.
In my online banking example, if all transactions were loaded on the client then monthly statements become just an arbitrary grouping. If I want to see what transactions I made on a 10-day vacation, I can do so instantly. Likewise, how much money have I ever spent in Starbucks? A near-instant result is available.
In short - with all the data available in-browser, every interaction with the data can become quicker, and certain interactions that were previously impossible become possible. Consider watching your data change dynamically as you move a slider over a date range, compared with the typical interaction of choosing a date and pressing submit.
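The banking queries above reduce to simple array operations once the full dataset is on the client. A minimal sketch - the transaction data and field names here are invented purely for illustration:

```javascript
// Illustration: ad-hoc queries over an in-browser transaction list.
// In a real app this array would be loaded once from the server.
const transactions = [
  { date: '2013-06-02', merchant: 'Starbucks', amount: 4.50 },
  { date: '2013-06-05', merchant: 'Tesco',     amount: 32.10 },
  { date: '2013-06-08', merchant: 'Starbucks', amount: 3.75 },
  { date: '2013-07-01', merchant: 'Amazon',    amount: 18.99 },
];

// Transactions during a 10-day vacation - one filter, no server round trip.
const vacation = transactions.filter(
  t => t.date >= '2013-06-01' && t.date <= '2013-06-10'
);

// Total ever spent at one merchant - filter then reduce.
const starbucksTotal = transactions
  .filter(t => t.merchant === 'Starbucks')
  .reduce((sum, t) => sum + t.amount, 0);

console.log(vacation.length);  // 3
console.log(starbucksTotal);   // 8.25
```

Libraries like crossfilter do the same thing with indexed dimensions, which keeps these queries fast even at hundreds of thousands of rows.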
Until recently, in-browser analytics faced many technical obstacles; however, a number of changes have now made this approach viable:
- Increased Connection Speeds. Downloading a megabyte of data takes seconds on a broadband connection, and a single megabyte of compressed data can store hundreds of thousands of rows. Utilising Local Storage with incremental updates can also reduce load times (and provide an offline capability for mobile users). In addition, from personal experience, users are often happy to accept an initial few seconds of delay if they know that all interactions will be lightning fast afterwards.
- Better Browser Visualisation. Frameworks like D3, Highcharts and Raphael can now routinely handle very large datasets; see: http://www.highcharts.com/stock/demo/data-grouping.
- Better Data Handling. Traditionally, all heavy analytics were performed server-side and the results exposed via web services. Now there is a growing ecosystem of frameworks for handling large datasets client-side; see crossfilter, gauss and jstat.
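The incremental-update idea mentioned above can be sketched as follows. This is only an outline: the `txns` cache key and the `fetchSince` function (standing in for an AJAX call to a hypothetical endpoint) are assumptions, and the storage object is injected so the same logic works against `window.localStorage` in the browser or any object exposing `getItem`/`setItem`.

```javascript
// Sketch: keep a cached copy of the dataset and fetch only records
// newer than the newest cached id, so repeat visits load almost instantly.
function loadIncremental(storage, fetchSince) {
  const cached = JSON.parse(storage.getItem('txns') || '[]');
  const lastId = cached.length ? cached[cached.length - 1].id : 0;
  const fresh = fetchSince(lastId);        // only records with id > lastId
  const all = cached.concat(fresh);
  storage.setItem('txns', JSON.stringify(all));
  return all;
}

// Example with an in-memory stand-in for localStorage:
const memory = {};
const store = {
  getItem: k => (k in memory ? memory[k] : null),
  setItem: (k, v) => { memory[k] = v; },
};
const server = [{ id: 1 }, { id: 2 }, { id: 3 }];
const fetchSince = lastId => server.filter(r => r.id > lastId);

loadIncremental(store, fetchSince);        // first load: fetches all 3 records
server.push({ id: 4 });
const all = loadIncremental(store, fetchSince); // second load: fetches only id 4
console.log(all.length);                   // 4
```

Note that real Local Storage is limited (typically around 5 MB per origin), so for very large datasets IndexedDB is the more appropriate browser store.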
Why Haven't We Seen More of This?
This approach is currently quite niche. Many of the required components are still relatively immature and limited to modern browsers and fast connections. Therefore I think in-browser analytics may initially find its most fertile ground in internal webapps. In my current role I've focussed on creating analytics websites for internal use only; a managed browser ecosystem (Chrome only) and fast internal networks make handling large datasets client-side a lot easier.
The approach is also best suited to datasets that can be statistically summarised. For example, loading 50,000 book reviews in one go is unlikely to be useful, as the user will never read the vast majority of them. This limits its domain somewhat.
Join the revolution! Build some In-Browser Analytic Applications (IBAA? - this really needs a better acronym...)