In web applications, it is very common to sort a set of items according to user-selected criteria and return only the first or N-th page of the sorted result. The page size can be much smaller than the total number of items, hence it is typically not reasonable to sort the entire set and […]
January 1, 2012
Greenplum Database is an interesting solution for data mining and data warehousing. In this post I focus on the MapReduce capabilities of Greenplum 4.1 and try to figure out how efficient its implementation is.
Simple MapReduce Job
Let us consider a simplified version of a real-life problem that is typically solved using the MapReduce technique: analysis […]
January 2, 2012