LECTURE 6
Related documents
Communication cost = input file size + 2 × (sum of the sizes of all files passed from Map processes to Reduce processes) + the sum of the output sizes of the Reduce
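As a rough illustration of the formula above, the following sketch adds up the three terms for made-up file sizes. The function name `communication_cost`, its parameter names, and the example numbers are assumptions for illustration only; the last term is read here as the total size of the Reduce-side output.

```python
def communication_cost(input_size, map_to_reduce_sizes, reduce_output_sizes):
    """Sum the three terms of the communication-cost formula.

    input_size: size of the input file
    map_to_reduce_sizes: sizes of the files passed from Map to Reduce
    reduce_output_sizes: sizes of the outputs written on the Reduce side
    """
    # Files passed from Map to Reduce are counted twice, as in the formula.
    shuffled = 2 * sum(map_to_reduce_sizes)
    return input_size + shuffled + sum(reduce_output_sizes)


# Hypothetical sizes (any consistent unit): 100 + 2 * (30 + 20) + (10 + 5)
cost = communication_cost(100, [30, 20], [10, 5])
print(cost)
```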
Because the output is spread across R files (each reduce task creates one file in the DFS). Task
Step 2: Label each node with the number of shortest paths from the root E.
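The labeling step can be sketched with a breadth-first search: the root gets label 1, and every other node gets the sum of the labels of its parents one level up. The graph below is a small hypothetical example with root E, not necessarily the one used in the lecture, and the function name `count_shortest_paths` is an assumption.

```python
from collections import deque


def count_shortest_paths(graph, root):
    """Return {node: number of shortest paths from root to node}."""
    depth = {root: 0}
    paths = {root: 1}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in depth:
                # First time we reach this node: it sits one level below.
                depth[neighbor] = depth[node] + 1
                paths[neighbor] = 0
                queue.append(neighbor)
            if depth[neighbor] == depth[node] + 1:
                # node is a parent of neighbor in the BFS DAG.
                paths[neighbor] += paths[node]
    return paths


# Hypothetical undirected graph, stored as adjacency lists.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B", "E", "F", "G"],
    "E": ["D", "F"],
    "F": ["D", "E", "G"],
    "G": ["D", "F"],
}
labels = count_shortest_paths(graph, "E")
print(labels)
```

In this example G is reachable by two shortest paths (through D and through F), so it gets label 2, while every other node gets label 1.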
The data stream consists of elements chosen from a universe of size N. Maintain a count of the number of distinct elements
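The excerpt does not name a method, so the sketch below is only one possible approach: a single-hash Flajolet-Martin style estimator, which tracks the largest number of trailing zero bits seen in any hashed element and reports 2 to that power. The use of MD5 as the hash, the 32-bit truncation, and the function names are all assumptions; a practical version would combine many hash functions to reduce the variance.

```python
import hashlib


def trailing_zeros(x):
    """Number of trailing zero bits in x (32 if x is zero)."""
    if x == 0:
        return 32
    count = 0
    while x & 1 == 0:
        x >>= 1
        count += 1
    return count


def fm_estimate(stream):
    """Rough estimate of the number of distinct elements in stream."""
    max_zeros = 0
    for item in stream:
        digest = hashlib.md5(str(item).encode("utf-8")).digest()
        # Use the first 32 bits of the digest as the hash value.
        h = int.from_bytes(digest[:4], "big")
        max_zeros = max(max_zeros, trailing_zeros(h))
    return 2 ** max_zeros


# Repeats do not change the estimate: only the set of elements matters.
stream = [i % 50 for i in range(1000)]
estimate = fm_estimate(stream)
print(estimate)
```

The memory used is a single integer regardless of stream length, at the price of an estimate that is always a power of two.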
In fact, the minimum is attained by the all-ones vector y = 1 (the eigenvector of the smallest eigenvalue, 0); however, this says nothing about the partition. Thus, we find
Main idea: Recommend items to customer x similar to previous items rated highly by x. Andy enjoyed
New Problem: Given a stream, how can we find recent frequent items (i.e., items that appear more than s times in
The Google solution for spider traps: At each time step, the random surfer has two options.
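The excerpt stops before listing the two options. The sketch below assumes the usual random-teleport formulation: with probability beta the surfer follows a random out-link, and with probability 1 - beta it jumps to a page chosen uniformly at random. The value beta = 0.8, the three-page graph, and the function name `pagerank` are illustrative assumptions, and the code assumes every page has at least one out-link.

```python
def pagerank(links, beta=0.8, iterations=100):
    """Power iteration with random teleports.

    links: {page: list of pages it links to}; every list must be non-empty.
    """
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Teleport share: each page receives (1 - beta) / n.
        new_rank = {v: (1.0 - beta) / n for v in nodes}
        for v in nodes:
            # Link-following share: split beta * rank[v] over v's out-links.
            share = beta * rank[v] / len(links[v])
            for w in links[v]:
                new_rank[w] += share
        rank = new_rank
    return rank


# Hypothetical graph in which page "m" links only to itself (a spider trap).
links = {"y": ["y", "a"], "a": ["y", "m"], "m": ["m"]}
rank = pagerank(links)
print(rank)
```

Without teleports the trap page "m" would absorb all of the rank; with them it ends up with a large share (about 21/33 here) while "y" and "a" keep nonzero scores.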