You have one file of comma-separated (query, URL, score) rows, where each score is the relevance of that URL to that query. A second file contains (URL, categories) rows, where the categories are also comma-separated and describe the URL's content. Given a query, find the categories that best describe its results.
Sample (query, URL, score) rows:
search, www.google.com, 100
search, www.facebook.com, 20
social, www.google.com, 2
awesomeness, www.aminariana.com, 100

Sample (URL, categories) rows:
www.google.com, advertising, engineering, internet
www.facebook.com, advertising, php
- Writing an algorithm that outputs some meaningful results (25%)
- Being memory-efficient at scale (25%)
- Producing complete result coverage (25%)
- Producing percentages for category rankings (25%)
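One possible approach, sketched in Python under a few assumptions that the question itself does not settle: each category of a URL is credited with that URL's full score for the query, a category's percentage is its share of the total credited weight, and result URLs that have no row in the categories file are grouped under a hypothetical "uncategorized" label so that every result is covered. The function name `rank_categories` and the sample constants are made up for illustration. Both inputs are consumed as line iterators, so only the URLs matching the query are held in memory.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_categories(
    query: str,
    score_rows: Iterable[str],
    category_rows: Iterable[str],
) -> List[Tuple[str, float]]:
    """Return (category, percentage) pairs for `query`, largest share first."""
    # Pass 1: stream the score file, keeping only URLs scored for this query.
    url_scores: Dict[str, float] = defaultdict(float)
    for row in score_rows:
        parts = [p.strip() for p in row.split(",")]
        if len(parts) != 3 or parts[0] != query:
            continue
        url_scores[parts[1]] += float(parts[2])

    # Pass 2: stream the category file, crediting each category of a
    # matching URL with that URL's full score (an assumed weighting).
    totals: Dict[str, float] = defaultdict(float)
    categorized = set()
    for row in category_rows:
        parts = [p.strip() for p in row.split(",")]
        url, categories = parts[0], [c for c in parts[1:] if c]
        if url not in url_scores or not categories:
            continue
        categorized.add(url)
        for category in categories:
            totals[category] += url_scores[url]

    # Assumption: results with no category row count as "uncategorized",
    # so their score still shows up in the percentages.
    for url, score in url_scores.items():
        if url not in categorized:
            totals["uncategorized"] += score

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(cat, 100.0 * weight / grand_total) for cat, weight in ranked]


# The sample rows from the question.
SCORE_ROWS = [
    "search, www.google.com, 100",
    "search, www.facebook.com, 20",
    "social, www.google.com, 2",
    "awesomeness, www.aminariana.com, 100",
]
CATEGORY_ROWS = [
    "www.google.com, advertising, engineering, internet",
    "www.facebook.com, advertising, php",
]

for category, pct in rank_categories("search", SCORE_ROWS, CATEGORY_ROWS):
    print(f"{category}: {pct:.1f}%")
# advertising: 35.3%
# engineering: 29.4%
# internet: 29.4%
# php: 5.9%
```

For real files, the two lists can be replaced with open file handles, since iterating a file yields one line at a time. Memory then grows with the number of results for the query rather than with the size of either file. Splitting a URL's score evenly across its categories instead of crediting each one in full would be an equally defensible weighting.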
Sashi is from Google. He asked me this question in my interview.