Imagine you are shopping on Amazon and a search returns a list of 50 items. You scroll down, click an item, keep scrolling, and click a few more. To calculate the click rate (clicked/displayed), how does an analyst know whether an item was actually displayed on the screen? How do they know whether you saw only 15, or 20, or all 50 items? Is there a principled way to estimate the furthest point you scrolled based on your clicks, and therefore how many items were actually displayed? It turns out this is “The German Tank Problem”.
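As a rough illustration of the idea, here is a minimal sketch of the classic frequentist German Tank estimator, N = m + m/k - 1, where m is the largest observed position and k is the number of distinct observations. It assumes, quite strongly, that clicks fall uniformly at random over the items that were displayed; real click behavior is usually biased toward the top of the list. The function name `estimate_items_displayed` and the sample positions are made up for this example.

```python
def estimate_items_displayed(clicked_positions):
    """Estimate how many items were displayed from the clicked positions.

    Uses the German Tank estimator m + m/k - 1, where m is the largest
    clicked position and k is the number of distinct clicked positions.
    Assumption: clicks are spread uniformly over the displayed items.
    """
    positions = sorted(set(clicked_positions))
    if not positions:
        raise ValueError("need at least one clicked position")
    k = len(positions)   # number of distinct clicks
    m = positions[-1]    # furthest clicked position
    return m + m / k - 1


# Hypothetical session: the shopper clicked items 3, 7, 12 and 19.
print(estimate_items_displayed([3, 7, 12, 19]))  # 22.75
```

With four clicks and the furthest at position 19, the sketch suggests that roughly 23 of the 50 items were displayed. With a single click the estimate is only 2m - 1, which shows how little one observation can tell us.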
Tag Archives: Statistics
Swear Words in Review: Regiospecificity and Predictability
Abstract
This report aims to answer the following two questions. 1. Does the use of swear words show any regiospecificity that results in heterogeneity in the data? 2. Does the use of swear words in customers’ reviews have an impact on the ratings they give? Can it predict the stars they give a business? The main analyses are an ANOVA on metropolitan area and a multiple regression on ratings. Results indicate that the usage of swear words differs by region and that 25 of 45 swear words have predictive power for the rating a customer gave. All code and files can be obtained from the link at the end.
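To make the first question concrete, here is a minimal pure-Python sketch of the F statistic behind a one-way ANOVA. It is not the code used in the report: the function name `one_way_anova_f` is made up, and the three groups are toy numbers standing in for swear-word counts per review in three hypothetical metropolitan areas.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mu - grand_mean) ** 2
                     for g, mu in zip(groups, means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mu) ** 2
                    for g, mu in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy data only: swear words per review in three made-up metro areas.
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(toy_groups))  # 3.0
```

A large F relative to the F distribution with (groups - 1, observations - groups) degrees of freedom would point to regional differences; turning F into a p-value needs a statistics library and is left out of this sketch.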
Data Science Capstone Quiz
Introduction
- All quiz questions are from the Coursera Data Science Capstone course.
- All .json files are provided by Yelp.
- Data sources are hidden due to privacy concerns.