Imagine you are shopping on Amazon and a search returns a list of 50 items. You scroll down, click an item, continue scrolling, and click on a few more. To calculate the click rate (clicked/displayed), how does an analyst know whether an item was actually displayed on the screen? How do they know if you saw only 15, or 20, or all 50 items? Is there a principled way to estimate the furthest point you scrolled based on your clicks, and therefore how many items were actually displayed? It turns out this is “The German Tank Problem”.
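As a rough illustration, here is a minimal sketch of the classic frequentist German Tank estimator applied to click positions. It assumes (and this is an assumption for illustration, not something the excerpt states) that clicks land uniformly at random, without repeats, among the items that were actually displayed. The function name `estimate_items_displayed` is hypothetical.

```python
def estimate_items_displayed(clicked_positions):
    """Estimate how many items were displayed from 1-based click positions.

    Uses the standard German Tank estimator N_hat = m + m/k - 1, where
    m is the largest observed position and k is the number of distinct
    positions clicked. Assumes clicks are a uniform sample without
    replacement from positions 1..N (an illustrative assumption).
    """
    distinct = set(clicked_positions)
    if not distinct:
        raise ValueError("need at least one click to estimate")
    k = len(distinct)   # sample size
    m = max(distinct)   # furthest clicked position
    return m + m / k - 1


# Clicks on the 3rd, 7th, 12th and 18th items of the result list.
print(estimate_items_displayed([3, 7, 12, 18]))  # 21.5
```

With these four clicks the estimate is about 21.5 items displayed, a little beyond the furthest click at position 18. In practice the estimate would also be capped at the length of the list (50 in the example above).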
Cognitive Minimalist
In this post, I want to quickly introduce an idea that I haven’t seen anywhere else. It may be obvious to some people, but even though it is a simple idea, some might still benefit from it.
Most minimalists are dedicated to reducing the number of objects they own, or similar metrics such as the amount of money they spend or the size of their apartment: all physical entities. However, I’m proposing the idea of the “cognitive minimalist”, which is to reduce the amount of resources needed for cognition: in other words, mental cost, psychological effort, cognitive resources, and so on.
TEF Quick Start with Titanic
Installation
!pip install TEF -U
import TEF
TEF.__version__
'0.7.7'
Microsoft Surface Pro 9 i7 Review
So I finally decided to upgrade my Surface Pro 4 (M3), bought in 2018. This time, I chose the highest trim: i7 with 32 GB RAM. Obviously, no product is perfect, and since this is not a cheap purchase, I thought people might be interested in a real-world review. Here it is.
How To Fit A Machine Learning Model To A Kaggle Dataset In 8 Lines
import pandas as pd
train_transaction_raw = pd.read_csv('data/ieee-fraud-detection.zip Folder/train_transaction.csv')
import TEF
train_transaction = TEF.auto_set_dtypes(train_transaction_raw, set_object=[0])
TEF.dfmeta(train_transaction)
TEF.plot_1var(train_transaction)
TEF.plot_1var_by_cat_y(train_transaction, 'isFraud')
TEF.fit(train_transaction, 'isFraud', verbose=2)
Disclaimer and Caveat
Every ML practitioner knows it is risky to fit a model without understanding the data. The purpose of this article is only to introduce the general usage of TEF, not to provide a detailed exploration. With this code alone, we can get only a rough understanding of the dataset.
In the following sections, I will walk through this code for the IEEE fraud detection dataset. A more detailed exploration, feature engineering, and model selection may be published in the future.
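To make "a rough understanding" a little more concrete, here is a minimal plain-pandas sketch of the kind of first-pass overview one might look at before fitting anything. It does not use or reproduce TEF; the helper name `quick_overview` and the tiny toy table (including its column names other than `isFraud`) are made up for illustration.

```python
import pandas as pd


def quick_overview(df, target):
    """Return per-column metadata and the class balance of the target."""
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),     # column data types
        "missing_rate": df.isna().mean(),   # share of missing values
        "n_unique": df.nunique(),           # distinct non-missing values
    })
    class_balance = df[target].value_counts(normalize=True)
    return summary, class_balance


# Toy stand-in for the real training table (values are invented).
toy = pd.DataFrame({
    "TransactionAmt": [10.0, 25.5, None, 99.9],
    "ProductCD": ["W", "C", "W", "H"],
    "isFraud": [0, 0, 1, 0],
})

summary, balance = quick_overview(toy, "isFraud")
print(summary)
print(balance)
```

Even a summary this small surfaces missing values and class imbalance, both of which matter when interpreting a quick model fit.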