dfmeta: fixed the possible NaN check not detecting spaces correctly; possible dup lev now ignores pure-digit values (so values like 600 and 601 in object columns are no longer flagged)
plot_1var_series: fixed failure to plot when all values are null
0.9.0 (20190912)
fit: now handles regression problems
0.8.3 (20190910)
plot_1var_series: removed the xlabel for float and int columns when shown in dfmeta
dfmeta: fixed inaccurate unique count colors
0.8.1 (20190907)
plot_1var_series: changed label rotation from 20 to 30 degrees
0.8.0 (20190907)
dfmeta:
now colors different dtypes (int16, uint16, etc.) correctly
possible dup lev now truncates results longer than 1000; possible_dup_lev is now a global function so that users can call it directly for the truncated result
minor fix in possible NaNs
plot_1var_series:
fixed labels being truncated in the dfmeta view
labels for category and object columns are now rotated by 20 degrees
labels longer than 20 characters are now truncated
0.7.9 (20190906)
plot_1var_series: added the </img> tag to the returned HTML code
0.7.8 (20190906)
ct: fixed the s1 and s2 names not displaying correctly, including when conditions are used
0.7.7 (20190905)
many updates
print_html_standard and dfmeta_to_htmlfile_standard are now deprecated.
dfmeta now includes plots
0.6.2 (20190831)
new:
set_relation: a function that detects the set relationship between two series (e.g., intersection, union, difference) and draws a bar plot
correspondence: a function that detects whether two series (of the same length) are in 1-1, 1-m, m-1, or m-m correspondence
auto_set_dtypes: set_object etc. now accept column names, e.g., set_int=['column1', 'column3']
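The set relationship that set_relation reports can be illustrated with plain Python sets. This is only a rough sketch of the idea, not the package's implementation: the helper names set_relation_counts and describe_relation are hypothetical, plotting is left out, and NaN handling is not modeled.

```python
def set_relation_counts(s1, s2):
    """Count distinct values exclusive to, and shared by, two iterables.

    Hypothetical helper for illustration; duplicates within a series
    are collapsed because the comparison is made on sets.
    """
    a, b = set(s1), set(s2)
    return {
        "s1 only": len(a - b),        # difference s1 \ s2
        "intersection": len(a & b),   # values present in both
        "s2 only": len(b - a),        # difference s2 \ s1
        "union": len(a | b),          # values present in either
    }


def describe_relation(s1, s2):
    """Name the relationship between the two value sets."""
    a, b = set(s1), set(s2)
    if a == b:
        return "equal"
    if a < b:
        return "s1 is a proper subset of s2"
    if a > b:
        return "s1 is a proper superset of s2"
    if a & b:
        return "overlapping"
    return "disjoint"


counts = set_relation_counts([1, 2, 3, 3], [3, 4])
relation = describe_relation([1, 2, 3, 3], [3, 4])
```

The counts dictionary is the kind of data a bar plot could be drawn from, one bar per key.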
dfmeta:
unique count is now colored blue at 100% (all values unique) and red at 1 (all values the same)
fix: summary now counts NaN (dropna=False)
possible NaN is no longer checked for int and float columns, so 0s no longer appear for them
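The 1-1 / 1-m / m-1 / m-m classification that correspondence (new in this release) performs can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: the function name correspondence_kind is hypothetical, and the convention that "1-m" means one value of the first series pairs with several values of the second is assumed here.

```python
from collections import defaultdict


def correspondence_kind(s1, s2):
    """Classify the pairing of two equal-length sequences.

    Hypothetical helper. Assumed convention: "1-m" means some value
    in s1 is paired with more than one distinct value in s2.
    """
    if len(s1) != len(s2):
        raise ValueError("both series must have the same length")

    left_to_right = defaultdict(set)   # s1 value -> distinct s2 partners
    right_to_left = defaultdict(set)   # s2 value -> distinct s1 partners
    for a, b in zip(s1, s2):
        left_to_right[a].add(b)
        right_to_left[b].add(a)

    left_fans_out = any(len(v) > 1 for v in left_to_right.values())
    right_fans_out = any(len(v) > 1 for v in right_to_left.values())

    if left_fans_out and right_fans_out:
        return "m-m"
    if left_fans_out:
        return "1-m"
    if right_fans_out:
        return "m-1"
    return "1-1"


kinds = [
    correspondence_kind(["a", "b", "c"], [1, 2, 3]),
    correspondence_kind(["a", "a", "b"], [1, 2, 3]),
    correspondence_kind(["a", "b", "c"], [1, 1, 2]),
    correspondence_kind(["a", "a", "b"], [1, 2, 1]),
]
```

Tracking distinct partners in both directions keeps the check to a single pass over the paired values.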
0.6.1 (20190829)
ct: now displays the column name for s2
some minor fixes
0.6.0 (20190827)
auto_set_dtypes: can now suggest potential categories for integers
dfmeta: possible NaN now prints the observations of potential NaNs
0.5.9 (20190826)
bug fix: NaN in feature importance in dfmeta
0.5.8 (20190826)
bug fix.
0.5.7 (20190826)
fit now takes return_agg_feat_imp, which makes it return the aggregated feature importance. The result can then be passed to dfmeta, where it becomes a new column with the top 3 features in bold.
fit now plots a mean feature importance from the abs max coef (in grey), which aggregates the coefs of all classes by taking the abs max.
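The abs max aggregation can be shown with a small worked example. This is a sketch under assumptions rather than the code fit uses: the function name abs_max_importance and the list-of-rows input layout (one row of coefs per class, one entry per feature) are hypothetical, and the averaging implied by "mean" is not modeled.

```python
def abs_max_importance(coefs_by_class):
    """Collapse per-class coefficients into one score per feature.

    Hypothetical helper: for each feature, take the largest absolute
    coefficient found across all classes.
    """
    if not coefs_by_class:
        return []
    n_features = len(coefs_by_class[0])
    if any(len(row) != n_features for row in coefs_by_class):
        raise ValueError("every class must have one coef per feature")
    return [
        max(abs(row[j]) for row in coefs_by_class)
        for j in range(n_features)
    ]


# Three classes, three features (made-up numbers for illustration).
coefs = [
    [0.5, -2.0, 0.1],
    [-1.5, 0.3, 0.0],
    [0.2, 1.0, -0.4],
]
importance = abs_max_importance(coefs)  # -> [1.5, 2.0, 0.4]
```

Taking the absolute value first means a strongly negative coef in one class counts as much as a strongly positive one in another.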