Last Updated on
- dfmeta: fixed the possible-NaN check not detecting spaces correctly; possible dup lev now ignores pure-digit values (so values like 600 and 601 in object columns are no longer flagged)
- plot_1var_series: fixed failure to plot when all values are null
- fit: now handles regression problems
- plot_1var_series: removed the xlabel for float and int columns when used in dfmeta
- dfmeta: fixed inaccurate colors for unique counts
- plot_1var_series: changed label rotation from 20 to 30 degrees
- now handles colors for different dtypes (int16, uint16, etc.) correctly
- possible dup lev now truncates results longer than 1000; possible_dup_lev is now a global function, so that users can call it for the truncated result
- minor fix in possible NaNs
- fixed labels getting truncated in the dfmeta view
- rotate labels for category and object columns by 20 degrees
- truncate labels longer than 20 characters
- plot_1var_series: added the </img> tag to the returned HTML code
- ct: fixed the s1 and s2 names not displaying correctly, including when conditions are used
- tons of updates
- print_html_standard and dfmeta_to_htmlfile_standard are now deprecated.
- dfmeta now includes plots
- set_relation: a function to detect the set relationships between two series (e.g., intersection, union, difference) and draw a bar plot
- correspondence: a function to detect whether two series (of the same length) are in a 1-1, 1-m, m-1, or m-m correspondence
- see the documentation for more info
- auto_set_dtypes: can use column names in set_object etc., e.g., set_int=['column1', 'column3']
- unique count now colors 100% (all unique) blue and 1 (all values identical) red
- fix: summary now counts NaN (dropna=False)
- no longer checks possible NaN for int and float columns, so that 0s won't appear for int and float
- ct: now displays the column name for s2
- some minor fixes
- auto_set_dtypes: can now suggest potential categories for integers
- dfmeta: possible NaN now prints out the observations of potential NaNs
- bug fixed: NaN in feature importance in dfmeta
- bug fixed.
- fit now takes return_agg_feat_imp, which makes it return the aggregated feature importance. The result can then be passed to dfmeta to become a new column, with the top 3 features shown in bold.
- fit now has a "mean feature importance from abs max coef" plot (in grey), which aggregates the coefficients of all classes by taking the absolute maximum per feature.
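
The abs-max aggregation mentioned in the last entry can be sketched as follows. This is only an illustration, not the library's actual code: the function name `abs_max_importance` and the sample coefficients are hypothetical, and it assumes the coefficients are laid out as one row per class and one column per feature (the layout scikit-learn linear classifiers use for `coef_`).

```python
def abs_max_importance(coef_rows):
    """Aggregate per-class coefficients into one importance per feature.

    coef_rows: list of rows, one row per class, one value per feature
    (hypothetical input layout, assumed for this sketch).
    """
    # zip(*coef_rows) walks the matrix column by column, i.e. per feature.
    # For each feature, keep the largest absolute coefficient across classes.
    return [max(abs(c) for c in column) for column in zip(*coef_rows)]


# Hypothetical coefficients for 2 classes and 3 features.
coef = [
    [0.5, -2.0, 0.1],
    [-1.5, 0.3, 0.0],
]
importance = abs_max_importance(coef)
print(importance)
```

With a NumPy array the same aggregation is `np.abs(coef).max(axis=0)`. Taking the absolute maximum means a feature counts as important if it has a large coefficient for at least one class, regardless of sign.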