import pandas as pd
from sklearn.datasets import fetch_california_housing

# Load the California housing data and wrap it in a DataFrame
dt = fetch_california_housing()
df = pd.DataFrame(dt['data'], columns=dt['feature_names'])
no_missing_values
no_missing_values(dataset:DataFrame, missing_threshold:float=0.6)
Parameters:
dataset: DataFrame
missing_threshold: float, default 0.6
Find the features with a fraction of missing values above missing_threshold
df2 = no_missing_values(df)
0 features with greater than 60.0% missing values.
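The call above can be approximated with a short sketch. This is not the library's implementation: the name `no_missing_values_sketch` is hypothetical, and the return value (a copy of the input with the flagged columns dropped) is an assumption, since the source only shows the printed message.

```python
import numpy as np
import pandas as pd


def no_missing_values_sketch(dataset: pd.DataFrame,
                             missing_threshold: float = 0.6) -> pd.DataFrame:
    """Hypothetical stand-in for no_missing_values (behavior assumed)."""
    # Fraction of missing values in each column.
    missing_fraction = dataset.isnull().mean()
    # Columns whose missing fraction is strictly above the threshold.
    to_drop = missing_fraction[missing_fraction > missing_threshold].index.tolist()
    print(f"{len(to_drop)} features with greater than "
          f"{missing_threshold * 100}% missing values.")
    # Assumption: return a copy without the flagged columns.
    return dataset.drop(columns=to_drop)


toy = pd.DataFrame({
    "mostly_missing": [np.nan, np.nan, np.nan, 1.0],  # 75% missing
    "half_missing": [1.0, np.nan, 2.0, np.nan],       # 50% missing
    "complete": [1.0, 2.0, 3.0, 4.0],                 # 0% missing
})
cleaned = no_missing_values_sketch(toy)
```

With the default threshold of 0.6, only `mostly_missing` is above the cutoff, so `cleaned` keeps the other two columns.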
plot_hist
plot_hist(df:DataFrame, feat2show:List[str]=None)
Parameters:
df: DataFrame
feat2show: List[str], default None
Plot one histogram for each column listed in feat2show. By default (feat2show=None), all columns are plotted.
plot_hist(df)
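A function with this signature could be written roughly as follows. This is a sketch under assumptions, not the library's code: `plot_hist_sketch` is a hypothetical name, and the vertical layout, the bin count of 20, and returning the figure are illustrative choices.

```python
from typing import List, Optional

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd


def plot_hist_sketch(df: pd.DataFrame, feat2show: Optional[List[str]] = None):
    """Hypothetical stand-in for plot_hist (layout and bins assumed)."""
    # Default to every column when no subset is given.
    columns = list(df.columns) if feat2show is None else list(feat2show)
    # One subplot per column, stacked vertically.
    fig, axes = plt.subplots(len(columns), 1,
                             figsize=(6, 3 * len(columns)), squeeze=False)
    for ax, col in zip(axes[:, 0], columns):
        ax.hist(df[col].dropna(), bins=20)  # ignore missing values
        ax.set_title(col)
    fig.tight_layout()
    return fig


demo = pd.DataFrame({"a": [1, 2, 2, 3, 3, 3],
                     "b": [0.5, 1.5, 1.5, 2.5, 2.5, 2.5]})
fig = plot_hist_sketch(demo, ["a"])
```

Passing `["a"]` draws a single histogram, while calling `plot_hist_sketch(demo)` with no second argument draws one per column.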