# Fill in the line below: get names of columns with missing values
cols_with_missing = [col for col in X_train.columns
                     if X_train[col].isnull().any()]
# Fill in the lines below: drop columns in training and validation data
reduced_X_train = X_train.drop(cols_with_missing, axis=1)
reduced_X_valid = X_valid.drop(cols_with_missing, axis=1)
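The snippet above assumes X_train and X_valid already exist. As a rough, self-contained sketch, the same steps can be tried on two tiny made-up DataFrames (the column names and values here are hypothetical, chosen only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real training and validation data
X_train = pd.DataFrame({"rooms": [3, 4, 2],
                        "area": [70.0, np.nan, 55.0],
                        "age": [10, 5, 30]})
X_valid = pd.DataFrame({"rooms": [5, 1],
                        "area": [120.0, 30.0],
                        "age": [np.nan, 8]})

# Columns are selected by looking at the training data only
cols_with_missing = [col for col in X_train.columns
                     if X_train[col].isnull().any()]

# Drop the same columns from both sets so their columns stay aligned
reduced_X_train = X_train.drop(cols_with_missing, axis=1)
reduced_X_valid = X_valid.drop(cols_with_missing, axis=1)

print(cols_with_missing)
print(reduced_X_valid.columns.tolist())
```

Note that in this sketch "age" is missing only in the validation data, so it is not dropped. Checking only X_train, as the original snippet does, can leave missing values in X_valid.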
Showing posts with label Python. Show all posts
Sunday, 9 May 2021
get names of columns with missing values in ML
Friday, 27 November 2020
Imputer in Python
# Check for null values in the data set. If there are any, impute the missing values using SimpleImputer.
# (Hint: use 'median' to impute.) Then find the mean of 'enginesize' after imputing.
import pandas as pd
import numpy as np
# Importing the SimpleImputer class
from sklearn.impute import SimpleImputer

# df is assumed to be an already-loaded DataFrame with an 'enginesize' column
# np.nan is used because the np.NaN alias was removed in NumPy 2.0
imputer = SimpleImputer(missing_values=np.nan, strategy='median')
df['enginesize'] = imputer.fit_transform(df['enginesize'].values.reshape(-1, 1))[:, 0]
df['enginesize'].mean()
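Since the post does not show how df is loaded, here is a minimal sketch of the same median imputation on a small made-up 'enginesize' column. The values are hypothetical and are not taken from the original data set:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical data: a tiny stand-in for the real data set
df = pd.DataFrame({"enginesize": [130, np.nan, 152, 109, np.nan, 136]})

# Count the missing values before imputing
missing_before = df["enginesize"].isnull().sum()
print(missing_before)

imputer = SimpleImputer(missing_values=np.nan, strategy="median")
# fit_transform expects 2-D input, so pass a one-column DataFrame
imputed = imputer.fit_transform(df[["enginesize"]])
df["enginesize"] = imputed[:, 0]

# Mean of the column after the gaps were filled with the median
mean_after = df["enginesize"].mean()
print(mean_after)
```

The median of the four known values here is 133, so both gaps are filled with 133 before the mean is taken.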
Thursday, 22 October 2020
Different type of Calculation in Python using Numpy and Pandas Library
import pandas as pd
import numpy as np
from scipy.stats import kurtosis
from scipy.stats import skew

my_array = []
a = int(input("Size of array:"))  # size of array
for i in range(a):
    my_array.append(int(input("Element : ")))  # appending to array
my_array = np.array(my_array)  # storing into a numpy array
print(list(my_array))
x = np.round(my_array, 3)  # rounding to 3 decimal places
y = skew(x)  # calculating skewness value
print(round(y, 3))
z = kurtosis(x)  # calculating kurtosis value
print(round(z, 3))
print(np.var(list(x)))  # calculating variance
print(round(np.std(x), 3))  # calculating std deviation

Output:
Size of array:5
Element : 24
Element : 567
Element : 70
Element : 4
Element : 45
[24, 567, 70, 4, 45]
1.461
0.197
45637.2
213.629
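To see where those numbers come from, the sketch below recomputes them with NumPy alone from the central moments of the same five sample values. It assumes the SciPy defaults used in the post, namely the biased (population) estimators and Fisher's definition of kurtosis, which subtracts 3:

```python
import numpy as np

# Same five values as in the sample run above
x = np.array([24, 567, 70, 4, 45], dtype=float)

# Deviations from the mean and the central moments built from them
dev = x - x.mean()
m2 = np.mean(dev ** 2)  # population variance, same as np.var(x)
m3 = np.mean(dev ** 3)
m4 = np.mean(dev ** 4)

skewness = m3 / m2 ** 1.5          # biased sample skewness
excess_kurtosis = m4 / m2 ** 2 - 3  # Fisher (excess) kurtosis
std_dev = np.sqrt(m2)               # population standard deviation

print(round(skewness, 3))
print(round(excess_kurtosis, 3))
print(m2)
print(round(std_dev, 3))
```

Both np.var and np.std divide by n rather than n - 1 by default, which is why the variance printed in the post is the population variance.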
Friday, 16 October 2020
Different libraries in Python
Python’s statistics is a built-in Python library for descriptive statistics. You can use it if your datasets are not too large or if you can’t rely on importing other libraries.
SciPy is a third-party library for scientific computing based on NumPy. It offers additional functionality compared to NumPy, including scipy.stats for statistical analysis.
Matplotlib is a third-party library for data visualization. It works well in combination with NumPy, SciPy, and Pandas.
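As a small illustration of the built-in option, the sketch below uses only the statistics module on a short made-up list of values, so nothing needs to be installed. The pvariance and pstdev functions are the population versions, which divide by n:

```python
import statistics

# Hypothetical sample data
data = [24, 567, 70, 4, 45]

mean = statistics.mean(data)            # arithmetic mean
median = statistics.median(data)        # middle value of the sorted data
pvariance = statistics.pvariance(data)  # population variance
pstdev = statistics.pstdev(data)        # population standard deviation

print(mean)
print(median)
print(pvariance)
print(round(pstdev, 3))
```

For larger datasets, or for skewness and kurtosis, NumPy and scipy.stats are usually the more practical choice.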