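# Check for null values in the data set. If there are any, impute the missing
# values with SimpleImputer (hint: use the 'median' strategy), then find the
# mean of 'enginesize' after imputing.
import pandas as pd
import numpy as np
# Import the SimpleImputer class
from sklearn.impute import SimpleImputer

# df is assumed to be an already-loaded DataFrame with an 'enginesize' column
print(df.isnull().sum())  # number of null values in each column

imputer = SimpleImputer(missing_values=np.nan, strategy='median')
# fit_transform expects 2-D input, so reshape to a single column and take it back out
df['enginesize'] = imputer.fit_transform(df['enginesize'].values.reshape(-1, 1))[:, 0]
df['enginesize'].mean()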
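As a self-contained illustration of the same step, here is a minimal sketch that runs median imputation on a small made-up DataFrame. The name `cars` and its values are assumptions for demonstration only, not the original data set.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical sample data: two of the six 'enginesize' values are missing
cars = pd.DataFrame({"enginesize": [100.0, 120.0, np.nan, 140.0, np.nan, 200.0]})

# Count the missing values before imputing
missing_before = int(cars["enginesize"].isnull().sum())

# The median of the observed values (100, 120, 140, 200) is 130
imputer = SimpleImputer(missing_values=np.nan, strategy="median")
# SimpleImputer works on 2-D input, so pass a one-column DataFrame
cars["enginesize"] = imputer.fit_transform(cars[["enginesize"]])[:, 0]

missing_after = int(cars["enginesize"].isnull().sum())
mean_after = cars["enginesize"].mean()
print(missing_before, missing_after, round(mean_after, 3))
```

Both gaps are filled with 130, so the mean after imputing is 820 / 6. The fitted median is also available as `imputer.statistics_[0]` if you want to inspect it directly.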
Friday, 27 November 2020
Imputer in Python
Thursday, 22 October 2020
Different Types of Calculation in Python Using the NumPy and Pandas Libraries
import pandas as pd
import numpy as np
from scipy.stats import kurtosis
from scipy.stats import skew

my_array = []
a = int(input("Size of array:"))  # size of array
for i in range(a):
    my_array.append(int(input("Element : ")))  # append each element to the list
my_array = np.array(my_array)  # store in a NumPy array
print(my_array.tolist())  # print as a plain Python list
x = np.round(my_array, 3)  # round to 3 decimal places
y = skew(x)  # calculate the skewness
print(round(y, 3))
z = kurtosis(x)  # calculate the (excess) kurtosis
print(round(z, 3))
print(np.var(list(x)))  # calculate the variance
print(round(np.std(x), 3))  # calculate the standard deviation
Output:
Size of array:5
Element : 24
Element : 567
Element : 70
Element : 4
Element : 45
[24, 567, 70, 4, 45]
1.461
0.197
45637.2
213.629
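The four numbers in the output can be cross-checked by hand from central moments. The sketch below assumes the defaults of the SciPy and NumPy calls used above: `skew` returns the biased (divide-by-n) skewness, `kurtosis` returns Fisher's excess kurtosis, and `np.var` / `np.std` use the population convention. The helper `central_moment` is a hypothetical name introduced here for illustration.

```python
import math

def central_moment(values, k):
    """k-th central moment using the population (divide-by-n) convention."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** k for v in values) / n

data = [24, 567, 70, 4, 45]  # the sample entered in the run above

m2 = central_moment(data, 2)  # population variance
m3 = central_moment(data, 3)
m4 = central_moment(data, 4)

skewness = m3 / m2 ** 1.5           # biased sample skewness
excess_kurtosis = m4 / m2 ** 2 - 3  # Fisher definition: a normal distribution gives 0
std_dev = math.sqrt(m2)             # population standard deviation

print(round(skewness, 3), round(excess_kurtosis, 3), round(m2, 1), round(std_dev, 3))
```

With this sample the rounded results agree with the output listed above: 1.461, 0.197, 45637.2 and 213.629.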
Friday, 16 October 2020
Different libraries in Python
Python's statistics is a built-in Python library for descriptive statistics. You can use it if your datasets are not too large or if you can't rely on importing other libraries.
SciPy is a third-party library for scientific computing based on NumPy. It offers additional functionality compared to NumPy, including scipy.stats for statistical analysis.
Matplotlib is a third-party library for data visualization. It works well in combination with NumPy, SciPy, and Pandas.
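For a small data set, the built-in statistics module is enough on its own. The following sketch uses only the standard library; the sample values are reused from the earlier post purely as an illustration.

```python
import statistics

data = [24, 567, 70, 4, 45]

mean = statistics.mean(data)
median = statistics.median(data)

# pvariance / pstdev divide by n (population convention, like np.var's default)
population_variance = statistics.pvariance(data)
population_std = statistics.pstdev(data)

# variance divides by n - 1 (sample convention), so it is larger
sample_variance = statistics.variance(data)

print(mean, median, population_variance, sample_variance, round(population_std, 3))
```

Note the two variance conventions: `statistics.pvariance` matches what `np.var` returns by default, while `statistics.variance` gives the sample variance.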
Wednesday, 9 September 2020