Note: In this article we discuss three things: 1. math Module 2. statistics Module 3. Descriptive statistics using Pandas, NumPy, SciPy and StatsModels
Python math Module
Python has a built-in module that you can use for mathematical tasks.
The math module has a set of methods and constants.
Math Methods
Method | Description |
---|---|
math.acos() | Returns the arc cosine of a number |
math.acosh() | Returns the inverse hyperbolic cosine of a number |
math.asin() | Returns the arc sine of a number |
math.asinh() | Returns the inverse hyperbolic sine of a number |
math.atan() | Returns the arc tangent of a number in radians |
math.atan2() | Returns the arc tangent of y/x in radians |
math.atanh() | Returns the inverse hyperbolic tangent of a number |
math.ceil() | Rounds a number up to the nearest integer |
math.comb() | Returns the number of ways to choose k items from n items without repetition and order |
math.copysign() | Returns a float consisting of the value of the first parameter and the sign of the second parameter |
math.cos() | Returns the cosine of a number |
math.cosh() | Returns the hyperbolic cosine of a number |
math.degrees() | Converts an angle from radians to degrees |
math.dist() | Returns the Euclidean distance between two points p and q, each given as a sequence of coordinates of the same dimension |
math.erf() | Returns the error function of a number |
math.erfc() | Returns the complementary error function of a number |
math.exp() | Returns E raised to the power of x |
math.expm1() | Returns e**x - 1 |
math.fabs() | Returns the absolute value of a number |
math.factorial() | Returns the factorial of a number |
math.floor() | Rounds a number down to the nearest integer |
math.fmod() | Returns the remainder of x/y |
math.frexp() | Returns the mantissa and the exponent of a specified number |
math.fsum() | Returns the sum of all items in any iterable (tuples, arrays, lists, etc.) |
math.gamma() | Returns the gamma function at x |
math.gcd() | Returns the greatest common divisor of two integers |
math.hypot() | Returns the Euclidean norm |
math.isclose() | Checks whether two values are close to each other, or not |
math.isfinite() | Checks whether a number is finite or not |
math.isinf() | Checks whether a number is infinite or not |
math.isnan() | Checks whether a value is NaN (not a number) or not |
math.isqrt() | Rounds a square root number downwards to the nearest integer |
math.ldexp() | Returns the inverse of math.frexp() which is x * (2**i) of the given numbers x and i |
math.lgamma() | Returns the log gamma value of x |
math.log() | Returns the natural logarithm of a number, or the logarithm of the number to a given base |
math.log10() | Returns the base-10 logarithm of x |
math.log1p() | Returns the natural logarithm of 1+x |
math.log2() | Returns the base-2 logarithm of x |
math.perm() | Returns the number of ways to choose k items from n items with order and without repetition |
math.pow() | Returns the value of x to the power of y |
math.prod() | Returns the product of all the elements in an iterable |
math.radians() | Converts a degree value into radians |
math.remainder() | Returns the IEEE 754-style remainder of x with respect to y, that is x - n*y where n is the integer closest to x/y |
math.sin() | Returns the sine of a number |
math.sinh() | Returns the hyperbolic sine of a number |
math.sqrt() | Returns the square root of a number |
math.tan() | Returns the tangent of a number |
math.tanh() | Returns the hyperbolic tangent of a number |
math.trunc() | Returns the truncated integer parts of a number |
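A few of the functions in the table exist because ordinary float arithmetic is inexact. The following is a minimal sketch of how `math.fsum()`, `math.isclose()`, `math.fmod()`, `math.remainder()` and `math.isqrt()` behave; the variable names and sample values are chosen only for illustration.

```python
import math

# Summing 0.1 ten times with the built-in sum() accumulates rounding error.
values = [0.1] * 10
naive_total = sum(values)         # 0.9999999999999999
exact_total = math.fsum(values)   # 1.0, fsum tracks the lost low-order bits

# Compare floats with a tolerance instead of ==.
close = math.isclose(naive_total, 1.0)   # True

# fmod() keeps the sign of x; remainder() rounds the quotient to the nearest integer.
fmod_result = math.fmod(7, 4)        # 3.0  (7 - 1*4)
rem_result = math.remainder(7, 4)    # -1.0 (7 - 2*4, because 7/4 is closer to 2)

# isqrt() works on integers and rounds the square root down.
root = math.isqrt(17)                # 4

print(naive_total, exact_total, close, fmod_result, rem_result, root)
```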
Math Constants
Constant | Description |
---|---|
math.e | Returns Euler's number (2.7182...) |
math.inf | Returns a floating-point positive infinity |
math.nan | Returns a floating-point NaN (Not a Number) value |
math.pi | Returns PI (3.1415...) |
math.tau | Returns tau (6.2831...) |
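As a short illustration of the constants, the sketch below uses `math.inf` as a starting value for a running minimum, shows why `math.nan` has to be detected with `math.isnan()`, and relates `math.tau` to `math.pi`. The sample readings and the radius are made-up values.

```python
import math

# inf compares greater than every finite float, so it is a safe initial minimum.
readings = [3.5, 1.2, 9.0]
lowest = math.inf
for value in readings:
    lowest = min(lowest, value)

# NaN never compares equal to anything, including itself.
nan_equal = math.nan == math.nan        # False
nan_detected = math.isnan(math.nan)     # True

# tau is 2*pi, so the circumference of a circle is tau * radius.
radius = 2.0
circumference = math.tau * radius

print(lowest, nan_equal, nan_detected, circumference)
```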
We use some of these methods very frequently in our work. These include:
- math.ceil(): Rounds a number up to the nearest integer
- math.floor(): Rounds a number down to the nearest integer
- math.factorial(): Returns the factorial of a number
- math.comb(): Returns the number of ways to choose k items from n items without repetition and order
- math.degrees(): Converts an angle from radians to degrees
- math.radians(): Converts a degree value into radians
- math.gcd(): Returns the greatest common divisor of two integers
- math.dist(): Returns the Euclidean distance between two points p and q, each given as a sequence of coordinates
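The frequently used functions can be tried out in a few lines. This is a small sketch with arbitrary inputs; `math.comb()` and `math.dist()` need Python 3.8 or later.

```python
import math

rounded_up = math.ceil(4.2)        # 5
rounded_down = math.floor(4.8)     # 4
negative_up = math.ceil(-4.2)      # -4, "up" means towards positive infinity

five_factorial = math.factorial(5)   # 120
pairs = math.comb(5, 2)              # 10 ways to pick 2 items out of 5

half_turn_deg = math.degrees(math.pi)   # pi radians expressed in degrees
half_turn_rad = math.radians(180)       # 180 degrees expressed in radians

divisor = math.gcd(12, 18)              # 6
distance = math.dist((0, 0), (3, 4))    # 5.0, the 3-4-5 right triangle

print(rounded_up, rounded_down, negative_up, five_factorial, pairs)
print(half_turn_deg, half_turn_rad, divisor, distance)
```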
Python statistics Module
Averages and measures of central location
These functions calculate an average or typical value from a population or sample.
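Function | Description |
---|---|
statistics.mean() | Arithmetic mean (“average”) of data. |
statistics.fmean() | Fast, floating point arithmetic mean, with optional weighting. |
statistics.geometric_mean() | Geometric mean of data. |
statistics.harmonic_mean() | Harmonic mean of data. |
statistics.median() | Median (middle value) of data. |
statistics.median_low() | Low median of data. |
statistics.median_high() | High median of data. |
statistics.median_grouped() | Median, or 50th percentile, of grouped data. |
statistics.mode() | Single mode (most common value) of discrete or nominal data. |
statistics.multimode() | List of modes (most common values) of discrete or nominal data. |
statistics.quantiles() | Divide data into intervals with equal probability. |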
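To see how the different averages relate to each other, here is a minimal sketch on small made-up data sets. It assumes Python 3.8 or later, where `statistics.mode()` returns the first of several equally common values instead of raising an error.

```python
import math
import statistics as st

data = [1, 3, 5, 7]

average = st.mean(data)              # 4
fast_average = st.fmean(data)        # 4.0, fmean always returns a float
middle = st.median(data)             # 4.0, mean of the two middle values 3 and 5
low_middle = st.median_low(data)     # 3, always an actual data point
high_middle = st.median_high(data)   # 5, always an actual data point

# Geometric mean suits growth rates; harmonic mean suits rates such as speeds.
growth = st.geometric_mean([54, 24, 36])   # approximately 36.0
speed = st.harmonic_mean([40, 60])         # 48.0

votes = ["red", "blue", "blue", "red", "green"]
first_mode = st.mode(votes)        # "red", the first of the tied values
all_modes = st.multimode(votes)    # ["red", "blue"]

print(average, fast_average, middle, low_middle, high_middle)
print(growth, speed, first_mode, all_modes)
```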
Measures of spread
These functions calculate a measure of how much the population or sample tends to deviate from the typical or average values.
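Function | Description |
---|---|
statistics.pstdev() | Population standard deviation of data. |
statistics.pvariance() | Population variance of data. |
statistics.stdev() | Sample standard deviation of data. |
statistics.variance() | Sample variance of data. |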
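The population and sample versions differ only in the divisor: the population functions divide the sum of squared deviations by n, the sample functions by n - 1. A small sketch with illustrative data (mean 5, sum of squared deviations 32):

```python
import math
import statistics as st

data = [2, 4, 4, 4, 5, 5, 7, 9]   # n = 8, mean = 5

population_variance = st.pvariance(data)   # 32 / 8 = 4
population_stdev = st.pstdev(data)         # sqrt(4) = 2.0

sample_variance = st.variance(data)        # 32 / 7
sample_stdev = st.stdev(data)              # sqrt(32 / 7)

print(population_variance, population_stdev)
print(sample_variance, sample_stdev)
```

Use the sample versions when the data is a sample drawn from a larger population, and the population versions when the data is the entire population.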
Statistics for relations between two inputs
These functions calculate statistics regarding relations between two inputs.
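Function | Description |
---|---|
statistics.covariance() | Sample covariance for two variables. |
statistics.correlation() | Pearson's correlation coefficient for two variables. |
statistics.linear_regression() | Slope and intercept for simple linear regression. |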
NormalDist
NormalDist is a tool for creating and manipulating normal distributions of a random variable. It is a class that treats the mean and standard deviation of data measurements as a single entity. Normal distributions arise from the Central Limit Theorem and have a wide range of applications in statistics.
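As a rough illustration, the sketch below builds a `NormalDist` directly from a mean and standard deviation, queries probabilities and quantiles, and estimates a distribution from sample data. The mean of 100 and standard deviation of 15, as well as the sample values, are made up for the example.

```python
import math
import statistics as st
from statistics import NormalDist

# A distribution described by its mean (mu) and standard deviation (sigma).
scores = NormalDist(mu=100, sigma=15)

# Probability of a value at or below 115, one standard deviation above the mean.
p_below_115 = scores.cdf(115)                      # about 0.8413

# Probability of a value within one standard deviation of the mean.
p_within_one = scores.cdf(115) - scores.cdf(85)    # about 0.6827

# The value below which 90% of the distribution lies.
q90 = scores.inv_cdf(0.9)                          # about 119.22

# Estimate mu and sigma from measurements.
samples = [2.5, 3.1, 2.1, 2.4, 2.7, 3.5]
fitted = NormalDist.from_samples(samples)

# Adding a constant shifts the mean and leaves the spread unchanged.
shifted = scores + 5

print(p_below_115, p_within_one, q90)
print(fitted.mean, fitted.stdev, shifted)
```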
The script below, statistical_summary.py, computes the same summary measures with the statistics module, Pandas, NumPy and SciPy.

    from collections import Counter
    import statistics as st

    import numpy as np
    import pandas
    from scipy import stats

    numbers = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
               33, 33, 35, 35, 35, 36, 40, 45, 46, 52, 70]

    # --- Standard library ---
    # Sum of all elements
    print(sum(numbers))
    # Count of each item
    print(Counter(numbers))
    # Mean
    print(st.mean(numbers))
    print("Median:", st.median(numbers))
    # Mode
    print(st.mode(numbers))
    # Mid-range
    print(st.mean([max(numbers), min(numbers)]))
    # Other statistical measures
    print(st.quantiles(data=numbers, n=4))  # [20.0, 25.0, 35.25]
    print(st.stdev(numbers))                # sample standard deviation
    print(st.variance(numbers))             # sample variance

    # --- Pandas ---
    df = pandas.DataFrame(numbers, columns=['Numbers'])
    total = df['Numbers'].sum()  # named total so the built-in sum() is not shadowed
    count_val = df['Numbers'].value_counts()
    mode = df['Numbers'].mode().values.tolist()
    midrange = (df['Numbers'].max() + df['Numbers'].min()) / 2
    print("The sum of the given data using pandas ", total)
    print("\nThe count of values \n", count_val)
    print("\nMean of the given data using pandas ", df['Numbers'].mean())
    print("\nMedian of the given data using pandas ", df['Numbers'].median())
    print("\nMode of the given data using pandas ", mode[0])
    print("\nMidrange of the given data using pandas:", midrange)
    print("\nStandard deviation for given data using pandas:", df['Numbers'].std())
    print("\nVariance for given data using pandas:", df['Numbers'].var())
    print("\nQuantiles\n", df['Numbers'].quantile([0.25, 0.50, 0.75]))
    print("\n\n")

    # --- NumPy ---
    data = np.array(numbers)
    print("Using NumPy\n")
    unique_values, counts = np.unique(data, return_counts=True)
    quantiles = np.percentile(data, [25, 50, 75])
    print("Sum ", np.sum(data))
    print("\nCount of values \n")
    for value, count in zip(unique_values, counts):
        print(value, count)
    print("Mean :", np.mean(data))
    print("\nMedian:", np.median(data))
    # bincount/argmax only works for non-negative integers
    print("\nMode:", np.argmax(np.bincount(data)))
    # np.std and np.var default to the population formulas (ddof=0)
    print("\nStandard deviation", np.std(data))
    print("\nVariance :", np.var(data))
    print("\nQuantiles \n")
    print(quantiles[0], quantiles[1], quantiles[2])
    print("\n\n")

    # --- SciPy ---
    print("Using SciPy\n")
    # Called without keepdims, SciPy versions before 1.11 emit a FutureWarning
    # and return the mode as an array (mode.mode[0]); from 1.11 on it is a
    # scalar. Passing keepdims=False (SciPy 1.9+) gives the scalar explicitly.
    mode_result = stats.mode(data, keepdims=False)
    print("Mode: ", mode_result.mode)
    # For counts, scipy.stats.itemfreq() has been removed from SciPy;
    # use np.unique(data, return_counts=True) as shown above.
    # Other statistical measures are similar to NumPy.

Output:

    $ python statistical_summary.py
    774
    Counter({25: 4, 35: 3, 16: 2, 20: 2, 22: 2, 33: 2, 13: 1, 15: 1, 19: 1, 21: 1, 30: 1, 36: 1, 40: 1, 45: 1, 46: 1, 52: 1, 70: 1})
    29.76923076923077
    Median: 25.0
    25
    41.5
    [20.0, 25.0, 35.25]
    13.158442741624686
    173.14461538461538
    The sum of the given data using pandas 774

    The count of values
    25    4
    35    3
    16    2
    20    2
    22    2
    33    2
    13    1
    40    1
    52    1
    46    1
    45    1
    30    1
    36    1
    15    1
    21    1
    19    1
    70    1
    Name: Numbers, dtype: int64

    Mean of the given data using pandas 29.76923076923077

    Median of the given data using pandas 25.0

    Mode of the given data using pandas 25

    Midrange of the given data using pandas: 41.5

    Standard deviation for given data using pandas: 13.158442741624686

    Variance for given data using pandas: 173.14461538461538

    Quantiles
    0.25    20.25
    0.50    25.00
    0.75    35.00
    Name: Numbers, dtype: float64

    Using NumPy

    Sum 774

    Count of values

    13 1
    15 1
    16 2
    19 1
    20 2
    21 1
    22 2
    25 4
    30 1
    33 2
    35 3
    36 1
    40 1
    45 1
    46 1
    52 1
    70 1
    Mean : 29.76923076923077

    Median: 25.0

    Mode: 25

    Standard deviation 12.902914674622618

    Variance : 166.4852071005917

    Quantiles

    20.25 25.0 35.0

    Using SciPy

    Mode: 25
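Two differences in the output are worth explaining: the statistics module reports the quartiles as [20.0, 25.0, 35.25] while Pandas and NumPy report 20.25, 25.0, 35.0, and NumPy's standard deviation (12.90...) is smaller than the one from statistics and Pandas (13.15...). The sketch below reproduces both sets of numbers with the standard library alone. It relies on two facts: `statistics.quantiles()` defaults to `method='exclusive'`, and `np.std()` defaults to the population formula.

```python
import math
import statistics as st

numbers = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
           33, 33, 35, 35, 35, 36, 40, 45, 46, 52, 70]

# Default method: the data is treated as a sample that may not contain
# the extreme values of the population.
exclusive_quartiles = st.quantiles(numbers, n=4)

# 'inclusive' treats min and max of the data as the 0th and 100th percentile,
# which matches the default linear interpolation in Pandas and NumPy.
inclusive_quartiles = st.quantiles(numbers, n=4, method="inclusive")

# Sample standard deviation (divides by n - 1), as in statistics and Pandas.
sample_sd = st.stdev(numbers)

# Population standard deviation (divides by n), as in np.std() by default.
population_sd = st.pstdev(numbers)

print(exclusive_quartiles)   # [20.0, 25.0, 35.25]
print(inclusive_quartiles)   # [20.25, 25.0, 35.0]
print(sample_sd, population_sd)
```

To get the sample values from NumPy, pass `ddof=1` to `np.std()` and `np.var()`.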