Pareto Distribution

📈 Pareto Distribution in Python

The Pareto distribution is a heavy-tailed continuous probability distribution often used in economics, finance, and the social sciences.
It is best known for modeling quantities such as wealth, income, or city sizes, where a small share of the population accounts for most of the total (the 80/20 rule).
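
The 80/20 rule can be tied to a specific shape parameter. For a classical Pareto distribution with shape α > 1, the share of the total held by the top fraction p of the population is p^(1 - 1/α), a standard result that follows from the Lorenz curve. The sketch below solves for the α that gives an exact 80/20 split; the function name top_share is illustrative and not part of any library.

```python
import math

def top_share(p, alpha):
    """Share of the total held by the top fraction p (needs alpha > 1)."""
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1")
    return p ** (1 - 1 / alpha)

# Shape parameter for which the top 20% hold exactly 80% of the total
alpha_80_20 = math.log(5) / math.log(4)

print(round(alpha_80_20, 3))                  # 1.161
print(round(top_share(0.2, alpha_80_20), 3))  # 0.8
print(top_share(0.2, 3))                      # a lighter tail gives the top 20% a smaller share
```

Note that the tutorial's examples use a = 3, which is a much lighter tail than the one implied by the 80/20 rule.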


✅ 1. Characteristics of Pareto Distribution

  • Continuous distribution

  • Parameters:

    • a → shape parameter (α > 0)

    • size → number of samples to draw (an argument of the sampling function, not a parameter of the distribution itself)

  • Probability Density Function (PDF):

$$f(x; a) = \frac{a \, x_m^{a}}{x^{a+1}}, \quad x \ge x_m$$

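  • x_m → minimum possible value (scale parameter). np.random.pareto takes no x_m argument: its samples start at 0 and correspond to x_m = 1 once 1 is added to them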

  • Heavy-tailed (large values are possible)
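
As a quick check of the density formula, here is a minimal sketch that evaluates the classical Pareto PDF and CDF with plain NumPy and confirms numerically that the density integrates to about 1. The helper names pareto_pdf and pareto_cdf are assumptions made for this example, not NumPy functions.

```python
import numpy as np

def pareto_pdf(x, a, xm=1.0):
    """Classical Pareto density: a * xm**a / x**(a + 1) for x >= xm, else 0."""
    x = np.asarray(x, dtype=float)
    safe_x = np.maximum(x, xm)  # keeps the formula away from values below xm
    return np.where(x >= xm, a * xm**a / safe_x**(a + 1), 0.0)

def pareto_cdf(x, a, xm=1.0):
    """P(X <= x) = 1 - (xm / x)**a for x >= xm, else 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= xm, 1.0 - (xm / np.maximum(x, xm))**a, 0.0)

a, xm = 3, 1.0
print(pareto_pdf(1.0, a, xm))  # density at the minimum equals a / xm
print(pareto_pdf(0.5, a, xm))  # zero below the minimum
print(pareto_cdf(2.0, a, xm))  # 1 - (1/2)**3

# Trapezoidal rule over [xm, 200]; the tail beyond 200 is negligible for a = 3
grid = np.linspace(xm, 200.0, 2_000_001)
y = pareto_pdf(grid, a, xm)
area = float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(grid)))
print(round(area, 4))
```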

Applications:

  • Wealth or income distribution

  • File sizes on the internet

  • Population sizes of cities


✅ 2. Generate Pareto Data Using NumPy

import numpy as np

# Parameters
a = 3 # shape parameter
size = 1000 # number of samples

# Generate Pareto random numbers
data = np.random.pareto(a, size)

print(data[:10])

Output (example; values differ on every run):

[0.23, 0.78, 1.45, 0.56, 2.34, 0.12, 0.89, 1.67, 0.34, 3.12]
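  • Values ≥ 0: np.random.pareto draws from the Lomax (Pareto II) distribution, which is the classical Pareto with x_m = 1 shifted to start at 0

  • For a classical Pareto with minimum x_m, use (data + 1) * x_m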
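
For reproducible results, NumPy's newer Generator API can be used in place of the legacy np.random.pareto call. The sketch below is one way to do that; the seed 42 and the sample size are arbitrary choices. It also checks the sample mean against the Lomax mean 1 / (a - 1), which supports the point that the raw samples start at 0 rather than 1.

```python
import numpy as np

rng = np.random.default_rng(42)        # arbitrary seed, only for reproducibility
a = 3
samples = rng.pareto(a, size=100_000)  # Lomax (Pareto II) samples on [0, inf)

print(samples.min() >= 0)              # True

# Lomax mean is 1 / (a - 1) for a > 1, so about 0.5 when a = 3
print(samples.mean())

# Adding 1 gives a classical Pareto with x_m = 1, mean a / (a - 1) = 1.5
print((samples + 1).mean())
```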


✅ 3. Visualize Pareto Distribution

import matplotlib.pyplot as plt
import seaborn as sns

sns.histplot(data, bins=50, kde=True, color='purple')
plt.title("Pareto Distribution (a=3)")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()

  • Histogram is right-skewed

  • Few very large values → heavy tail
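
The same two observations can be checked numerically without a plot. This is a small sketch using np.histogram and a few summary statistics; the seed is arbitrary and the exact numbers will vary with it.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed
data = rng.pareto(3, size=1000)

# Same binning idea as the histogram: 50 equal-width bins
counts, edges = np.histogram(data, bins=50)
print(counts[:5])                    # most samples land in the first few bins

# Right skew: the mean is pulled above the median by the large values
print(np.median(data), data.mean())

# Heavy tail: the maximum sits far beyond the 99th percentile
print(np.percentile(data, [50, 90, 99]), data.max())
```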


✅ 4. Shifted Pareto Distribution

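xm = 1 # minimum value (scale)
# np.random.pareto returns Lomax samples starting at 0;
# (data + 1) * xm gives a classical Pareto with minimum xm
shifted_data = (data + 1) * xm

sns.histplot(shifted_data, bins=50, kde=True, color='orange')
plt.title("Shifted Pareto Distribution (xm=1, a=3)")
plt.show()

  • Adding 1 and multiplying by xm ensures every value is ≥ xm

  • Plain addition (xm + data) gives the same result only when xm = 1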
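
For a minimum other than 1, NumPy's documentation obtains the classical Pareto by adding 1 to the samples and multiplying by the minimum, rather than by adding the minimum. Here is a minimal sketch of that recipe with xm = 10; the helper name sample_pareto is hypothetical, and the median check uses the known formula xm * 2**(1/a).

```python
import numpy as np

def sample_pareto(a, xm, size, rng):
    """Classical Pareto samples with shape a and minimum xm (illustrative helper)."""
    return xm * (1.0 + rng.pareto(a, size=size))

rng = np.random.default_rng(1)   # arbitrary seed
a, xm = 3, 10.0
x = sample_pareto(a, xm, 100_000, rng)

print(x.min() >= xm)                    # True
print(np.median(x), xm * 2 ** (1 / a))  # empirical vs. theoretical median
```

The two printed medians should be close; with pure addition (xm + data) the minimum would still be 10, but the spread would not scale with xm.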


✅ 5. Compare Different Shape Parameters

data1 = np.random.pareto(a=2, size=1000)
data2 = np.random.pareto(a=5, size=1000)

sns.histplot(data1, bins=50, color='red', label='a=2', kde=True, alpha=0.5)
sns.histplot(data2, bins=50, color='blue', label='a=5', kde=True, alpha=0.5)
plt.title("Pareto Distribution Comparison")
plt.legend()
plt.show()

  • Smaller a → heavier tail (more extreme values)

  • Larger a → distribution more concentrated near 0
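
To put numbers on "heavier tail", one option is to compare how often samples exceed a fixed threshold. For the Lomax samples NumPy returns, the exact tail probability is P(X > t) = (1 + t)^(-a). The sketch below uses an arbitrary threshold t = 3 and an arbitrary seed.

```python
import numpy as np

rng = np.random.default_rng(7)   # arbitrary seed
n, t = 200_000, 3.0              # sample size and threshold are arbitrary choices

tail = {}
for a in (2, 5):
    s = rng.pareto(a, size=n)
    empirical = float(np.mean(s > t))   # fraction of samples above t
    theoretical = (1 + t) ** (-a)       # Lomax survival function
    tail[a] = (empirical, theoretical)
    print(f"a={a}: empirical {empirical:.5f}, theoretical {theoretical:.5f}")
```

With a=2 roughly 6% of samples exceed 3, while with a=5 it is about 0.1%.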


🧠 Summary Table

Name | Meaning | Description
np.random.pareto() | Function (arguments: a, size) | Generates Pareto (Lomax) random numbers
a | Shape parameter | Controls tail heaviness
size | Number of samples | Output array size
xm | Minimum value | Applied after sampling as (data + 1) * xm

🎯 Practice Exercises

  1. Generate 1000 Pareto random numbers with a=2 and plot histogram with KDE.

  2. Compare Pareto distributions for a=2 vs a=5.

  3. Generate Pareto data with minimum value 10, using (data + 1) * 10, and visualize.

CodeCapsule

Sanjit Sinha — Web Developer | PHP • Laravel • CodeIgniter • MySQL • Bootstrap Founder, CodeCapsule — Student projects & practical coding guides. Email: info@codecapsule.in • Website: CodeCapsule.in
