Percentile Calculator
How it works
Percentiles describe the relative position of a value within a dataset: the Pth percentile is the value below which P% of the observations fall. The 50th percentile (median) divides the dataset in half; the 90th percentile is exceeded by only 10% of observations.
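**Calculation methods** There are multiple methods for computing percentiles from discrete data (NumPy supports 13 via the `method` parameter of `numpy.percentile`, which was named `interpolation` before NumPy 1.22). The most common: Linear interpolation method (Excel PERCENTILE.INC, NumPy default): L = (P/100) × (n−1) is a zero-based position in the sorted data; floor and ceil of L give the bounding indices; interpolate between them using the fractional part of L. Nearest rank method: rank = ceil(P/100 × n), counted from 1; return the value at that rank in the sorted data. Because the two methods map P to positions differently, they can return different values for the same data, most visibly in small samples. The nearest rank method always returns an observed value, while linear interpolation may return a value between two observations.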
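The two methods described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation behind this calculator; the function names `percentile_linear` and `percentile_nearest_rank` and the sample data are made up for the example.

```python
import math

def percentile_linear(values, p):
    """Linear interpolation: zero-based position L = (p/100) * (n - 1)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    data = sorted(values)
    pos = p * (len(data) - 1) / 100
    lo, hi = math.floor(pos), math.ceil(pos)   # bounding indices
    frac = pos - lo                            # how far between them
    return data[lo] + (data[hi] - data[lo]) * frac

def percentile_nearest_rank(values, p):
    """Nearest rank: one-based rank = ceil(p/100 * n), always an observed value."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    data = sorted(values)
    rank = max(1, math.ceil(p * len(data) / 100))  # rank 0 is clamped to 1
    return data[rank - 1]

sample = [15, 20, 35, 40, 50]  # illustrative data
print(f"{percentile_linear(sample, 40):.2f}")   # 29.00 (between 20 and 35)
print(percentile_nearest_rank(sample, 40))      # 20 (an observed value)
```

On this five-point sample the two methods disagree by 9 units, which is why it helps to state the method alongside any reported percentile.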
**Real-world applications** Web performance: page load times are typically right-skewed rather than normally distributed, so P99 latency (the threshold exceeded by only the slowest 1% of requests) is a more useful SLO metric than mean latency. API rate limiting: the P95 request rate indicates how much burst capacity is needed. Salary benchmarking: P25/P50/P75 by job title and location. Student test scoring: percentile rank communicates position relative to peers better than raw scores.
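To make the latency point concrete, here is a small sketch comparing the mean with P50 and P99 on a skewed sample. The numbers are invented for illustration (1,000 requests, 2% of them very slow), and `nearest_rank` is a hypothetical helper using the nearest rank method, not a library function.

```python
import math
import statistics

def nearest_rank(values, p):
    """Nearest rank percentile: one-based rank = ceil(p/100 * n)."""
    data = sorted(values)
    rank = max(1, math.ceil(p * len(data) / 100))
    return data[rank - 1]

# Hypothetical sample: 980 fast requests (50 ms) and 20 slow ones (10,000 ms).
latencies_ms = [50] * 980 + [10_000] * 20

mean_ms = statistics.mean(latencies_ms)
p50 = nearest_rank(latencies_ms, 50)
p99 = nearest_rank(latencies_ms, 99)

print(mean_ms)  # 249: matches no request that was actually served
print(p50)      # 50: the typical request
print(p99)      # 10000: the tail that the mean hides
```

The mean lands between the two clusters and describes neither, while the pair P50/P99 shows both the typical case and the tail.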
**Percentile vs. percentage vs. percentile rank** Percentile is a value (the P90 latency is 850 ms). Percentage is a ratio (90% of requests complete within 850 ms). Percentile rank of a specific value is the percentage of data points below it. These are related but distinct concepts that are often confused in practice.
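Percentile rank goes in the opposite direction from a percentile: it takes a value and returns a percentage. A minimal sketch, assuming the "strictly below" definition given above (some conventions count ties as half, which this example does not do); the function name and scores are illustrative.

```python
def percentile_rank(values, x):
    """Percentage of observations strictly below x."""
    if not values:
        raise ValueError("values must not be empty")
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

scores = [55, 62, 70, 70, 78, 85, 85, 91, 94, 99]  # illustrative test scores
print(percentile_rank(scores, 85))  # 50.0: five of ten scores are below 85
```

A raw score of 85 here has a percentile rank of 50, which shows why "scored 85" and "85th percentile" should not be read interchangeably.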
Frequently Asked Questions
- A percentage is a ratio expressed as parts per hundred (85% of students passed). A percentile is a position in a ranked distribution (scoring at the 85th percentile means scoring higher than 85% of all test-takers). Percentile rank of your score = the percentage of people who scored below you. Confusion: 'scoring 85%' (got 85% of answers right) is completely different from 'scoring at the 85th percentile' (scored better than 85% of test-takers).
- There are many different methods for computing percentiles from discrete data; NumPy alone implements 13. NumPy default (method='linear'): interpolates between surrounding data points. Excel PERCENTILE.INC: equivalent to NumPy's linear. Excel PERCENTILE.EXC: excludes the endpoints by basing positions on n+1, so it is undefined for percentiles very close to 0 or 100. R default (quantile type 7): equivalent to NumPy's linear. The methods can differ whenever the requested percentile falls between observations. For large datasets the differences usually become negligible, although they can remain noticeable in the extreme tails. Specify which method you used when reporting percentiles in research.
- P99 latency is the response time exceeded by only 1% of requests, so it reflects your slowest users' experience. Average latency hides tail behavior: if 1% of users experience 10-second loads while 99% experience 50 ms loads, the average is about 150 ms (0.99 × 50 ms + 0.01 × 10,000 ms), which looks fine even though 1 in 100 users waits 10 seconds and may abandon the site. SLOs (Service Level Objectives) are therefore often defined in percentiles: 'P99 < 200ms' is a meaningful performance target. Average latency on its own is a poor guide for capacity planning and user experience optimization.
- A box plot (box-and-whisker plot) visualizes the percentile distribution: the box spans P25 to P75 (the interquartile range, IQR), the line inside the box is the median (P50), and the whiskers extend either to fixed percentiles such as P5/P95 or, under Tukey's convention, to the most extreme data points within 1.5×IQR of the box edges. Points outside the whiskers are plotted as individual outliers. Box plots simultaneously show center (median), spread (IQR), symmetry (position of the median within the box), and outliers, which is more informative than showing just mean ± SD.
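The quantities behind a box plot under Tukey's convention can be computed with the standard library. This is a rough sketch rather than what any particular plotting library does: `box_plot_stats` is a made-up helper, and the choice of `method="inclusive"` for `statistics.quantiles` (available in Python 3.8+) is an assumption made to match the linear interpolation method; plotting tools may compute quartiles differently.

```python
import statistics

def box_plot_stats(values):
    """Quartiles, Tukey fences, whisker ends, and outliers for a box plot."""
    data = sorted(values)
    # Three cut points dividing the data into quarters: P25, P50, P75.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

stats = box_plot_stats([2, 4, 4, 5, 6, 7, 8, 9, 30])  # illustrative data
print(stats["q1"], stats["median"], stats["q3"])   # 4.0 6.0 8.0
print(stats["whisker_low"], stats["whisker_high"]) # 2 9
print(stats["outliers"])                           # [30]
```

Here the fences sit at -2 and 14, so the upper whisker ends at 9 and the value 30 is drawn as a separate outlier point.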