Controlling Matplotlib Ticks Frequency Using XTicks and YTicks


Plotting data in Python is easy when using Matplotlib. Plotted figures usually show automatically determined axis markers (a.k.a. tick marks) based on the values in the dataset. To limit the number of ticks or control their frequency, some explicit action must be taken.

Matplotlib is the de facto data visualization library for Python. It provides user-friendly, high-level APIs for creating data visualizations such as scatter plots, bar charts, and histograms, as well as more nuanced plots such as contour maps and triangular interpolation plots.

Sample Data via Random Number Generation

[Figure: Randomly generated data plotted using the matplotlib.pyplot.scatter function]

To get started, we’re going to generate the random data shown in the image above and plot it using the matplotlib.pyplot.scatter function. This plot shows an x-axis tick at every other value within the x-axis range (the even numbers from 0 to 10). The following code will accomplish this:

import matplotlib.pyplot as plt
import random

# Generate 100 random x-values between 0 and 10, inclusive
x = [random.randint(0, 10) for _ in range(100)]

# Generate 100 random y-values between 0 and 100, inclusive
y = [random.randint(0, 100) for _ in range(100)]

# Create the scatter plot
plt.scatter(x=x, y=y)

# Show the plot
plt.show()

This data is fairly evenly distributed, so it doesn’t do much to illustrate the need for limiting axis ticks or axis tick frequency. Let’s add an annoying outlier with x.append(512); y.append(10) and see how the plot is affected:

[Figure: A single outlier x-axis value skews the balance of the data and greatly impacts the x-axis tick count]

Can you read those scrunched-up tick values on the x-axis? While this is a very contrived example, I’ve often run into this issue when plotting a range of data such as product prices, because there are always those geniuses on marketplaces who believe a 100x markup will fool someone. Who knows, maybe they’re right!

Control Tick Mark Frequency

[Figure: The same plot with the x-axis ticks limited to every 100]

Here we see the outlier point, plotted in the lower right-hand corner of the figure at (512, 10). However, this figure shows x-axis tick marks every 100 values, which makes it much more reader-friendly. To achieve this effect, add the following line of code:

plt.xticks(range(0, int(max(x)), 100))

This makes use of the xticks function, which either gets or sets the tick locations and labels of the x-axis. A matching yticks function does the same for the y-axis. Both functions take the following arguments:

  • ticks – an array-like object of tick locations
  • labels – an array-like object of labels to place at the given tick locations
  • **kwargs – Text properties that control the appearance of the labels
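
As a rough sketch of how these arguments fit together, the snippet below sets tick locations, custom labels, and one Text property (rotation) in a single call, then reads the locations back by calling xticks with no arguments. The small stand-in dataset, the label format, and the Agg backend line are assumptions made for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Small stand-in dataset that includes an outlier at x = 512
x = [1, 3, 5, 8, 512]
y = [20, 45, 60, 75, 10]

plt.scatter(x=x, y=y)

# ticks: where the marks go; labels: the text shown at each mark
tick_locations = list(range(0, int(max(x)), 100))
tick_labels = [f"{value} units" for value in tick_locations]  # hypothetical label format

# rotation is a Text property passed through as a keyword argument
plt.xticks(ticks=tick_locations, labels=tick_labels, rotation=45)

# Calling xticks() with no arguments returns the current locations and labels
current_locations, current_labels = plt.xticks()
print([int(location) for location in current_locations])
```

The same pattern applies to yticks for the y-axis.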

The important thing to note here is that the tick values are explicitly defined. In this case, the tick marks are generated via an argument equivalent to range(0, 512, 100). This means the values from 0 up to (but not including) 512, at a frequency of every 100. Read the Python documentation for range for a better explanation of this functionality.
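
To make the range behavior concrete, here is a minimal sketch of what that argument expands to. Because the stop value is exclusive, the last tick lands at 500, just below the outlier at 512. The padded variant is an assumption added here, not part of the original code, and is only one possible way to get a tick past the largest value.

```python
# The outlier from the example sets the maximum x-value
max_x = 512
step = 100

# range(start, stop, step) excludes the stop value itself
ticks = list(range(0, max_x, step))
print(ticks)  # [0, 100, 200, 300, 400, 500]

# One option (an assumption, not from the original code): pad the stop
# by one step so a tick also appears past the largest data point
padded_ticks = list(range(0, max_x + step, step))
print(padded_ticks)  # [0, 100, 200, 300, 400, 500, 600]
```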

What this really means is that you’ll end up with a nonsensical plot if you don’t ensure a sensible relationship between your data and the ticks argument. Kind of like the one below, which used a range(0, 10, 1) argument for xticks.

[Figure: x-axis ticks range from 0 to 10 and are present at every value]
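
One way to keep the ticks and the data in step is to derive the tick range from the data itself. The helper below is a hypothetical sketch, not a Matplotlib function: ticks_for_data and its target_count parameter are names invented here, and the "nice step" rule (1, 2, or 5 times a power of ten) is just one reasonable choice.

```python
import math

def ticks_for_data(values, target_count=6):
    """Return evenly spaced integer ticks that span the given values.

    Hypothetical helper: picks the smallest step of the form
    1, 2, 5, or 10 times a power of ten that yields at most roughly
    target_count intervals across the data.
    """
    low, high = min(values), max(values)
    span = max(high - low, 1)  # avoid a zero span for constant data
    raw_step = span / target_count

    # Clamp the exponent at 0 so the step is never smaller than 1
    exponent = max(math.floor(math.log10(raw_step)), 0)
    magnitude = 10 ** exponent
    step = next(m * magnitude for m in (1, 2, 5, 10) if m * magnitude >= raw_step)

    # Snap the ends outward to multiples of the step
    start = math.floor(low / step) * step
    stop = math.ceil(high / step) * step
    return list(range(start, stop + 1, step))

print(ticks_for_data([0, 3, 7, 10]))       # [0, 2, 4, 6, 8, 10]
print(ticks_for_data([0, 3, 7, 10, 512]))  # [0, 100, 200, 300, 400, 500, 600]
```

The returned list can be passed directly as the ticks argument, for example plt.xticks(ticks_for_data(x)).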

Final Thoughts

Controlling axis tick frequency in matplotlib can help better visualize distributions of data. However, there are plenty of times when there are simply too many values to display, hinting at the need to limit them. The xticks and yticks functions fit the bill here, as shown in the figures above, but require some premeditation to ensure a sensible display.

In the examples shown here, especially the one with the outlier data point, other approaches such as data sanitization, preprocessing, and filtering may prove beneficial. For example, I’d likely strike the (512, 10) point from the record before plotting. Well, after plotting once anyway!
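
A minimal sketch of that kind of filtering, assuming a simple cutoff on x is enough. The stand-in data and the threshold of 100 are arbitrary values picked for this example, not something derived from a real dataset.

```python
# Stand-in data: a few regular points plus one outlier at x = 512
x = [1, 3, 5, 8, 512]
y = [20, 45, 60, 75, 10]

# Hypothetical cutoff; choose one that suits your own data
x_cutoff = 100

# Keep only the (x, y) pairs whose x-value is at or below the cutoff
kept = [(xi, yi) for xi, yi in zip(x, y) if xi <= x_cutoff]
filtered_x = [xi for xi, _ in kept]
filtered_y = [yi for _, yi in kept]

print(filtered_x)  # [1, 3, 5, 8]
print(filtered_y)  # [20, 45, 60, 75]
```

Filtering the pairs together keeps x and y the same length, which scatter requires.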

A common alternative to the range(min, max, freq) specification is numpy.arange, as in numpy.arange(min, max + 1, freq), which works in much the same way and also accepts non-integer steps. The standard library’s range function was used here purely out of preference and familiarity. As usual, programming in Python makes this whole process a breeze.
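
For comparison, here is a short sketch of both forms producing the same tick locations for the outlier example. The variable names are chosen here for illustration.

```python
import numpy as np

max_x = 512
step = 100

# Standard-library range: integer steps only, stop is exclusive
range_ticks = list(range(0, max_x + 1, step))

# numpy.arange: same start/stop/step idea, but returns an ndarray
arange_ticks = np.arange(0, max_x + 1, step)

print(range_ticks)
print(arange_ticks.tolist())

# Float steps work with arange but raise a TypeError with range
half_steps = np.arange(0, 2.5, 0.5)
print(half_steps.tolist())
```

Either result can be passed as the ticks argument to xticks or yticks.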

Zαck West