Calculating Moving Averages in Python

Need to smooth out noise in your time series data? Need to chart market trends? Python & Moving Averages are here to help!
Moving averages are momentum indicators used in a range of fields from natural sciences to stock market trading. These calculations measure momentum in observed values over a period of time. For example, the simple moving average can help signal trend reversals in the stock market.

Calculating the moving average in Python is simple enough and can be done via custom functions, a mixture of standard library functions, or via powerful third-party libraries such as Pandas. In this article, we’ll take a look at how to calculate some common moving averages in Python as well as how to chart them out using Plotly.

Highlights

  • Getting historical pricing data to use for visualization via the yfinance library
  • Two methods of calculating moving averages in Python
  • Using pandas_ta library to create groupings of technical indicators to apply at broader scales
  • Considerations for multiprocessing of large amounts of data and indicators like moving averages
  • Creating candlestick charts in Plotly with overlaid signal lines for simple and exponential moving averages
  • Interpreting the SMA and EMA crossover events as trading signals to indicate shifts in price momentum

Moving Averages 101

Before we get into how to calculate moving averages in Python, let us first discuss what they are. Moving averages are measures of momentum over a series of observed values. These measures are commonly made across a subset of values within a larger set. The size of this subset, known as the lookback period, determines what the average reveals: shorter periods react quickly to recent changes, while longer periods reflect the broader trend.

There are a number of common moving average indicators, each of which has variations utilizing several common lookback periods. The following are the most common in stock trading:

  • Simple Moving Average (SMA): Represents the mean value across a period of n-previous observations. Common lookback periods include 50, 100, and 200-period trailing values.
  • Weighted Moving Average (WMA): Represents a weighted mean across a period of n-previous observations where each observation is given a different weight. Used as the basis for several other moving averages.
  • Exponential Moving Average (EMA): Represents a weighted mean across a period of n-previous observations where values closest to the most recent are given exponentially larger consideration.
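
To make the three definitions concrete, here is a minimal sketch on a toy list of prices. It assumes linearly increasing weights for the WMA and the common smoothing factor alpha = 2 / (n + 1) for the EMA, seeded with the first price; other conventions exist, and the variable names are illustrative only.

```python
# Toy closing prices; the last three values form the lookback window (n = 3)
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
n = 3
window = prices[-n:]  # [12.0, 13.0, 14.0]

# SMA: plain mean of the window
sma = sum(window) / n

# WMA: linear weights 1..n (an assumed scheme), newest observation weighted most
weights = range(1, n + 1)
wma = sum(w * p for w, p in zip(weights, window)) / sum(weights)

# EMA: recursive smoothing over the whole series with alpha = 2 / (n + 1),
# seeded with the first price (one common convention among several)
alpha = 2 / (n + 1)
ema = prices[0]
for price in prices[1:]:
    ema = alpha * price + (1 - alpha) * ema

print(sma, round(wma, 4), ema)
# 13.0 13.3333 13.0625
```

On this steadily rising series the WMA sits above the SMA because it leans on the newest prices, while the EMA carries some memory of values older than the window.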

Moving averages not only come in a range of different lookback window variations but can also be used in conjunction with other statistical methods. For example, technical analysts use Bollinger Bands, which are built around a simple moving average (conventionally a 20-period lookback) with upper and lower bands placed a set number of standard deviations away. Check out our article on Moving Averages for more information.
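
As a rough sketch of how a moving average combines with another statistic, the following builds Bollinger-style bands in plain pandas. The 20-period window, the 2-standard-deviation width, and the use of the population standard deviation (ddof=0) are assumptions for illustration, and the prices are synthetic rather than market data.

```python
import pandas as pd

# Synthetic closing prices (illustrative values, not market data)
close = pd.Series([float(v) for v in range(100, 130)])

window = 20   # assumed lookback period
num_std = 2   # assumed band width in standard deviations

middle = close.rolling(window=window).mean()       # SMA at the centre of the bands
spread = close.rolling(window=window).std(ddof=0)  # rolling population std (assumed)
upper = middle + num_std * spread
lower = middle - num_std * spread

bands = pd.DataFrame({"close": close, "lower": lower, "middle": middle, "upper": upper})
print(bands.tail(3))
```

The first 19 rows of each band are NaN because the window has not yet filled.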

Calculating Moving Averages in Python

Python has emerged as the leading programming language for all things data. This includes machine learning, statistics (sorry, R), and algorithmic trading. As with any language, Python can use native syntax to calculate moving averages.
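
As a rough illustration of the native approach, the sketch below computes a rolling simple moving average using only the standard library. The function name rolling_sma is hypothetical, and None stands in for positions where the window has not yet filled.

```python
from collections import deque


def rolling_sma(values, window):
    """Return a list of simple moving averages, with None until the window fills."""
    buffer = deque(maxlen=window)  # keeps only the most recent `window` values
    running_total = 0.0
    result = []
    for value in values:
        if len(buffer) == window:
            running_total -= buffer[0]  # oldest value is about to be evicted
        buffer.append(value)
        running_total += value
        result.append(running_total / window if len(buffer) == window else None)
    return result


print(rolling_sma([1, 2, 3, 4, 5, 6], window=3))
# [None, None, 2.0, 3.0, 4.0, 5.0]
```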

These implementations can be tedious, under-optimized, and hard to scale across large datasets. Fortunately, libraries like Pandas make implementing technical indicators a breeze. Let’s take a look at two approaches, both using Pandas:

Method 1: DataFrames & Native Pandas Functions

Pandas is a powerful computing library. It comes with a lot of optimized functions to absolutely churn through data. It doesn’t offer explicit support for some more complex indicators but moving averages are well within its ability. Consider the following approach for calculating the simple moving average using Pandas:

# import yfinance to get pricing data
import yfinance as yf

# Get 1-yr price history for $NVDA
nvda = yf.Ticker('NVDA')
df = nvda.history(period='1y')[['Open', 'High', 'Low', 'Close', 'Volume']]  # return is pandas DataFrame object

# Result
                  Open        High         Low       Close    Volume
Date                                                                
2020-07-28  103.621322  103.698731  101.973248  102.035675  27163600
2020-07-29  103.786129  105.039660  103.349140  104.532753  28450800
2020-07-30  103.628814  106.105921  102.832245  106.016022  30888000
2020-07-31  105.509108  107.539235  104.208132  106.023506  38608000
2020-08-03  107.199633  110.857861  107.027333  109.973892  41272000
...                ...         ...         ...         ...       ...
2021-07-21  188.820007  195.270004  187.419998  194.100006  37101700
2021-07-22  196.419998  198.869995  192.759995  195.940002  32382600
2021-07-23  196.559998  197.000000  192.500000  195.580002  19542900
2021-07-26  193.110001  194.419998  189.139999  192.940002  20373800
2021-07-27  192.649994  196.199997  187.410004  192.080002  23994571

[252 rows x 5 columns]

What we’ve done here is use the yfinance library to download historical pricing data from the finance.yahoo.com public API server. The data is returned as a pandas DataFrame object. Note that only the OHLCV columns were selected; by default, Dividends and Stock Splits columns are also returned. Let’s take a look at how to apply a simple moving average to this data:

# Add a simple moving average
df['SMA_10'] = df['Close'].rolling(window=10).mean()

# print the first 15 rows of data
print(df.head(15))

                  Open        High  ...    Volume      SMA_10
Date                                ...                      
2020-07-28  103.621322  103.698731  ...  27163600         NaN
2020-07-29  103.786129  105.039660  ...  28450800         NaN
2020-07-30  103.628814  106.105921  ...  30888000         NaN
2020-07-31  105.509108  107.539235  ...  38608000         NaN
2020-08-03  107.199633  110.857861  ...  41272000         NaN
2020-08-04  110.370919  112.146339  ...  31033600         NaN
2020-08-05  112.308666  113.584670  ...  24992400         NaN
2020-08-06  113.364923  113.447331  ...  24431600         NaN
2020-08-07  112.992859  114.913114  ...  34251600         NaN
2020-08-10  113.210097  113.949231  ...  42779600  109.007021
2020-08-11  110.608158  111.237419  ...  35451200  109.640780
2020-08-12  109.779123  114.536057  ...  46441200  110.614391
2020-08-13  115.325137  117.080583  ...  37446000  111.442423
2020-08-14  115.165314  116.910770  ...  36643600  112.390564
2020-08-17  118.374052  123.952535  ...  62130000  113.715763

[15 rows x 6 columns]

In the first line of code, we take the Close column and compute a rolling mean over a window of 10 periods (the current row plus the 9 before it). This makes use of the pandas rolling method, called here on the Close Series. Note the NaN values in the first 9 rows of the new SMA_10 column. These exist because fewer than 10 observations were available at those rows.
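
The same native approach extends beyond the SMA. The sketch below, on synthetic data, shows two related pandas calls: the min_periods argument of rolling, which fills the warm-up rows with partial averages instead of NaN, and ewm, which computes an exponential moving average. The column names and the choice of span=3 with adjust=False are illustrative assumptions; other libraries may seed the EMA differently, so early values can differ.

```python
import pandas as pd

# Synthetic closing prices (illustrative values, not market data)
prices = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]})

# 3-period SMA: NaN until three observations are available
prices["SMA_3"] = prices["Close"].rolling(window=3).mean()

# min_periods=1 averages whatever is available during the warm-up rows
prices["SMA_3_partial"] = prices["Close"].rolling(window=3, min_periods=1).mean()

# 3-period EMA: span=3 gives alpha = 2 / (span + 1) = 0.5;
# adjust=False uses the recursive form seeded with the first value
prices["EMA_3"] = prices["Close"].ewm(span=3, adjust=False).mean()

print(prices)
```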

Method 2: Using the pandas_ta Library

Pandas is a beast when it comes to scientific calculations. It offers a wide array of statistical and mathematical functions that can be used to calculate just about anything in a wildly efficient manner.

Pandas isn’t designed specifically for calculating technical indicators, and algorithmic traders may find its syntax cumbersome. Fortunately, the pandas_ta library integrates with DataFrames natively and makes adding technical indicators a breeze. Consider the following code that adds the 5, 10, and 20-period Simple Moving Average calculated from the daily closing price:

# import required library
import pandas_ta as ta

# Add indicators, using data from before
df.ta.sma(close='close', length=5, append=True)
df.ta.sma(close='close', length=10, append=True)
df.ta.sma(close='close', length=20, append=True)

# View Result
                  open        high         low       close    volume  \
date                                                                   
2020-07-28  103.621322  103.698731  101.973248  102.035675  27163600   
2020-07-29  103.786129  105.039660  103.349140  104.532753  28450800   
2020-07-30  103.628814  106.105921  102.832245  106.016022  30888000   
2020-07-31  105.509108  107.539235  104.208132  106.023506  38608000   
2020-08-03  107.199633  110.857861  107.027333  109.973892  41272000   
...                ...         ...         ...         ...       ...   
2021-07-21  188.820007  195.270004  187.419998  194.100006  37101700   
2021-07-22  196.419998  198.869995  192.759995  195.940002  32382600   
2021-07-23  196.559998  197.000000  192.500000  195.580002  19542900   
2021-07-26  193.110001  194.419998  189.139999  192.940002  20373800   
2021-07-27  192.649994  196.199997  187.410004  192.080002  23994571   

                 SMA_5      SMA_10      SMA_20  
date                                            
2020-07-28         NaN         NaN         NaN  
2020-07-29         NaN         NaN         NaN  
2020-07-30         NaN         NaN         NaN  
2020-07-31         NaN         NaN         NaN  
2020-08-03  105.716370         NaN         NaN  
...                ...         ...         ...  
2021-07-21  187.858002  194.486000  196.781624  
2021-07-22  189.113501  194.177251  197.049999  
2021-07-23  191.907501  193.685001  197.226250  
2021-07-26  192.936002  192.466501  197.357750  
2021-07-27  194.128003  191.424501  196.969250  

Here we’ve used the pandas_ta integration with Pandas DataFrames via the DataFrame.ta accessor. This allows us to easily add the simple moving averages via the pandas_ta sma function. Note that the column names in the output are now lowercase. Pay close attention to our use of the append=True argument. Without it, the newly calculated indicator isn’t added to our existing DataFrame and is only returned as a pandas Series object. This is pretty convenient but becomes a syntactic nightmare when adding lots of indicators. Fortunately, pandas_ta has a novel Strategy class to help facilitate more modular code. Consider the following:

# Create a pandas_ta strategy
moving_averages = ta.Strategy(
    name="SMA_5_10_20",
    ta=[
        {"kind": "sma", "length": 5},
        {"kind": "sma", "length": 10},
        {"kind": "sma", "length": 20}
    ]
)

# Disable multiprocessing
df.ta.cores = 0

# Add bulk indicators
df.ta.strategy(moving_averages, append=True)

                  open        high         low       close    volume  \
date                                                                   
2020-07-28  103.621322  103.698731  101.973248  102.035675  27163600   
2020-07-29  103.786129  105.039660  103.349140  104.532753  28450800   
2020-07-30  103.628814  106.105921  102.832245  106.016022  30888000   
2020-07-31  105.509108  107.539235  104.208132  106.023506  38608000   
2020-08-03  107.199633  110.857861  107.027333  109.973892  41272000   
...                ...         ...         ...         ...       ...   
2021-07-21  188.820007  195.270004  187.419998  194.100006  37101700   
2021-07-22  196.419998  198.869995  192.759995  195.940002  32382600   
2021-07-23  196.559998  197.000000  192.500000  195.580002  19542900   
2021-07-26  193.110001  194.419998  189.139999  192.940002  20373800   
2021-07-27  192.649994  196.199997  187.410004  192.080002  23994571   

                 SMA_5      SMA_10      SMA_20  
date                                            
2020-07-28         NaN         NaN         NaN  
2020-07-29         NaN         NaN         NaN  
2020-07-30         NaN         NaN         NaN  
2020-07-31         NaN         NaN         NaN  
2020-08-03  105.716370         NaN         NaN  
...                ...         ...         ...  
2021-07-21  187.858002  194.486000  196.781624  
2021-07-22  189.113501  194.177251  197.049999  
2021-07-23  191.907501  193.685001  197.226250  
2021-07-26  192.936002  192.466501  197.357750  
2021-07-27  194.128003  191.424501  196.969250  

[252 rows x 8 columns]

Here we see a very reusable approach to applying moving averages to DataFrames via the pandas_ta library. Take note of the df.ta.cores = 0 line. By default, pandas_ta will use multiprocessing to apply indicators in bulk.

This requires wrapping the calling code in the if __name__ == "__main__" convention, which multiprocessing needs on Windows systems. To leverage the power of multiprocessing for the addition of many indicators, simply run the strategy from within that guarded block:

import pandas_ta as ta
import yfinance as yf

if __name__ == '__main__':
    
    # Getting the data
    nvda = yf.Ticker('NVDA')
    df = nvda.history(period='1y')[['Open', 'High', 'Low', 'Close', 'Volume']]
    
    # Create a pandas_ta strategy
    moving_averages = ta.Strategy(
        name="SMA_5_10_20",
        ta=[
            {"kind": "sma", "length": 5},
            {"kind": "sma", "length": 10},
            {"kind": "sma", "length": 20}
        ]
    )
    
    # Add bulk indicators
    # df.ta.cores = 0
    df.ta.strategy(moving_averages, append=True)

When adding dozens of indicators it’s generally recommended to leverage the power of multiprocessing. This can speed up backtesting and data processing considerably, especially for more complex indicator calculations, though the gain is bounded by the number of available CPU cores.
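
For comparison, the spec-driven pattern used with the Strategy class can be approximated in plain pandas. This is only a sketch and not how pandas_ta works internally: INDICATORS, CALCULATORS, and add_indicators are hypothetical names, the EMA uses the ewm convention with adjust=False, and the data is synthetic.

```python
import pandas as pd

# Hypothetical spec format, loosely mirroring the list of dicts used above
INDICATORS = [
    {"kind": "sma", "length": 5},
    {"kind": "sma", "length": 10},
    {"kind": "ema", "length": 5},
]

# Map each kind to a function of (series, length) -> series
CALCULATORS = {
    "sma": lambda s, n: s.rolling(window=n).mean(),
    "ema": lambda s, n: s.ewm(span=n, adjust=False).mean(),
}


def add_indicators(df, specs, column="Close"):
    """Return a copy of df with one new column per spec, e.g. SMA_5."""
    out = df.copy()
    for spec in specs:
        name = f"{spec['kind'].upper()}_{spec['length']}"
        out[name] = CALCULATORS[spec["kind"]](out[column], spec["length"])
    return out


# Synthetic prices (illustrative values, not market data)
demo = pd.DataFrame({"Close": [float(v) for v in range(1, 21)]})
print(add_indicators(demo, INDICATORS).tail(3))
```

Keeping the indicator list as data means the same spec can be applied to any number of DataFrames without repeating the individual calls.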

Plotting Moving Averages in Python

Making the calculations for moving averages is a breeze in Python—especially with the pandas_ta library. These data are perfect for integrating with larger trading strategies or developing custom machine learning models. However, DataFrames full of numbers don’t offer much in the way of visualization. Let’s take a look at how to create a visualization of some moving averages in Python using Pandas, pandas_ta, and Plotly.

import pandas_ta as ta
import yfinance as yf
import plotly.graph_objects as go


# Get the data
df = yf.Ticker('BTC-USD').history(period='6mo')[['Open', 'Close', 'High', 'Low', 'Volume']]

# Add the indicators
moving_averages = ta.Strategy(
    name="moving indicators",
    ta=[
        {"kind": "sma", "length": 10},
        {"kind": "ema", "length": 5},
    ]
)

# Disable multiprocessing, calculate averages
df.ta.cores = 0  # optional, but requires if __name__ == "__main__" syntax if not set to 0
df.ta.strategy(moving_averages)

# Create the Plot
fig = go.Figure(data=[
    go.Candlestick(
        x=df.index,
        open=df['open'],
        high=df['high'],
        low=df['low'],
        close=df['close'],
        increasing_line_color='#ff9900',
        decreasing_line_color='black',
        showlegend=False,
    ),
])

# Make it pretty
layout = go.Layout(
    plot_bgcolor='#efefef',
    # Font Families
    font_family='Monospace',
    font_color='#000000',
    font_size=20,
    xaxis=dict(
        rangeslider=dict(
            visible=False
        ))
)
fig.update_layout(layout)

# Display (in browser by default)
fig.show()

There are several things going on here.

  1. Using the yfinance library to get pricing data for $BTC-USD for the past 6 months.
  2. Adding two moving average indicators—a 10-period SMA and 5-period EMA.
  3. Creating a Candlestick class figure in Plotly
  4. Updating the chart options for aesthetic purposes
  5. Opening the result in the system default HTML viewer (Chrome, Firefox, Opera, etc.)

This produces the following results:

Figure: Basic candlestick chart showing the price of $BTC-USD over a period of 6 months.

This is pretty rad, but where are the moving averages? They aren’t visible because we forgot to add them via Plotly after calculating them via pandas_ta. Fortunately, adding extra “traces” to our figure is simple using the Plotly API. Consider the following example code:

...

# Add the SMA 10
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df['SMA_10'],
        line=dict(color='#ff9900', width=2),
        name='SMA_10'
    )
)

# Add the EMA 5
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df['EMA_5'],
        line=dict(color='#000', width=2),
        name='EMA_5'
    )
)

# See things again
fig.show()

This adds two “traces” to the Plotly figure—the SMA 10 and EMA 5 lines. Let’s see what we have now:

Figure: 6-month candlestick chart of $BTC-USD with a 10-period Simple Moving Average and 5-period Exponential Moving Average.

In this visualization, we can see the 10-period simple moving average (orange line) charted against the 5-period exponential moving average (black line). Visualized together, we can see a shift in momentum in many areas where the EMA crosses the SMA. Because the EMA reacts faster to changes in price momentum, this crossover pattern is often used as a composite technical indicator to forecast emerging price trends.
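
The crossover events described here can also be located programmatically. The following is a minimal sketch in plain pandas on synthetic prices; the window lengths (a 4-period SMA and a span-3 EMA) are shortened for the toy series and the variable names are illustrative. It marks a crossover wherever the sign of the EMA-versus-SMA relationship flips from one bar to the next.

```python
import pandas as pd

# Synthetic closes that fall and then rise (illustrative values, not market data)
close = pd.Series([20, 19, 18, 17, 16, 15, 14, 15, 17, 19, 21, 23], dtype=float)

sma = close.rolling(window=4).mean()          # slower line
ema = close.ewm(span=3, adjust=False).mean()  # faster line

# Compare the lines only where both exist; +1 = EMA above SMA, -1 = EMA below
valid = sma.notna()
position = (ema[valid] > sma[valid]).astype(int) * 2 - 1

# A crossover is any bar where the relationship flips versus the previous bar
flips = position.diff().fillna(0)
bullish = flips[flips > 0].index.tolist()  # EMA crosses above SMA
bearish = flips[flips < 0].index.tolist()  # EMA crosses below SMA

print("bullish crossovers at:", bullish)
print("bearish crossovers at:", bearish)
# bullish crossovers at: [8]
# bearish crossovers at: []
```

On this series the EMA stays below the SMA during the decline and crosses above it shortly after prices turn upward, which is the pattern read as a shift in momentum.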

Review

Moving averages are great tools for forecasting momentum shifts in observed values over a period of time. We’ve seen how Python can easily facilitate the calculation and visualization of these technical indicators in such a way as to provide valuable (and actionable) insights. On their own, moving averages smooth out volatility to reflect momentum trends.

When used in tandem, or combined with other statistical methods, they can become even more powerful tools to forecast and predict future outcomes. For example, incorporating moving averages as features in linear regression modeling can provide a more robust predictive capability. Once one starts engineering composite features with multiple moving averages the sky may be the only limit for improvements in predictive accuracy!
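
As a hedged sketch of what using moving averages as regression features might look like, the following fits an ordinary least squares model with NumPy on a synthetic random walk. The feature names, window lengths, and next-period target are assumptions for illustration, and an in-sample fit on random data says nothing about real predictive accuracy.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk prices (illustrative, not market data)
rng = np.random.default_rng(0)
close = pd.Series(100 + np.cumsum(rng.normal(0, 1, 300)))

# Moving averages as candidate features (window lengths are assumptions)
features = pd.DataFrame({
    "sma_5": close.rolling(5).mean(),
    "sma_20": close.rolling(20).mean(),
    "ema_10": close.ewm(span=10, adjust=False).mean(),
})
target = close.shift(-1)  # next period's close

# Drop warm-up rows (NaN features) and the final row (no next close)
data = features.assign(target=target).dropna()

# Ordinary least squares with an intercept column
X = np.column_stack([np.ones(len(data)), data[["sma_5", "sma_20", "ema_10"]].to_numpy()])
y = data["target"].to_numpy()
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

predictions = X @ coef
print("coefficients:", coef)
print("in-sample RMSE:", np.sqrt(np.mean((predictions - y) ** 2)))
```

Any serious use would evaluate the model on held-out data rather than the rows it was fitted on.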

Zack West
Entrepreneur, programmer, designer, and lifelong learner. Can be found taking notes from Mother Nature when not hammering away at the keyboard.