Moving averages are momentum indicators used in a range of fields from natural sciences to stock market trading. These calculations measure momentum in observed values over a period of time. For example, the simple moving average can help signal trend reversals in the stock market.
Calculating the moving average in Python is simple enough and can be done via custom functions, a mixture of standard library functions, or via powerful third-party libraries such as Pandas. In this article, we’ll take a look at how to calculate some common moving averages in Python as well as how to chart them out using Plotly.
Highlights
- Getting historical pricing data to use for visualization via the yfinance library
- Two methods of calculating moving averages in Python
- Using the pandas_ta library to create groupings of technical indicators to apply at broader scales
- Considerations for multiprocessing of large amounts of data and indicators like moving averages
- Creating candlestick charts in Plotly with overlaid signal lines for simple and exponential moving averages
- Interpreting the SMA and EMA crossover events as trading signals to indicate shifts in price momentum
Moving Averages 101
Before we get into how to calculate moving averages in Python, let us first discuss what they are. Moving averages are measures of momentum over a series of observed values. These measures are commonly made across a subset of values within a larger set. This subset, known as the lookback period, offers different insights depending on its length.
There are a number of common moving average indicators, each of which has variations built on several common lookback periods. The following are the most common in stock trading:
- Simple Moving Average (SMA): Represents the mean value across a period of n-previous observations. Common lookback periods include 50, 100, and 200-period trailing values.
- Weighted Moving Average (WMA): Represents a weighted mean across a period of n-previous observations where each observation is given a different weight. Used as the basis for several other moving averages.
- Exponential Moving Average (EMA): Represents a weighted mean across a period of n-previous observations where values closest to the most recent are given exponentially larger consideration.
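To make these definitions concrete, here is a minimal sketch in plain Python that computes each average for a short, made-up price list. The function names, the linearly increasing WMA weights, the EMA smoothing factor of 2 / (n + 1), and seeding the EMA with the first value are all illustrative assumptions. They are common conventions, not something the definitions above fix.

```python
def sma(values, n):
    """Simple moving average: plain mean of the last n values."""
    window = values[-n:]
    return sum(window) / n


def wma(values, n):
    """Weighted moving average with assumed linear weights 1..n (newest weighted most)."""
    window = values[-n:]
    weights = range(1, n + 1)
    return sum(w * v for w, v in zip(weights, window)) / sum(weights)


def ema(values, n):
    """Exponential moving average, seeded with the first value (an assumption)."""
    alpha = 2 / (n + 1)  # assumed smoothing factor derived from the lookback n
    result = values[0]
    for value in values[1:]:
        # Each new value pulls the average toward itself by a factor of alpha.
        result = alpha * value + (1 - alpha) * result
    return result


prices = [10, 11, 12, 13, 14]  # made-up data, at least n values long

print(sma(prices, 3))  # 13.0
print(wma(prices, 3))  # roughly 13.33
print(ema(prices, 3))  # 13.0625
```

On a steadily rising series like this one, the WMA and EMA both sit above the SMA because they lean more heavily on the most recent prices.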
Moving averages not only come in a range of different lookback window variations but can also be used in conjunction with other statistical methods. For example, technical analysts use Bollinger Bands, which are built around a simple moving average with a lookback window typically in the 9 to 21 day range. Check out our article on Moving Averages for more information.
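As a rough illustration of combining a moving average with another statistic, the sketch below builds Bollinger-style bands with pandas. The helper name bollinger_bands, the 20-period window, the band width of two standard deviations, and the use of the population standard deviation are assumptions chosen for the example, and the prices are made up.

```python
import pandas as pd


def bollinger_bands(close, window=20, num_std=2.0):
    """Return middle, upper, and lower bands for a Series of closing prices."""
    middle = close.rolling(window=window).mean()
    # ddof=0 requests the population standard deviation; pandas defaults to ddof=1.
    std = close.rolling(window=window).std(ddof=0)
    return pd.DataFrame({
        "middle": middle,
        "upper": middle + num_std * std,
        "lower": middle - num_std * std,
    })


# Made-up closing prices: 1.0, 2.0, ..., 25.0
close = pd.Series([float(i) for i in range(1, 26)])
bands = bollinger_bands(close)

# The first 19 rows are NaN because a full 20-value window is not yet available.
print(bands.tail(3))
```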
Calculating Moving Averages in Python
Python has emerged as the leading programming language for all things data. This includes machine learning, statistics (sorry, R), and algorithmic trading. As with any language, Python can use native syntax to calculate moving averages.
These implementations can be tedious, under-optimized, and hard to scale across large datasets. Fortunately, libraries like Pandas make implementing technical indicators a breeze. Let’s take a look at two approaches, both using Pandas:
Method 1: DataFrames & Native Pandas Functions
Pandas is a powerful computing library. It comes with a lot of optimized functions to absolutely churn through data. It doesn’t offer explicit support for some more complex indicators but moving averages are well within its ability. Consider the following approach for calculating the simple moving average using Pandas:
    # import yfinance to get pricing data
    import yfinance as yf

    # Get 1-yr price history for $NVDA
    nvda = yf.Ticker('NVDA')
    df = nvda.history(period='1y')[['Open', 'High', 'Low', 'Close', 'Volume']]

    # return is pandas DataFrame object
    # Result
                      Open        High         Low       Close    Volume
    Date
    2020-07-28  103.621322  103.698731  101.973248  102.035675  27163600
    2020-07-29  103.786129  105.039660  103.349140  104.532753  28450800
    2020-07-30  103.628814  106.105921  102.832245  106.016022  30888000
    2020-07-31  105.509108  107.539235  104.208132  106.023506  38608000
    2020-08-03  107.199633  110.857861  107.027333  109.973892  41272000
    ...                ...         ...         ...         ...       ...
    2021-07-21  188.820007  195.270004  187.419998  194.100006  37101700
    2021-07-22  196.419998  198.869995  192.759995  195.940002  32382600
    2021-07-23  196.559998  197.000000  192.500000  195.580002  19542900
    2021-07-26  193.110001  194.419998  189.139999  192.940002  20373800
    2021-07-27  192.649994  196.199997  187.410004  192.080002  23994571

    [252 rows x 5 columns]
What we've done here is use the yfinance library to download historical pricing data from the finance.yahoo.com public API server. The data is returned as a pandas DataFrame object. Note that only the OHLCV columns were selected. By default, Dividends and Stock Splits columns are also returned. Let's take a look at how to apply a simple moving average to this data:
    # Add a simple moving average
    df['SMA_10'] = df['Close'].rolling(window=10).mean()

    # print the first 15 rows of data
    print(df.head(15))

                      Open        High  ...    Volume      SMA_10
    Date                                ...
    2020-07-28  103.621322  103.698731  ...  27163600         NaN
    2020-07-29  103.786129  105.039660  ...  28450800         NaN
    2020-07-30  103.628814  106.105921  ...  30888000         NaN
    2020-07-31  105.509108  107.539235  ...  38608000         NaN
    2020-08-03  107.199633  110.857861  ...  41272000         NaN
    2020-08-04  110.370919  112.146339  ...  31033600         NaN
    2020-08-05  112.308666  113.584670  ...  24992400         NaN
    2020-08-06  113.364923  113.447331  ...  24431600         NaN
    2020-08-07  112.992859  114.913114  ...  34251600         NaN
    2020-08-10  113.210097  113.949231  ...  42779600  109.007021
    2020-08-11  110.608158  111.237419  ...  35451200  109.640780
    2020-08-12  109.779123  114.536057  ...  46441200  110.614391
    2020-08-13  115.325137  117.080583  ...  37446000  111.442423
    2020-08-14  115.165314  116.910770  ...  36643600  112.390564
    2020-08-17  118.374052  123.952535  ...  62130000  113.715763

    [15 rows x 6 columns]
In the first line of code, we take the Close column and calculate a rolling mean over a window of 10 periods (the current row plus the 9 before it). This makes use of the pandas rolling method. Note the NaN values in the first 9 rows of the new SMA_10 column. These exist because there were not enough previous data points to make a calculation.
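The warm-up behavior is easier to see on a tiny series. The following sketch uses made-up prices and a 3-period window. It also shows the min_periods argument of rolling, which is one way to get partial averages instead of NaN during warm-up. Whether partial averages are appropriate depends on the use case.

```python
import pandas as pd

close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])  # made-up prices

# Default behavior: a value appears only once 3 observations are available.
sma_3 = close.rolling(window=3).mean()

# min_periods=1 averages whatever is available so far instead of returning NaN.
sma_3_partial = close.rolling(window=3, min_periods=1).mean()

print(sma_3.tolist())          # [nan, nan, 11.0, 12.0, 13.0]
print(sma_3_partial.tolist())  # [10.0, 10.5, 11.0, 12.0, 13.0]
```

A window of n always leaves n - 1 leading NaN values by default, which matches the 9 empty rows for the 10-period average above.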
Method 2: Using the pandas_ta Library
Pandas is a beast when it comes to scientific calculations. It offers a wide array of statistical and mathematical functions that can be used to calculate just about anything in a wildly efficient manner.
However, Pandas isn't designed specifically for calculating technical indicators, and algorithmic traders may find its syntax cumbersome. Fortunately, the pandas_ta library integrates with DataFrames natively and makes adding technical indicators a breeze. Consider the following code that adds the 5, 10, and 20-period Simple Moving Average calculated from the daily closing price:
    # import required library
    import pandas_ta as ta

    # Add indicators, using data from before
    df.ta.sma(close='close', length=5, append=True)
    df.ta.sma(close='close', length=10, append=True)
    df.ta.sma(close='close', length=20, append=True)

    # View Result
                      open        high         low       close    volume  \
    date
    2020-07-28  103.621322  103.698731  101.973248  102.035675  27163600
    2020-07-29  103.786129  105.039660  103.349140  104.532753  28450800
    2020-07-30  103.628814  106.105921  102.832245  106.016022  30888000
    2020-07-31  105.509108  107.539235  104.208132  106.023506  38608000
    2020-08-03  107.199633  110.857861  107.027333  109.973892  41272000
    ...                ...         ...         ...         ...       ...
    2021-07-21  188.820007  195.270004  187.419998  194.100006  37101700
    2021-07-22  196.419998  198.869995  192.759995  195.940002  32382600
    2021-07-23  196.559998  197.000000  192.500000  195.580002  19542900
    2021-07-26  193.110001  194.419998  189.139999  192.940002  20373800
    2021-07-27  192.649994  196.199997  187.410004  192.080002  23994571

                     SMA_5      SMA_10      SMA_20
    date
    2020-07-28         NaN         NaN         NaN
    2020-07-29         NaN         NaN         NaN
    2020-07-30         NaN         NaN         NaN
    2020-07-31         NaN         NaN         NaN
    2020-08-03  105.716370         NaN         NaN
    ...                ...         ...         ...
    2021-07-21  187.858002  194.486000  196.781624
    2021-07-22  189.113501  194.177251  197.049999
    2021-07-23  191.907501  193.685001  197.226250
    2021-07-26  192.936002  192.466501  197.357750
    2021-07-27  194.128003  191.424501  196.969250
Here we've used pandas_ta's native integration with Pandas DataFrames via the DataFrame.ta accessor. This lets us easily add the simple moving averages via the pandas_ta sma function. Note that the column and index names in the output are now lowercase, which is why close='close' is passed. Pay close attention to our use of the append=True argument. Without it, the newly calculated indicator won't be added to our existing DataFrame and will only be returned as a pandas Series object. This is pretty convenient but becomes a syntactic nightmare when adding lots of indicators. Fortunately, pandas_ta has a novel Strategy class to help facilitate more modular code. Consider the following:
    # Create a pandas_ta strategy
    moving_averages = ta.Strategy(
        name="SMA_5_10_20",
        ta=[
            {"kind": "sma", "length": 5},
            {"kind": "sma", "length": 10},
            {"kind": "sma", "length": 20}
        ]
    )

    # Disable multiprocessing
    df.ta.cores = 0

    # Add bulk indicators
    df.ta.strategy(moving_averages, append=True)

                      open        high         low       close    volume  \
    date
    2020-07-28  103.621322  103.698731  101.973248  102.035675  27163600
    2020-07-29  103.786129  105.039660  103.349140  104.532753  28450800
    2020-07-30  103.628814  106.105921  102.832245  106.016022  30888000
    2020-07-31  105.509108  107.539235  104.208132  106.023506  38608000
    2020-08-03  107.199633  110.857861  107.027333  109.973892  41272000
    ...                ...         ...         ...         ...       ...
    2021-07-21  188.820007  195.270004  187.419998  194.100006  37101700
    2021-07-22  196.419998  198.869995  192.759995  195.940002  32382600
    2021-07-23  196.559998  197.000000  192.500000  195.580002  19542900
    2021-07-26  193.110001  194.419998  189.139999  192.940002  20373800
    2021-07-27  192.649994  196.199997  187.410004  192.080002  23994571

                     SMA_5      SMA_10      SMA_20
    date
    2020-07-28         NaN         NaN         NaN
    2020-07-29         NaN         NaN         NaN
    2020-07-30         NaN         NaN         NaN
    2020-07-31         NaN         NaN         NaN
    2020-08-03  105.716370         NaN         NaN
    ...                ...         ...         ...
    2021-07-21  187.858002  194.486000  196.781624
    2021-07-22  189.113501  194.177251  197.049999
    2021-07-23  191.907501  193.685001  197.226250
    2021-07-26  192.936002  192.466501  197.357750
    2021-07-27  194.128003  191.424501  196.969250

    [252 rows x 8 columns]
Here we see a very reusable approach to applying moving averages to DataFrames via the pandas_ta library. Take note of the df.ta.cores = 0 line. By default, pandas_ta will use multiprocessing to apply indicators in bulk.
This requires running the code under the if __name__ == "__main__" convention to support Windows systems. To leverage the power of multiprocessing for the addition of many indicators, simply call the function from within the main process:
    import pandas_ta as ta
    import yfinance as yf

    if __name__ == '__main__':

        # Getting the data
        nvda = yf.Ticker('NVDA')
        df = nvda.history(period='1y')[['Open', 'High', 'Low', 'Close', 'Volume']]

        # Create a pandas_ta strategy
        moving_averages = ta.Strategy(
            name="SMA_5_10_20",
            ta=[
                {"kind": "sma", "length": 5},
                {"kind": "sma", "length": 10},
                {"kind": "sma", "length": 20}
            ]
        )

        # Add bulk indicators (multiprocessing left enabled)
        # df.ta.cores = 0
        df.ta.strategy(moving_averages, append=True)
When adding dozens of indicators, it's generally recommended to leverage multiprocessing. This can speed up backtesting and data processing considerably, especially for more complex indicator calculations.
Plotting Moving Averages in Python
Making the calculations for moving averages is a breeze in Python—especially with the pandas_ta library. These data are perfect for integrating with larger trading strategies or developing custom machine learning models. However, DataFrames full of numbers don’t offer much in the way of visualization. Let’s take a look at how to create a visualization of some moving averages in Python using Pandas, pandas_ta, and Plotly.
    import pandas_ta as ta
    import yfinance as yf
    import plotly.graph_objects as go

    # Get the data
    df = yf.Ticker('BTC-USD').history(period='6mo')[['Open', 'Close', 'High', 'Low', 'Volume']]

    # Add the indicators
    moving_averages = ta.Strategy(
        name="moving indicators",
        ta=[
            {"kind": "sma", "length": 10},
            {"kind": "ema", "length": 5},
        ]
    )

    # Disable multiprocessing, calculate averages
    df.ta.cores = 0  # optional, but requires if __name__ == "__main__" syntax if not set to 0
    df.ta.strategy(moving_averages)

    # Create the Plot
    fig = go.Figure(data=[
        go.Candlestick(
            x=df.index,
            open=df['open'],
            high=df['high'],
            low=df['low'],
            close=df['close'],
            increasing_line_color='#ff9900',
            decreasing_line_color='black',
            showlegend=False,
        ),
    ])

    # Make it pretty
    layout = go.Layout(
        plot_bgcolor='#efefef',
        # Font Families
        font_family='Monospace',
        font_color='#000000',
        font_size=20,
        xaxis=dict(
            rangeslider=dict(
                visible=False
            ))
    )
    fig.update_layout(layout)

    # Display (in browser by default)
    fig.show()
There are several things going on here.
- Using the yfinance library to get pricing data for $BTC-USD for the past 6 months.
- Adding two moving average indicators—a 10-period SMA and 5-period EMA.
- Creating a Candlestick class figure in Plotly
- Updating the chart options for aesthetic purposes
- Opening the result in the system default HTML viewer (Chrome, Firefox, Opera, etc.)
This produces the following results:
This is pretty rad, but where are the moving averages? They aren't visible because we calculated them via pandas_ta but never added them to the Plotly figure. Fortunately, adding extra “traces” to our figure is simple using the Plotly API. Consider the following example code:
    ...

    # Add the SMA 10
    fig.add_trace(
        go.Scatter(
            x=df.index,
            y=df['SMA_10'],
            line=dict(color='#ff9900', width=2),
            name='SMA_10'
        )
    )

    # Add the EMA 5
    fig.add_trace(
        go.Scatter(
            x=df.index,
            y=df['EMA_5'],
            line=dict(color='#000', width=2),
            name='EMA_5'
        )
    )

    # See things again
    fig.show()
This adds two “traces” to the Plotly figure—the SMA 10 and EMA 5 lines. Let’s see what we have now:
In this visualization, we can see the 10-period simple moving average (orange line) charted against the 5-period exponential moving average (black line). Visualized together, we can see a shift in momentum in many areas where the EMA crosses the SMA. Given that the EMA reacts faster to changes in price momentum, this crossover pattern is often used as a composite technical indicator to forecast emerging price trends.
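Crossover events can also be located programmatically rather than by eye. The sketch below is one simple way to do it with pandas. The helper name crossovers and the made-up prices are assumptions, and the short spans (2 and 4) are chosen only to fit the small sample instead of the 5 and 10 periods used in the chart.

```python
import pandas as pd


def crossovers(fast, slow):
    """Return +1 where fast crosses above slow, -1 where it crosses below, else 0."""
    valid = fast.notna() & slow.notna()
    # 1.0 when fast is above slow, 0.0 otherwise, NaN while either line is warming up.
    above = (fast > slow).astype(float).where(valid)
    # diff() is +1 on an upward cross and -1 on a downward cross.
    # The first valid row has no prior state, so it is treated as "no signal".
    return above.diff().fillna(0).astype(int)


# Made-up closing prices that rise, fall, then rise again
close = pd.Series([10, 11, 12, 13, 12, 11, 10, 9, 10, 11, 12, 13], dtype=float)

ema_fast = close.ewm(span=2, adjust=False).mean()  # faster-reacting EMA
sma_slow = close.rolling(window=4).mean()          # slower SMA

signals = crossovers(ema_fast, sma_slow)
print(signals[signals != 0])
```

On this sample, the EMA drops below the SMA at index 5, shortly after prices start falling, and climbs back above it at index 9, once prices recover.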
Review
Moving averages are great tools for forecasting momentum shifts in observed values over a period of time. We’ve seen how Python can easily facilitate the calculation and visualization of these technical indicators in such a way as to provide valuable (and actionable) insights. On their own, moving averages smooth out volatility to reflect momentum trends.
When used in tandem, or combined with other statistical methods, they can become even more powerful tools to forecast and predict future outcomes. For example, incorporating moving averages as features in linear regression modeling can provide a more robust predictive capability. Once one starts engineering composite features with multiple moving averages the sky may be the only limit for improvements in predictive accuracy!