Plotly Fundamentals - Map Plots
In this tutorial section we will look into map plots. Typically, producing charts with a map in the background involves rather inconvenient hacks and having to hook in some third-party service via an API key. Plotly comes with numerous features that make all that obsolete. Besides our typical import of the Pandas library we will only need the Plotly Express interface.
import pandas as pd
import plotly.express as px
We will use the gapminder dataset, which became famous through a TED talk on statistical facts about the socioeconomic development of countries around the world. Google it in case you have not seen it. It’s awesome, thank me later.
geo = px.data.gapminder().query("year==2007")
geo.head()
| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 11 | Afghanistan | Asia | 2007 | 43.828 | 31889923 | 974.580338 | AFG | 4 |
| 23 | Albania | Europe | 2007 | 76.423 | 3600523 | 5937.029526 | ALB | 8 |
| 35 | Algeria | Africa | 2007 | 72.301 | 33333216 | 6223.367465 | DZA | 12 |
| 47 | Angola | Africa | 2007 | 42.731 | 12420476 | 4797.231267 | AGO | 24 |
| 59 | Argentina | Americas | 2007 | 75.320 | 40301927 | 12779.379640 | ARG | 32 |
The two types of map plots we will go over are bubble maps and so-called choropleth maps. The former is rather straightforward, while the latter requires some additional work upfront. More on that later.
As usual, we start by creating a figure object, this time with the scatter_geo function. We pass it a slice of the gapminder dataset and, in the size parameter, the name of the column whose values should determine the bubble sizes. As with other Express plots, we can provide additional inputs such as color to add more dimensions to the plot. Map plots also need a few extra inputs. First, locations indicates which column of the dataset contains the geographical data to use. Second, locationmode tells Plotly how the values in the locations column are mapped to regions on the map. In this example we use ‘ISO-3’, a standardised format of three-letter country codes. Third, the projection parameter controls the appearance of the map (check out the official Plotly documentation for a comprehensive overview).
fig = px.scatter_geo(geo, locations="iso_alpha", locationmode="ISO-3", color="continent",
                     hover_name="country", size="pop",
                     projection="natural earth", height=800)
fig.show()
Maybe you want to display your data only for a specific region. You can achieve that with the scope parameter, which accepts several major regions of the world such as Europe, Africa and South America.
geo_europe = geo[geo['continent']== 'Europe']
fig = px.scatter_geo(geo_europe, locationmode="country names", locations="country", color="country",
                     hover_name="country", size="gdpPercap", scope="europe",
                     projection="natural earth", height=800)
fig.show()
As cool as that is, it is also rather restrictive. Choropleth plots offer a much more flexible way to display data indexed by geographic regions. Such plots are made up of polygons whose corners are provided as part of the Plotly function call. You can then color those polygons based on variations in the data you provide. Obviously, those coordinates need to come from somewhere. Fortunately, so-called GeoJSON files can be downloaded for free for most regions of the world. To demonstrate, we will use a GeoJSON file which contains the outlines of the German federal states as longitude/latitude pairs. Using the urllib and json packages we can load the file and print how it is structured. Most importantly, Plotly choropleth plots require GeoJSON files whose features are indexed by a unique attribute. In our example this is the ‘id’. We will need it later to join the location data with the data values we want to display.
from urllib.request import urlopen
import json
with urlopen('https://raw.githubusercontent.com/isellsoap/deutschlandGeoJSON/main/2_bundeslaender/4_niedrig.geo.json') as response:
    counties = json.load(response)
counties["features"][0]
{'type': 'Feature',
'id': 0,
'properties': {'id': 'DE-BW', 'name': 'Baden-Württemberg', 'type': 'State'},
'geometry': {'type': 'MultiPolygon',
'coordinates': [[[[9.650460243225211, 49.7763404846192],
[9.637469291687069, 49.69042205810558],
[9.671889305114973, 49.683181762695426],
[9.715788841247615, 49.724449157714844],
[9.73471069335966, 49.683601379394645],
[9.795839309692326, 49.72497940063482],
[9.836868286132812, 49.69924926757807],
[9.83308029174799, 49.65090179443382],
[9.880538940429688, 49.60306167602539],
[9.825579643249796, 49.550659179687614],
[9.865819931030273, 49.53993988037121],
[9.921249389648722, 49.58118057250982],
[9.90511894226097, 49.5551109313966],
[9.933589935302848, 49.55537033081055],
[9.94201946258545, 49.47780990600609],
[10.021670341491813, 49.47856903076166],
[10.070828437805176, 49.54175186157238],
[10.089150428772086, 49.504989624023665],
[10.129039764404354, 49.50531005859375],
[10.102548599243107, 49.442440032958984],
[10.17243862152128, 49.39152908325195],
[10.115939140319995, 49.376281738281364]
...
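To make the structure above easier to scan, here is a small sketch that lists the key, name and geometry type of every feature. It runs on a tiny stand-in dictionary shaped like the printed feature, with the outline cut down to its first three points and closed into a ring. The helper name feature_index is my own and not part of any library.

```python
# Stand-in with the same shape as the feature printed above.
# The outline is shortened to three points plus the closing point.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "id": 0,
            "properties": {"id": "DE-BW", "name": "Baden-Württemberg", "type": "State"},
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [[[
                    [9.650460243225211, 49.7763404846192],
                    [9.637469291687069, 49.69042205810558],
                    [9.671889305114973, 49.683181762695426],
                    [9.650460243225211, 49.7763404846192],
                ]]],
            },
        }
    ],
}


def feature_index(geojson):
    """Map each feature's top-level id to its name and geometry type."""
    return {
        feature["id"]: (feature["properties"]["name"], feature["geometry"]["type"])
        for feature in geojson["features"]
    }


index = feature_index(sample)
print(index)
```

Calling feature_index(counties) on the downloaded file should work the same way, assuming every feature carries an id and a name like the one shown.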
Besides our GeoJSON file we will need some data. Here I downloaded unemployment data from a German government website and saved it to my machine. It is important that your dataset is indexed using the same unique keys that are used in the GeoJSON file (in our case ‘id’). Let’s have a look at the data using the pandas head function.
unemp = pd.read_csv("unemployment.csv", dtype={"id": str}, delimiter=";")
unemp.head()
| | county | id | unemp |
|---|---|---|---|
| 0 | Baden-Württemberg | 0 | 3.9 |
| 1 | Bayern | 1 | 3.5 |
| 2 | Berlin | 2 | 9.8 |
| 3 | Brandenburg | 3 | 5.9 |
| 4 | Bremen | 4 | 10.7 |
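Since the join depends on matching keys, a quick sanity check before plotting can save some head-scratching. The sketch below compares the ids in the data with the feature ids in a GeoJSON dictionary. Two things are assumptions on my side: the helper name compare_keys is made up, and comparing the keys as strings is simply a choice for this check (our CSV is read with id as a string while the printed feature shows an integer id). The stand-in GeoJSON is hypothetical and carries ids only.

```python
def compare_keys(data_ids, geojson):
    """Return (ids only in the data, ids only in the GeoJSON), compared as strings."""
    feature_ids = {str(feature["id"]) for feature in geojson["features"]}
    data_ids = {str(value) for value in data_ids}
    return sorted(data_ids - feature_ids), sorted(feature_ids - data_ids)


# Hypothetical stand-in: three features with ids 0, 1, 2 and no geometry.
sample_geojson = {
    "type": "FeatureCollection",
    "features": [{"type": "Feature", "id": i} for i in range(3)],
}

# The first four ids from the table above, as strings like in the CSV.
only_in_data, only_in_geojson = compare_keys(["0", "1", "2", "3"], sample_geojson)
print(only_in_data, only_in_geojson)
```

With the real objects this would be compare_keys(unemp["id"], counties). Two empty lists mean every row has a polygon and every polygon has a row.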
With all the ingredients in place, we can now use the choropleth_mapbox function to draw our map and color the polygons. Besides the data and the GeoJSON file we provide some additional inputs for stylistic purposes. In particular, always center your map on the country or region of interest by passing latitude/longitude values to center, and adjust the zoom parameter so that viewers do not have to do that manually with the chart's controls. Plotly provides a variety of Mapbox styles free of charge, without any additional keys needed. These styles let you tweak what the background map looks like. The Plotly documentation lists the free mapbox_style values, but you can also use Mapbox to create your own style, even though that might not come for free.
fig = px.choropleth_mapbox(unemp, geojson=counties, locations='id', color='unemp',
                           color_continuous_scale="brwnyl",
                           range_color=(3, 11),
                           mapbox_style="white-bg",
                           zoom=5, center={"lat": 51.179343, "lon": 10.173340},
                           opacity=0.75,
                           labels={'unemp': 'unemployment rate'},
                           )
fig.update_layout(height = 600, margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
With map plots in the books we have covered all of the most common chart types. I highly recommend checking out the Plotly charts gallery for a ton of additional examples. While the charts we have explored are all responsive by default, you might want to customise how those elements behave. This is achieved with Plotly controls, which will be covered in the next chapter.