Reprojecting Data#
This section introduces the concept of reprojection — converting data from one coordinate reference system (CRS) to another. We’ll see why reprojection is essential, how to estimate the correct UTM zone, and how to assign or transform CRS in practice using GeoPandas and OSMnx.
Import libraries#
import geopandas as gpd
import pandas as pd
import osmnx as ox
Load City Boundary#
admin_border = ox.geocode_to_gdf('Vienna, Austria')
admin_border.explore(tiles='cartodbpositron')
Estimating UTM#
When working with cities or local areas, it’s often important to reproject geographic data into a UTM zone.
Why? Because geographic coordinates (latitude and longitude) are not measured in consistent distances — one degree of longitude can mean very different lengths depending on how far you are from the equator.
UTM (Universal Transverse Mercator) zones use meters instead of degrees, which makes distance, area, and direction calculations much more accurate for local-scale analysis — like city planning, infrastructure, or urban mapping.
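To make the point about degrees concrete, here is a minimal sketch that approximates the east-west length of one degree of longitude at a few latitudes. It assumes a spherical Earth and the commonly quoted value of about 111.32 km per degree at the equator, so the numbers are rough estimates rather than geodetic results. The function name and the sample latitudes are chosen only for illustration.

```python
import math

# Approximate length of one degree of longitude at the equator (spherical Earth assumption)
EQUATOR_KM_PER_DEGREE = 111.32


def km_per_degree_longitude(latitude_deg):
    """Approximate east-west length (km) of one degree of longitude at a given latitude."""
    return EQUATOR_KM_PER_DEGREE * math.cos(math.radians(latitude_deg))


# Illustrative latitudes: the equator, roughly Vienna, and a high northern latitude
for name, lat in [("Equator", 0.0), ("Vienna", 48.21), ("70 degrees north", 70.0)]:
    print(f"{name:18s} {km_per_degree_longitude(lat):6.1f} km per degree")
```

At Vienna's latitude a degree of longitude is roughly 74 km wide, about two thirds of its width at the equator, which is why distances computed directly from degrees are misleading.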
You can calculate a UTM zone yourself based on a point’s coordinates.
To find the zone number, you only need the longitude.
The latitude will tell you whether the point is in the northern or southern hemisphere.
def utm_zone(longitude):
    zone = int((longitude + 180) / 6) + 1
    return zone
What this function does:
It calculates the UTM zone number based on a given longitude.
The Earth is divided into 60 UTM zones, each 6 degrees of longitude wide.
The formula int((longitude + 180) / 6) + 1 shifts the longitude range from [-180, +180] to [0, 360] and then divides it by 6 to find out which zone it belongs to. The result is rounded down using int(...) and adjusted by +1 because UTM zones are numbered starting from 1 (not 0).
Let’s use our function to check the UTM zone number for a location at 30.33 degrees east longitude.
longitude = 30.33
zone = utm_zone(longitude)
print("UTM zone:", zone)
UTM zone: 36
Actually, the function above was just a fun way to practice writing code — there’s a simpler way to solve this task :)
If you’re working with a dataset, you can easily find the appropriate UTM zone using the .estimate_utm_crs() method.
admin_border.estimate_utm_crs()
<Projected CRS: EPSG:32633>
Name: WGS 84 / UTM zone 33N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 12°E and 18°E, northern hemisphere between equator and 84°N, onshore and offshore. Austria. Bosnia and Herzegovina. Cameroon. Central African Republic. Chad. Congo. Croatia. Czechia. Democratic Republic of the Congo (Zaire). Gabon. Germany. Hungary. Italy. Libya. Malta. Niger. Nigeria. Norway. Poland. San Marino. Slovakia. Slovenia. Svalbard. Sweden. Vatican City State.
- bounds: (12.0, 0.0, 18.0, 84.0)
Coordinate Operation:
- name: UTM zone 33N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
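The EPSG code reported above can also be reproduced by hand, which is a useful sanity check. The sketch below combines the zone formula from earlier with the hemisphere rule: for WGS 84, northern zones use codes 32601 to 32660 and southern zones use 32701 to 32760. The function name utm_epsg is hypothetical, and the simple formula ignores the special zone exceptions around Norway and Svalbard, so treat it as an approximation.

```python
def utm_epsg(longitude, latitude):
    """Return the WGS 84 / UTM EPSG code for a point (simple rule, no zone exceptions)."""
    # Same zone formula as before; min() keeps longitude = 180 inside zone 60
    zone = min(int((longitude + 180) / 6) + 1, 60)
    # Northern hemisphere codes start at 32600, southern hemisphere codes at 32700
    base = 32600 if latitude >= 0 else 32700
    return base + zone


# Approximate coordinates of Vienna (longitude, latitude)
print(utm_epsg(16.37, 48.21))  # 32633, matching the result of estimate_utm_crs()
```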
Reprojecting#
Reprojecting means converting spatial data from one coordinate system to another.
In practice, it means recalculating the coordinates of each point so they fit a different map projection.
But before reprojecting, we need to decide which coordinate system we want to convert to.
Thanks to the estimate_utm_crs() method, we already know the correct CRS for our data.
Let’s save it to a separate variable so we can use it when reprojecting the dataset.
data_crs = admin_border.estimate_utm_crs()
To reproject the data in GeoPandas, we use the .to_crs() method.
This method transforms the geometry of the dataset into a new coordinate reference system (CRS).
We simply pass the target CRS (for example, the one returned by estimate_utm_crs()) to the method.
It recalculates all coordinate values so that the data aligns correctly in the new projection.
admin_border_utm = admin_border.to_crs(data_crs)
print(admin_border_utm.crs)
EPSG:32633
We’ve successfully reprojected our layer into the appropriate UTM zone.
Now, our data is in meters instead of degrees, which makes it much more suitable for performing accurate spatial analysis — such as measuring distances, calculating areas, or working with city-scale infrastructure and planning tasks.
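As a quick illustration of why metric units matter, the sketch below measures the area and perimeter of a rectangle after reprojection. To stay self-contained it does not download anything: the rectangle is a hypothetical bounding box roughly around Vienna, not the real city boundary, so the numbers only show the workflow.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical bounding box roughly around Vienna (min lon, min lat, max lon, max lat)
bbox = gpd.GeoDataFrame(geometry=[box(16.18, 48.12, 16.58, 48.32)], crs="EPSG:4326")

# Reproject to the estimated UTM zone so that units are meters
bbox_utm = bbox.to_crs(bbox.estimate_utm_crs())

# .area and .length are expressed in the units of the CRS (square meters and meters here)
area_km2 = bbox_utm.area.iloc[0] / 1e6
perimeter_km = bbox_utm.length.iloc[0] / 1000

print(f"Area: {area_km2:.0f} km², perimeter: {perimeter_km:.0f} km")
```

Calling .area on the original layer in EPSG:4326 would return values in square degrees, which have no practical meaning; GeoPandas also emits a warning in that case.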
Setting the CRS#
For data without a CRS#
Sometimes we encounter spatial data that doesn’t have a defined coordinate reference system (CRS).
Let’s look at an example using a CSV file with top locations in Vienna.
We’ll read the data and create a GeoDataFrame from it:
poi = pd.read_csv('../data/top_locations_wien.csv', sep=";", decimal=',')
poi_gdf = gpd.GeoDataFrame(poi, geometry=gpd.points_from_xy(poi['geo_longitude'], poi['geo_latitude']))
Let’s check the coordinate reference system of poi_gdf:
print(poi_gdf.crs)
None
The result is None because we didn’t set a CRS when creating the GeoDataFrame.
But sometimes, we simply receive data that comes without a defined CRS.
In such cases, we can assign a CRS using one of two approaches:
Option 1: Use the .set_crs() method
Option 2: Directly assign the CRS using the .crs attribute
poi_gdf = poi_gdf.set_crs(epsg=4326) # option 1
poi_gdf.crs = "EPSG:4326" # option 2
print(poi_gdf.crs)
EPSG:4326
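When you build the GeoDataFrame yourself, the CRS can also be declared at creation time through the crs argument, which avoids the missing-CRS state altogether. The sketch below uses a small hypothetical table in place of the CSV file; the column names follow the example above, while the place names and approximate coordinates are made up for illustration.

```python
import pandas as pd
import geopandas as gpd

# Hypothetical sample rows standing in for the CSV file (approximate coordinates)
poi = pd.DataFrame(
    {
        "name": ["Stephansdom", "Schloss Schönbrunn"],
        "geo_longitude": [16.3738, 16.3122],
        "geo_latitude": [48.2082, 48.1845],
    }
)

# Declare the CRS while constructing the GeoDataFrame
poi_gdf = gpd.GeoDataFrame(
    poi,
    geometry=gpd.points_from_xy(poi["geo_longitude"], poi["geo_latitude"]),
    crs="EPSG:4326",
)

print(poi_gdf.crs)
```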
Setting a CRS vs. Reprojecting#
Reprojecting means converting coordinates from one CRS to another — it changes both the CRS and the actual coordinate values of all geometries in the dataset.
In contrast, when we set a CRS, no coordinate transformation takes place. We’re simply assigning a CRS label to the existing data, telling GeoPandas how to interpret the coordinates that are already there.
⚠️ It’s important to know what CRS the coordinates are originally in.
If you assign the wrong CRS, your data might appear in the wrong part of the world — sometimes even on the wrong continent!
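The difference is easy to see on a single point. In this minimal sketch (the coordinates are an approximate location in Vienna, and the variable names are only illustrative), setting a CRS leaves the numbers untouched, reprojecting changes them, and assigning the wrong CRS makes GeoPandas interpret degrees as meters.

```python
import geopandas as gpd
from shapely.geometry import Point

# A point given in degrees (longitude, latitude), with no CRS attached yet
pt = gpd.GeoSeries([Point(16.37, 48.21)])

# Setting the CRS: only a label is attached, the coordinates stay the same
labelled = pt.set_crs(epsg=4326)
print(labelled.iloc[0].x, labelled.iloc[0].y)

# Reprojecting: the coordinates are recalculated into UTM zone 33N (meters)
reprojected = labelled.to_crs(epsg=32633)
print(round(reprojected.iloc[0].x), round(reprojected.iloc[0].y))

# Assigning the wrong CRS: the degree values are now read as meters in UTM zone 33N
mislabelled = pt.set_crs(epsg=32633)
back_to_degrees = mislabelled.to_crs(epsg=4326)
print(back_to_degrees.iloc[0].x, back_to_degrees.iloc[0].y)
```

The mislabelled point ends up close to the equator, thousands of kilometers away from Vienna, even though no error was raised along the way.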
Choosing a Map Projection#
One of the most commonly used projections for GIS projects focused on urban or local-scale data is the Universal Transverse Mercator (UTM) projection.
It provides accurate distance and area measurements within each zone, making it ideal for city-level mapping and analysis.
For smaller scales — such as mapping all of Russia or the entire world — choosing the right projection becomes more complex.
It largely depends on the geographic extent of your data and the specific goals of your project.
A great resource for understanding map projections is the classic 1987 book Map Projections – A Working Manual by John P. Snyder 📕.
It offers detailed explanations of different projection types, their characteristics, history, and practical use cases.
Summary#
In this section, we learned how to reproject spatial data into appropriate coordinate systems for accurate analysis.
We explored how to:
Estimate the correct UTM zone for a dataset using .estimate_utm_crs()
Use .to_crs() to reproject data from geographic coordinates (degrees) into projected coordinates (meters)
Differentiate between setting a CRS (assigning a label) and reprojecting (actually transforming the coordinates)
Correctly assign CRS information to datasets that come without it
By understanding reprojection, we can ensure that spatial data is measured in the right units and aligned properly — enabling precise calculations of distance, area, and direction.