The three rules of real estate: location, location, location (2024)

In this post, we will explore different libraries that have helped us extract, store, and analyze geospatial data. When working on a pricing model for the real estate market, the standard type of data that is available describes the characteristics of houses/apartments (e.g. the size, the status, the number of floors, the address).


Although this information is very useful, we quickly realized that the model needs to know more about the location of a property. In particular, we noticed that information such as the distance to a school, a university, or a park can be very useful when calculating the price. This led us to explore which tools are available for obtaining, downloading, and storing geospatial data.

Before jumping into the details of where this data can be found and how it can be stored, a short introduction to geospatial data is in order. Geospatial data gives information about the location (exact or approximate) and state of different objects or phenomena. It can be represented in two ways:

· As vectors: these are objects such as points, lines, or polygons that have a well-defined/exact location.

· As rasters: these are grids of cells, each holding a value, used for phenomena that vary continuously and have no exact boundaries. An example of this data type is a heat map modelling temperature.

In order to manage this kind of data in an efficient manner, a special framework was created. This framework is called GIS (Geographic Information System) and it is a “system designed to capture, store, manipulate, analyze, manage, and present all types of geographical data”.

In this article, we will only discuss storing and manipulating vectors.

The first step before using geospatial data is finding the location of what are called points of interest (PoI). These can be schools, universities, parks, train stations, etc. To obtain them, it is possible to use OpenStreetMap, an open-source project with features similar to Google Maps that allows people to get information about the location of different PoIs.

OpenStreetMap contains 3 types of elements. These are nodes, ways, and relations.

· The nodes correspond to points in space. Some examples of these include train or subway stations, schools, universities etc.

· Ways can be of two kinds: open and closed. Open ways correspond to linear features (e.g. streets, train platforms), while closed ways correspond to bounded areas (e.g. gardens, parks).

· Relations are normally used to explain how different elements are connected and they are represented by multi-polygons or other geometry collections.
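
As a rough illustration of the three element types, here is a minimal Python sketch. The class and field names are our own simplification for this post, not an official OpenStreetMap schema. The check for closed ways relies on the convention that a closed way repeats its first node at the end. Note that not every closed way in OpenStreetMap describes an area, so the tags still matter.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single point in space (e.g. a subway station)."""
    id: int
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of node ids; open (a line) or closed (a ring)."""
    id: int
    node_ids: list
    tags: dict = field(default_factory=dict)

    def is_closed(self) -> bool:
        # A closed way starts and ends on the same node.
        return len(self.node_ids) > 2 and self.node_ids[0] == self.node_ids[-1]


@dataclass
class Relation:
    """A group of elements; members are (element_type, element_id, role) tuples."""
    id: int
    members: list
    tags: dict = field(default_factory=dict)


# Hypothetical ids and tags, for illustration only.
street = Way(id=1, node_ids=[10, 11, 12], tags={"highway": "residential"})
park = Way(id=2, node_ids=[20, 21, 22, 20], tags={"leisure": "park"})
```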


In order to find the PoIs that you are interested in, it is possible to use Overpass turbo, a data mining tool used to extract information from OpenStreetMap. This allows you to query data about different types of points of interest in different cities. For example, below you can see a query for parks in the city of Milan and its result.

[out:json][timeout:25];
{{geocodeArea:Milan}}->.searchArea;
(
//query part for 'leisure=park'
way['leisure'='park'](area.searchArea);
relation['leisure'='park'](area.searchArea);
);
out body;
>;
out skel qt;

Besides showing the results of the query on an interactive map that the user can fully explore, Overpass turbo also lets you save the query, the raw data behind the map, and the map itself in different formats. This can be done using the export button at the top of the page. Other functionalities of the Overpass turbo platform include sharing a link to a query and its results, and saving and reloading earlier queries.
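
If you prefer to run such a query from a script rather than from the web interface, the sketch below shows one possible way to do it with the Python standard library. It rests on a few assumptions: it targets the public Overpass API endpoint at overpass-api.de, and it replaces the {{geocodeArea:Milan}} shortcut, which is an Overpass turbo convenience, with a plain area filter on the name tag. That filter may match more than one area, so the results can differ from what Overpass turbo shows. The helper names build_query and run_query are ours.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass API endpoint (assumption: any Overpass instance would work).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_query(area_name: str, key: str, value: str, timeout: int = 25) -> str:
    """Build an Overpass QL query for ways and relations tagged key=value."""
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'area["name"="{area_name}"]->.searchArea;\n'
        "(\n"
        f'  way["{key}"="{value}"](area.searchArea);\n'
        f'  relation["{key}"="{value}"](area.searchArea);\n'
        ");\n"
        "out body;\n"
        ">;\n"
        "out skel qt;"
    )


def run_query(query: str) -> dict:
    """Send the query to the Overpass API and return the parsed JSON response."""
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=60) as response:
        return json.load(response)


# Building the query does not touch the network; run_query(query) would.
query = build_query("Milano", "leisure", "park")
```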

The easiest way to save the geospatial data obtained using OSM and Overpass turbo is as a GeoJSON. GeoJSON is a file format designed for storing geographical features, including their non-geographical information. It allows users to store points, lines (open ways), polygons (closed ways), and combinations of the three as multi-parts.
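
To make the format more concrete, here is a small Python sketch that builds a GeoJSON FeatureCollection containing one polygon and round-trips it through the json module. The coordinates and the park name are invented for illustration. Note that GeoJSON stores positions as [longitude, latitude], and that the ring of a polygon ends on the same position it starts from.

```python
import json

# A made-up rectangular park; positions are [longitude, latitude].
park = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [9.170, 45.470],
            [9.180, 45.470],
            [9.180, 45.476],
            [9.170, 45.476],
            [9.170, 45.470],  # the ring closes on its first position
        ]],
    },
    # Non-geographical information lives in "properties".
    "properties": {"leisure": "park", "name": "Example Park"},
}

collection = {"type": "FeatureCollection", "features": [park]}

# Serialize to text (what would be written to a .geojson file) and read it back.
text = json.dumps(collection, indent=2)
loaded = json.loads(text)
ring = loaded["features"][0]["geometry"]["coordinates"][0]
```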

Storing the geospatial data in a GeoJSON file can be a good solution. However, it is not a very structured one, especially if this data will be used often (e.g. to calculate distances). An alternative solution is to store this data in a database. There are several database systems that allow you to do so, but in this post, we will focus on PostGIS.

PostGIS is a spatial extension of PostgreSQL that allows the user to store geospatial data in all the formats specified above (points, lines, polygons etc.). Furthermore, PostGIS allows you to store any data format as one of two types:

· Geometry: In this case, the data will be represented on the Cartesian plane (2D).

· Geography: In this case, the data will be represented on a model of the Earth's curved surface, with coordinates expressed as longitude and latitude.


When creating a new column in a PostGIS database it is necessary to specify its type (geography or geometry). In general, when working with data that is not spread over a very large area, it might be convenient to declare it as geometry, since the curvature of the Earth will not make a big difference to the calculations, and these will be faster. However, if the data is spread over a larger area, storing it as geography makes the calculations more precise, at the cost of slower performance.
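
The trade-off can be illustrated outside the database with a small Python sketch. It compares a flat-plane distance with a great-circle (haversine) distance for a short and a long trip. This is only an analogy under simplifying assumptions: the Earth is treated as a sphere with a mean radius of 6,371 km, whereas PostGIS geography calculations use a spheroid by default, and the coordinates below are approximate.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def planar_m(lon1, lat1, lon2, lat2):
    """Flat-plane approximation: treat the area around the mean latitude as flat."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    x = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_M
    y = math.radians(lat2 - lat1) * EARTH_RADIUS_M
    return math.hypot(x, y)


def relative_error(a, b):
    """Relative difference between the planar and great-circle distances."""
    curved = haversine_m(*a, *b)
    return abs(planar_m(*a, *b) - curved) / curved


# Approximate (longitude, latitude) pairs.
duomo = (9.1919, 45.4642)
centrale = (9.2040, 45.4861)
new_york = (-74.0060, 40.7128)

short_trip_error = relative_error(duomo, centrale)  # within one city
long_trip_error = relative_error(duomo, new_york)   # across an ocean
print(f"short trip: {short_trip_error:.6%}, long trip: {long_trip_error:.2%}")
```

Within a city the two results are practically identical, while over intercontinental distances the flat-plane result drifts by several percent, which is the intuition behind choosing geometry for local data and geography for data spread over a large area.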

Another interesting and useful attribute of PostGIS is that it has many built-in functions that can easily be applied to the data. Below are a few examples:

· ST_GeomFromGeoJSON(text geomjson) converts the GeoJSON representation of a point, line, or polygon into a geometry, which can then be stored in a geometry column of a database.

· ST_Contains(geometry geomA, geometry geomB) checks whether the entire geometry B lies inside geometry A.

· ST_Intersects(geometry geomA, geometry geomB) checks whether two geometries intersect.

· ST_DWithin(geometry g1, geometry g2, double precision distance_of_srid) returns true if g1 and g2 are within the specified distance from each other.

The examples above are specific to geometries. However, equivalent functions can be applied to geographies. For more examples, check the PostGIS documentation at https://postgis.net/docs/.
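
As an example of how ST_GeomFromGeoJSON might be used when loading data, the Python sketch below turns the features of a GeoJSON FeatureCollection into parameters for an INSERT statement. The table poi with columns name and geom is hypothetical, and so is the helper insert_params. The %s placeholders assume a driver such as psycopg2, and the SRID 4326 assumes the usual longitude/latitude coordinates of GeoJSON. No database connection is opened here.

```python
import json

# Hypothetical table: poi(name text, geom geometry).
INSERT_SQL = (
    "INSERT INTO poi (name, geom) "
    "VALUES (%s, ST_SetSRID(ST_GeomFromGeoJSON(%s), 4326))"
)


def insert_params(feature_collection: dict) -> list:
    """Return one (name, geometry_as_json_text) tuple per GeoJSON feature."""
    rows = []
    for feature in feature_collection["features"]:
        # "properties" may be missing or null in valid GeoJSON.
        properties = feature.get("properties") or {}
        geometry_json = json.dumps(feature["geometry"])
        rows.append((properties.get("name"), geometry_json))
    return rows


# Invented sample data, for illustration only.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [9.1919, 45.4642]},
            "properties": {"name": "Duomo"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [9.2040, 45.4861]},
            "properties": None,
        },
    ],
}

rows = insert_params(sample)
```

With psycopg2, the rows could then be written with cursor.executemany(INSERT_SQL, rows) on an open connection.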

To make it clearer how these functions can be used in your daily work, let’s have a look at a few examples. Say we want to find the apartments in our database that are closest to some specific point. The query below will show all the apartments ordered by distance from that point.

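-- "apartments" is a placeholder table name (TABLE itself is a reserved word);
-- coordinates_apartment is assumed to be a geography column.
SELECT id_apartment
FROM apartments
ORDER BY ST_Distance(coordinates_apartment, 'POINT(9.2 43.6)'::geography);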

Another example checks which area of the city a specific apartment belongs to. zone_polygon stores the polygons corresponding to each area of the city, zone stores the names corresponding to the polygons in zone_polygon, and the apartment we are trying to locate is at latitude=43.6, longitude=9.2. Note that the point is written with the longitude first.

-- "zones" is a placeholder table name; zone_polygon is assumed to be a geography column.
SELECT zone
FROM zones
WHERE ST_Intersects(zone_polygon, 'POINT(9.2 43.6)'::geography);

The example above shows how useful these functions are: without them, many more mathematical calculations would have to be performed by hand in order to check whether a point is inside a polygon.

Hopefully, this gives you an overview of how it is possible to find and store geospatial data. Let us know if you have ever used similar tools and how you found them.
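
To give a sense of the manual work that such a function replaces, here is a minimal Python sketch of the classic ray-casting test for a point in a polygon. It is not how PostGIS implements its functions. It treats coordinates as planar, ignores holes and points lying exactly on an edge, and the zone below is an invented rectangle around the example point.

```python
def point_in_polygon(lon, lat, ring):
    """Ray casting: count how many polygon edges a ray going east from the point crosses.

    ring is a list of (lon, lat) vertices; an odd number of crossings means inside.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses the point's latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular zone, as (longitude, latitude) vertices.
zone = [(9.0, 43.4), (9.4, 43.4), (9.4, 43.8), (9.0, 43.8), (9.0, 43.4)]

apartment_in_zone = point_in_polygon(9.2, 43.6, zone)
far_point_in_zone = point_in_polygon(9.5, 43.6, zone)
```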

This article was written by Ioana Gherman and published by Casavo on July 2, 2020.

