Tag: qgis

(Fr) Rencontres QGIS-fr – Avignon, June 10–12, 2025

Sorry, this entry is only available in French.

Learn More

The Security Project for QGIS: pledge now!

The “Security Project for QGIS” is now public! Pledge now!

The goal of this project is to pool funding to raise QGIS security to the highest level.

Oslandia and the other involved partners, especially OPENGIS.ch, are open-source “pure players” and main contributors to QGIS. This project is an initiative by Oslandia and is endorsed by the QGIS.org association. We work closely with the community of QGIS developers, users, and stakeholders, and the project involves QGIS core committers willing to advance QGIS security.

Global context

New regulations such as NIS2 and the CRA in Europe, as well as other international and local regulations, will come into force within the next couple of years. They require software and software producers to improve their cybersecurity practices. Open-source software, while usually receiving special treatment, is affected too: the estimated cost impact of the CRA on an open-source project amounts to +30%.

As for QGIS, we consider that the project falls short of what would be sufficient to comply with these regulations. We also do not fulfill the requirements of our end-users in terms of overall software quality regarding security, the processes in place to ensure trust in the supply chain, and the project’s overall security culture.

We have been discussing this topic with clients running large deployments of QGIS and QGIS Server, and they stressed the issue: cybersecurity is one of their primary concerns, and they want to see the QGIS project move forward in this area as soon as possible. QGIS faces the risk of IT departments blocking installations if they consider that the project does not take security seriously enough.

Also, requests to security@qgis.org have grown significantly.

Project goals

Oslandia, together with other partners and backed by clients and end-users, has launched the “Security Project for QGIS”: we identified key topics where security improvements can be achieved, classified them, and created work packages with budget estimates.

  • The main goal is simple: raise the cybersecurity level of the QGIS project
  • Fulfill cybersecurity requirements from regulations and end-users
  • Make QGIS an example of a security-aware open-source project, helping other OSGeo projects improve

While QGIS and QGIS Server are the main components this project focuses on, improving QGIS security as a whole also requires considering the underlying libraries (e.g. GDAL/OGR, PROJ, GEOS…).

This project is a dedicated effort to raise the security level of QGIS. Maintaining security in the long term will require further effort, and we encourage you to sponsor QGIS.org by becoming a sustaining member of QGIS.

Memory safety, signing binaries, supply-chain management, contribution processes, plugin security, cybersecurity audits, and many more topics are included in this project. You can see all items as well as the work packages on the dedicated website:

https://security.qgis.oslandia.com

Project organization – Pledge!

Any organization interested in improving QGIS security can contribute to funding the project. We are looking for an estimated total of 670K€, divided into 3 work packages ➡ Pledge now!

Once funded, Oslandia and its partners will start working on Work Package 1 in 2025. We intend to work closely with the QGIS community, QGIS.org, and interested partners and users. Part of the work consists of improvements to the current system; the rest requires changes to processes and developers’ habits. Working closely with the user and developer community to raise our security awareness is fully part of the project.

We will deliver improvements from 2025 through 2027. You can see the full list of topics, work packages, and estimated budgets on the project’s dedicated page: security.qgis.oslandia.com. You are invited to participate, but also to help spread the word and recruit other contributors!

We especially want to thank Orange France for being a long-time supporter of open source in general and of QGIS in particular, and the first backer of the Security Project for QGIS!

Should you have any questions, or need further material to convince other stakeholders, get in touch!

Learn More

OSM Data: GIS data all the way to the map server

OSM DATA 3D: data ingestion mechanisms through to publication as WFS/WMS streams.
Learn More

Analyzing GTFS Realtime Data for Public Transport Insights

In today’s post, we (that is, Gaspard Merten from the Université Libre de Bruxelles and yours truly) are going to dive deep into how to analyze public transport data, using both schedule and real-time information. This collaboration has been made possible by the EMERALDS project.

Previously, I shared news about the GTFS algorithms for Trajectools that add GTFS preprocessing tools (incl. route, segment, and stop layer extraction) to the QGIS Processing toolbox.

Today, we’ll discuss how we handle realtime GTFS data and how we approach analytics that combine both data sources.

About Realtime GTFS 

Many of us have come to rely on real-time public transport updates in apps like Google Maps. These apps are powered by standardized data formats that ensure different systems can communicate. Google first introduced GTFS in 2005, a format designed to organize transit schedules, stop locations, and other static transit information. Then, in 2011, they introduced GTFS Realtime (GTFS-RT), which added the capability to include live updates on vehicle positions, delays, speeds, and much more.

However, as the name suggests, GTFS Realtime is all about live data. This means that while GTFS-RT APIs are useful for providing real-time insights,  they don’t hold historical data for analytics. Moreover, most transit agencies don’t keep past GTFS-RT records, and even fewer make them available to the public. This can be a significant challenge for anyone looking to analyze past trends and extract valuable insights from the data. For this reason, we had to implement our own solution to efficiently archive GTFS-RT files while making sure the files could be queried easily.

There are two main challenges in the implementation of such a solution:

  • Data Volume: While individual GTFS-RT files are relatively small—typically ranging from 50KB to 500KB depending on the size of the public transport network—the challenge lies in ingestion frequency. With an average file size of 100KB and updates every 5 seconds, a full day’s worth of data quickly scales up to 1.728GB (see the quick check after this list).
  • Data Usability: GTFS-RT is a deeply nested format based on Protobuf, making direct conversion into a more accessible structure like a DataFrame difficult. Efficiently unnesting the data without losing critical details would significantly improve usability and streamline analysis.
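To make the volume figures concrete, here is the back-of-the-envelope check behind that 1.728GB number, as a quick Python sketch using the averages quoted above:

    # Daily GTFS-RT archive volume, from the averages quoted above.
    AVG_FILE_SIZE_KB = 100        # average snapshot size
    FETCH_INTERVAL_S = 5          # one snapshot every 5 seconds

    snapshots_per_day = 24 * 60 * 60 // FETCH_INTERVAL_S      # 17,280 files
    daily_volume_gb = snapshots_per_day * AVG_FILE_SIZE_KB / 1_000_000

    print(f"{snapshots_per_day} files/day ~ {daily_volume_gb:.3f} GB/day")
    # -> 17280 files/day ~ 1.728 GB/day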

Parquet to the Rescue

Storing and analyzing real-time transit data efficiently isn’t just about saving space—it’s about making the data easy to work with. Luckily, modern data formats have come a long way, allowing us to store massive amounts of data while keeping retrieval and analytics processing fast. One of the best tools for the job is Apache Parquet, a columnar storage format originally designed for Hadoop but now widely adopted in data science. With built-in support in libraries like Polars and Pandas, it’s become a go-to choice for handling large datasets efficiently. Moreover, Parquet can be converted to GeoParquet for smoother integration with GIS such as GeoPandas.

What makes Parquet particularly well-suited for GTFS Realtime data is the way it compresses columnar data. It leverages multiple compression algorithms and encodings, significantly reducing file sizes while keeping access speeds high. However, to get the most out of Parquet’s compression, we need to be smart about how we structure our data. Simply converting each GTFS-RT file into its own Parquet file might give us around 60% compression, which is decent. But if we group all GTFS-RT records for an entire hour into a single file, we can push that number up to 95%. The reason? A lot of transit data—like trip IDs and stop locations—doesn’t change much within an hour, while other values, such as coordinates, often share common elements. By organizing data in larger batches, we allow Parquet’s compression algorithms to work their magic, drastically reducing storage needs. And with a smaller disk footprint, retrieval is faster, making the entire analytics pipeline more efficient.
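As a minimal sketch of this batching idea (using Polars; the zstd codec and the toy data are our assumptions, not necessarily what the actual pipelines use):

    import polars as pl

    # Two toy per-snapshot frames standing in for an hour of GTFS-RT records;
    # in the real pipeline each frame comes from the unnesting step described next.
    snap1 = pl.DataFrame({"trip_id": ["A", "B"], "lat": [56.95, 56.96], "ts": [0, 0]})
    snap2 = pl.DataFrame({"trip_id": ["A", "B"], "lat": [56.95, 56.97], "ts": [5, 5]})

    # Concatenating an hour of snapshots into one frame lets Parquet's
    # dictionary and run-length encodings exploit slowly changing columns
    # such as trip_id, instead of compressing each tiny file in isolation.
    merged = pl.concat([snap1, snap2], how="diagonal")
    merged.write_parquet("hour.parquet", compression="zstd")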

One more challenge to tackle is the structure of the data itself. GTFS-RT files tend to be highly nested, which isn’t an issue for Parquet but can be problematic for most data science tools. While Parquet technically supports nested structures, many analytical frameworks don’t handle them well. To fix this, we apply a lightweight preprocessing step to “unnest” the data. In the original GTFS-RT format, the vehicle position feed is deeply nested, making it difficult to work with. But once unnesting is applied, the structure becomes flat, with clear column names derived from the original hierarchy. This makes it easy to convert the data into a table format, ensuring smooth integration with tools commonly used by data scientists.
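A minimal flattening helper along these lines (our own sketch; the pipeline’s actual preprocessing may differ in details such as the separator choice):

    def unnest(record: dict, prefix: str = "") -> dict:
        """Flatten a nested GTFS-RT entity into one level, deriving column
        names from the hierarchy, e.g. {"position": {"latitude": 56.9}}
        becomes {"position_latitude": 56.9}."""
        flat = {}
        for key, value in record.items():
            name = f"{prefix}_{key}" if prefix else key
            if isinstance(value, dict):
                flat.update(unnest(value, name))
            else:
                flat[name] = value
        return flat

    entity = {"id": "1", "vehicle": {"position": {"latitude": 56.9, "longitude": 24.1}}}
    print(unnest(entity))
    # {'id': '1', 'vehicle_position_latitude': 56.9, 'vehicle_position_longitude': 24.1}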

The GTFS-RT Pipelines

With this in mind, let’s walk through the two pipelines we built to store and retrieve GTFS-RT data efficiently.

The entire system relies on two key pipelines that work together. The first pipeline fetches GTFS-RT data from an API every five seconds, processes it, and stores it in an S3 bucket. The second pipeline runs hourly, gathering all the individual files from the past hour, merging them into a single Parquet file, and saving it back to the bucket in a structured format. We will now take a look at each pipeline in more detail.

Pipeline 1: Fetching and Storing Data

The first step in the process is retrieving GTFS-RT data. This is done via an API, which returns files in the Protocol Buffer (ProtoBuf) format. Fortunately, Google provides libraries (such as gtfs-realtime-bindings) that make it easy to parse ProtoBuf and convert it into a more accessible format like JSON. 
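In code, this fetch-and-parse step can look roughly like the following (the feed URL is a placeholder; FeedMessage comes from gtfs-realtime-bindings, and protobuf’s MessageToDict handles the JSON-style conversion):

    import requests
    from google.transit import gtfs_realtime_pb2   # pip install gtfs-realtime-bindings
    from google.protobuf.json_format import MessageToDict

    FEED_URL = "https://example.com/gtfs-rt/vehicle-positions"  # placeholder

    response = requests.get(FEED_URL, timeout=10)
    response.raise_for_status()

    # Parse the binary ProtoBuf payload into a FeedMessage ...
    feed = gtfs_realtime_pb2.FeedMessage()
    feed.ParseFromString(response.content)

    # ... and convert it to plain dicts (JSON-style) for further processing.
    feed_dict = MessageToDict(feed)
    entities = feed_dict.get("entity", [])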

Once we have the data in JSON format, we need to split it based on entity type. GTFS-RT files contain different types of data, such as TripUpdate, which provides updated arrival times for stops, and VehiclePosition, which tracks real-time locations and speeds. Not all GTFS-RT feeds contain every entity type, but TripUpdate and VehiclePosition are the most commonly used. The full list of entity types can be found in the GTFS Realtime documentation.

We separate entity types because they have different schemas, making it difficult to store them in a single Parquet file. Keeping each entity type separate not only improves organization but also enhances compression efficiency. Once split, we apply the same unnesting process as described earlier, ensuring the data is structured in a way that’s easy to analyze. After that, we convert the data into a data frame and store it as a Parquet file in memory before uploading it to an S3 bucket. The files follow a structured naming convention like this:

{feed_type}/YYYY-MM-DD/hour/individual_{date-isoformat}.parquet

This format makes it easy to navigate the storage bucket manually while also ensuring seamless integration with the second pipeline.
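Sketched in code, the split-convert-upload step could look like this (the bucket and function names are hypothetical, unnest() is the flattening helper sketched earlier, and the lowerCamelCase entity keys are what MessageToDict produces by default):

    import io
    from datetime import datetime, timezone

    import boto3
    import polars as pl

    s3 = boto3.client("s3")
    BUCKET = "gtfs-rt-archive"  # hypothetical bucket name

    def store_snapshot(entities: list[dict]) -> None:
        now = datetime.now(timezone.utc)
        # Group entities by type (tripUpdate, vehicle, alert, ...),
        # since each type has its own schema.
        by_type: dict[str, list[dict]] = {}
        for entity in entities:
            for feed_type in ("tripUpdate", "vehicle", "alert"):
                if feed_type in entity:
                    by_type.setdefault(feed_type, []).append(unnest(entity[feed_type]))

        for feed_type, records in by_type.items():
            buffer = io.BytesIO()
            pl.DataFrame(records).write_parquet(buffer)  # Parquet kept in memory
            key = (f"{feed_type}/{now:%Y-%m-%d}/{now:%H}/"
                   f"individual_{now.isoformat()}.parquet")
            s3.put_object(Bucket=BUCKET, Key=key, Body=buffer.getvalue())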

Pipeline 2: Merging and Optimizing Storage

The second pipeline’s job is to take all the small Parquet files generated by Pipeline 1 and merge them into a single, optimized file per hour. To do this, it scans the storage bucket for the earliest unprocessed “hour folder” and begins processing from there. This design ensures that if the pipeline is temporarily interrupted, it can easily resume without skipping any data.

Once it identifies the files to merge, the pipeline loads them, assigns a proper timestamp to each record, and concatenates them into a single Parquet table. The final file is then uploaded to the S3 bucket using the following naming convention:

{feed_type}/YYYY-MM-DD/hour/HH.parquet

If any files fail to merge, they are renamed with the prefix unmerged_{date-isoformat}.parquet for manual inspection. After successfully storing the merged file, the pipeline deletes the individual files to keep storage clean and avoid unnecessary clutter.
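A condensed sketch of this merge step (same hypothetical bucket as above; listing pagination, the unmerged_ fallback, and other error handling are omitted for brevity):

    import io

    import boto3
    import polars as pl

    s3 = boto3.client("s3")
    BUCKET = "gtfs-rt-archive"  # hypothetical, as in Pipeline 1

    def merge_hour(feed_type: str, day: str, hour: str) -> None:
        prefix = f"{feed_type}/{day}/{hour}/"
        listing = s3.list_objects_v2(Bucket=BUCKET, Prefix=prefix)
        keys = [o["Key"] for o in listing.get("Contents", [])
                if "individual_" in o["Key"]]

        frames = []
        for key in keys:
            body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
            # Recover each record's timestamp from the file name.
            ts = key.rsplit("individual_", 1)[1].removesuffix(".parquet")
            frames.append(pl.read_parquet(io.BytesIO(body))
                            .with_columns(pl.lit(ts).alias("snapshot_time")))

        buffer = io.BytesIO()
        pl.concat(frames, how="diagonal").write_parquet(buffer, compression="zstd")
        s3.put_object(Bucket=BUCKET, Key=f"{prefix}{hour}.parquet",
                      Body=buffer.getvalue())

        # Delete the individual files only after the merged file is safely stored.
        for key in keys:
            s3.delete_object(Bucket=BUCKET, Key=key)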

One critical advantage of converting GTFS-RT data into Parquet early in the process is that it prevents memory overload. If we had to merge raw GTFS-RT files instead of pre-converted Parquet files, we would likely run into memory constraints, especially on standard servers with limited RAM. This makes Parquet not just a storage solution but an enabler of efficient large-scale processing.

Ready for Analytics

In this section, we will explore how to use the GTFS-RT data for public transport analytics. Specifically, we want to compute delays, that is, the difference between the scheduled travel time and the real travel time. 

The previously created Parquet files can be loaded into QGIS as tables without geometries. To turn them into point layers, we use the “Create points layer from table” algorithm from the Processing “Vector creation” group. And once we convert the unix timestamps to datetimes (using the datetime_from_epoch function), we have a point layer that is ready for use in Trajectools.
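For scripted workflows, the same steps can be run from the QGIS Python console. This is only a sketch under several assumptions: the field names reflect our hypothetical unnested schema, the algorithm ID should be verified against your QGIS version, and reading Parquet requires a GDAL build with Parquet support:

    # Run from the QGIS Python console.
    import processing
    from qgis.core import QgsProject

    points = processing.run(
        "native:createpointslayerfromtable",   # verify the exact ID in your Processing toolbox
        {
            "INPUT": "/path/to/hour.parquet",         # placeholder path
            "XFIELD": "vehicle_position_longitude",   # assumed field names
            "YFIELD": "vehicle_position_latitude",
            "TARGET_CRS": "EPSG:4326",
            "OUTPUT": "memory:gtfs_rt_points",
        },
    )["OUTPUT"]
    QgsProject.instance().addMapLayer(points)

    # In the Field Calculator, unix seconds become datetimes with, e.g.:
    #   datetime_from_epoch("vehicle_timestamp" * 1000)
    # (datetime_from_epoch expects milliseconds; GTFS-RT timestamps are seconds)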

Let’s have a look at one bus route. Bus 3 is one of the busiest routes in Riga. We apply a filter to the point layer, which reveals the location of the route.

Computing segment travel times

Computing travel times on public transport segments, i.e. between two scheduled stops, comes with a couple of challenges:

  1. The GTFS-RT location updates are provided in a rather sparse fashion with irregular reporting intervals. It is not clear that we “see” every stop that happens. 
  2. We cannot rely solely on stop detection since, sometimes, a vehicle will not come to a halt at scheduled stop locations (if nobody wants to get off or on).
  3. The stop ID, representing the next stop the vehicle will visit, is not always exact. Updates are often delayed and happen some time after passing the stop. 

Here’s an example visualization of the stop ID information of a single trip of bus 3, overlaid on top of the GTFS route and stops (in red):

To compute the desired delays, we decided to compare GTFS-RT travel times based on the stop ID info with the scheduled travel times. To get the GTFS-RT travel times, we use Trajectools and create trajectories by splitting wherever the stop ID changes, using the Split by value change algorithm:

Computing delays

The final step is to compute travel time differences between schedule and real time. For this, we implemented a SQL join that matches each GTFS-RT trajectory with the corresponding entry in the GTFS schedule, using both route information and temporal information.
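The join might look like the following DuckDB-flavored sketch (the table layout and column names are our assumptions for illustration, not the project’s exact schema):

    import duckdb
    import pandas as pd

    # Tiny stand-in tables: rt_segments from the GTFS-RT trajectories,
    # schedule_segments from the Trajectools "Extract segments" output.
    rt_segments = pd.DataFrame({
        "route_id": ["3"], "segment_id": ["s1"],
        "trip_id": ["AUTO3-18-1-240501-ab-2230"],
        "departure_time": ["22:35"], "travel_time_s": [95],
    })
    schedule_segments = pd.DataFrame({
        "route_id": ["3"], "segment_id": ["s1"],
        "window_start": ["21:00"], "window_end": ["23:00"],
        "scheduled_runtime_s": [80],
    })

    # Match each observed segment to the schedule window (peak / off-peak)
    # that contains its departure time, then compute the delay.
    delays = duckdb.sql("""
        SELECT rt.trip_id, rt.segment_id,
               rt.travel_time_s - s.scheduled_runtime_s AS delay_s
        FROM rt_segments rt
        JOIN schedule_segments s
          ON rt.route_id = s.route_id
         AND rt.segment_id = s.segment_id
         AND rt.departure_time BETWEEN s.window_start AND s.window_end
    """).df()
    print(delays)   # delay_s = 15 -> the vehicle lost 15 s on this segment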

The temporal information is important since the schedule accounts for different travel times during peak hours and off peak: 

This information is extracted from the GTFS schedule using the Trajectools Extract segments algorithm, if we chose the “Add scheduled speeds” option:

This will add the time windows, speeds, and runtimes per segment to the resulting segment layer: 

Joining the GTFS-RT trajectories with the scheduled segment information, we compute delays for every segment and trip. For example, here are the resulting delays for trip ‘AUTO3-18-1-240501-ab-2230’: 

Red lines mark segments where time is lost compared to the schedule, while blue lines indicate that the vehicle traversed the segment faster than the schedule suggested.

What’s next

When interpreting the results, it is important to acknowledge the effects caused by the timing of the next-stop ID updates in the real-time GTFS feed. Sometimes these updates come very late, introducing distortions where one segment’s travel time gets too long and the next one’s too short.

We will continue refining the analytics and related libraries, including the QGIS Trajectools plugin, to facilitate analytics of GTFS-RT & GTFS.

After successful testing of this analytics approach in Riga, we aim to transfer it to other cities. But for this to work, public transport companies need ways to efficiently store their data and, ideally, to release them openly to allow for analysis.

The pipelines we described help keep storage needs low, which drastically reduces costs (a full year amounts to only a few gigabytes, which is inexpensive to keep in S3 storage). Let us know if you would be interested in an online platform where you could register a GTFS-RT feed & GTFS, which would then automatically start being archived (in exchange, the provider would only need to agree to share the archives as open data, at no cost to them).

Learn More

Introducing the new QGIS Plugins Website and QGIS Hub

We are excited to announce the release of the newly updated QGIS Plugins Website and the launch of the QGIS Hub Website! These updates bring a fresh new look that aligns with the QGIS branding overhaul, along with significant improvements in user experience.


Revamped QGIS Plugins Website

QGIS Plugins Homepage


The QGIS Plugins website has undergone a major redesign to enhance usability and provide a seamless experience for users. With a modernised interface and improved navigation, users can now find and manage plugins more efficiently. Some of the key updates include:

  1. A fresh UI that matches the latest QGIS branding.
  2. Enhanced browsing experience with better categorisation of plugins.
  3. Detailed plugin pages showcasing ratings, download counts, and descriptions more clearly.
  4. Improved search and filtering options to find the right plugin quickly.
  5. A more intuitive submission process for plugin developers.


Plugins List with Grid View and a Table View


Plugins search and details page


Introducing QGIS Hub

QGIS Hub Homepage


In addition to the plugins website update, we are thrilled to introduce the QGIS Hub, now available at https://hub.qgis.org. This new platform serves as a dedicated space for sharing QGIS resources such as styles, 3D models, GeoPackage project files, QGIS Layer Definition (QLR) files, and much more. By separating this section into its own website, we have made it easier for users to discover and access valuable resources.


 Key features of the QGIS Hub include:

  1. A visually appealing homepage with featured resources.
  2. A well-organised list view for browsing available assets.
  3. Detailed resource pages with previews and descriptions.
  4. Advanced search functionality to quickly locate specific resources.
  5. A seamless submission process for users who wish to contribute their resources.


Resources list and details page


Thank you to the QGIS Community

QGIS is developed by a team of dedicated volunteers, companies, and organisations. The QGIS project relies on sponsorships and donations for much of its funding. Without the contributions of the QGIS sustaining members, donors, and all the volunteers, these continued improvements would not be possible. At Kartoza we are fortunate to employ both a QGIS Document Writer and a QGIS Developer as full-time staff members, an achievement made possible through donations from the QGIS community. Thank you to Tim Sutton (member of the QGIS Steering Committee) for donating his time and helping make these updates possible.


Experience the New Platforms Today!

We invite you to explore the new QGIS Plugins Website and the QGIS Hub today. These updates are designed to enhance your workflow and make it easier to extend and enrich your QGIS experience. We look forward to your feedback and continued support as we work to improve the QGIS ecosystem!

Learn More

OSM DATA 3D: an overview

This article presents the OSM DATA platform and its new 3D version.
Learn More

Upcoming local FOSS4G in Bulgaria on 7th-8th March

The FOSS4G:BG: Open GIS conference is coming in early March as a local FOSS4G event in Bulgaria, organized by the QGIS.bg community. The event will span two days: one day of workshops with deep dives into different topics, and a second day of conference presentations.
Learn More

The data challenge at the Gard department

How the Gard department leverages its assets of conventional data and geodata through various digital tools.
Learn More

Press review of 21 February 2025

Join the QGIS side of the Force through this press review tinged with geo-lightsabers, mviewer, Python packages from MAIF and INSEE, IPv6, and PostGIS databases rustling and trumpeting away... With external contributions that Geotribu is delighted to welcome!
Learn More

Installing QGIS on Ubuntu with apt

Installing the most widely used open-source GIS software on the most popular Linux distribution should be straightforward, yet it often raises questions and even causes problems. This guide walks you through the process so you can refer back to it whenever needed.
Learn More