
Planet Python

Last update: March 31, 2025 04:42 AM UTC

March 29, 2025


Ned Batchelder

Human sorting improved

When sorting strings, you’d often like the order to make sense to a person. That means numbers need to be treated numerically even if they are in a larger string.

For example, sorting Python versions with the default sort() would give you:

Python 3.10
Python 3.11
Python 3.9

when you want it to be:

Python 3.9
Python 3.10
Python 3.11

I wrote about this long ago (Human sorting), but have continued to tweak the code and needed to add it to a project recently. Here’s the latest:

import re

def human_key(s: str) -> tuple[list[str | int], str]:
    """Turn a string into a sortable value that works how humans expect.

    "z23A" -> (["z", 23, "a"], "z23A")

    The original string is appended as a last value to ensure the
    key is unique enough so that "x1y" and "x001y" can be distinguished.

    """
    def try_int(s: str) -> str | int:
        """If `s` is a number, return an int, else `s` unchanged."""
        try:
            return int(s)
        except ValueError:
            return s

    return ([try_int(c) for c in re.split(r"(\d+)", s.casefold())], s)

def human_sort(strings: list[str]) -> None:
    """Sort a list of strings how humans expect."""
    strings.sort(key=human_key)

The central idea here is to turn a string like "Python 3.9" into a key whose first element is the list ["python ", 3, ".", 9, ""], so that numeric components will be sorted by their numeric value. The re.split() function gives us interleaved words and numbers (plus an empty string when the input ends with digits), and try_int() turns the numbers into actual numbers, giving us sortable key lists.
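As a quick check of the idea described above, here is a small usage sketch. It repeats a compact, unannotated copy of human_key() so that it runs on its own, and the sample list is just the example from the start of this post.

```python
import re


def human_key(s):
    """Compact copy of the key function above, so this sketch runs standalone."""
    def try_int(part):
        try:
            return int(part)
        except ValueError:
            return part

    return ([try_int(c) for c in re.split(r"(\d+)", s.casefold())], s)


versions = ["Python 3.10", "Python 3.11", "Python 3.9"]

default_order = sorted(versions)               # plain string comparison
human_order = sorted(versions, key=human_key)  # numbers compared as numbers

print(default_order)  # ['Python 3.10', 'Python 3.11', 'Python 3.9']
print(human_order)    # ['Python 3.9', 'Python 3.10', 'Python 3.11']

# The key itself: casefolded pieces with ints, plus the original string.
print(human_key("Python 3.9"))  # (['python ', 3, '.', 9, ''], 'Python 3.9')
```

Because re.split() with a capturing group always alternates text and digits, strings land at even positions and ints at odd positions, so two keys never compare a str with an int.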

There are two improvements from the original:

If you are interested, there are many different ways to split the string into the word/number mix. The comments on the old post have many alternatives, and there are certainly more.

This still makes some assumptions about what is wanted, and doesn’t cover all possible options (floats? negative/positive? full file paths?). For those, you probably want the full-featured natsort (natural sort) package.
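To make those assumptions concrete, here is a short sketch with made-up inputs. It again carries its own compact copy of the key function so it can run standalone.

```python
import re


def human_key(s):
    """Compact copy of the key function above, so this sketch runs standalone."""
    def try_int(part):
        try:
            return int(part)
        except ValueError:
            return part

    return ([try_int(c) for c in re.split(r"(\d+)", s.casefold())], s)


# Digits after a "." are compared as whole integers, not decimal fractions.
decimals = sorted(["1.5", "1.10"], key=human_key)
print(decimals)   # ['1.5', '1.10'], although 1.10 < 1.5 as floats

# A leading "-" is just text, so negative numbers sort by magnitude.
negatives = sorted(["-10", "-5"], key=human_key)
print(negatives)  # ['-5', '-10'], although -10 < -5 numerically

# Strings that differ only in zero padding produce equal lists;
# the original string stored in the key breaks the tie.
padded = sorted(["x1y", "x001y"], key=human_key)
print(padded)     # ['x001y', 'x1y']
```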

March 29, 2025 04:59 PM UTC


Python GUIs

PyQt6 Toolbars & Menus — QAction — Defining toolbars, menus, and keyboard shortcuts with QAction

Next, we'll look at some of the common user interface elements you've probably seen in many other applications — toolbars and menus. We'll also explore the neat system Qt provides for minimizing the duplication between different UI areas — QAction.

Basic App

We'll start this tutorial with a simple skeleton application, which we can customize. Save the following code in a file named app.py. It includes all the imports you'll need for the later steps:

python
from PyQt6.QtCore import QSize, Qt
from PyQt6.QtGui import QAction, QIcon, QKeySequence
from PyQt6.QtWidgets import (
    QApplication,
    QCheckBox,
    QLabel,
    QMainWindow,
    QStatusBar,
    QToolBar,
)

class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

app = QApplication([])
window = MainWindow()
window.show()
app.exec()

This file contains the imports and the basic code that you'll use to complete the examples in this tutorial.

If you're migrating to PyQt6 from PyQt5, notice that QAction is now available via the QtGui module.

Toolbars

Toolbars are one of the most commonly seen user interface elements. They are bars of icons and/or text used to perform common tasks within an application, for which access via a menu would be cumbersome. While some complex applications, particularly in the Microsoft Office suite, have migrated to contextual 'ribbon' interfaces, the standard toolbar is usually sufficient for the majority of applications you will create.

Standard GUI elements

Adding a Toolbar

Let's start by adding a toolbar to our application.

In Qt, toolbars are created from the QToolBar class. To start, you create an instance of the class and then call addToolBar on the QMainWindow. Passing a string in as the first argument to QToolBar sets the toolbar's name, which will be used to identify the toolbar in the UI:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        self.addToolBar(toolbar)

    def onMyToolBarButtonClick(self, s):
        print("click", s)

Run it! You'll see a thin grey bar at the top of the window. This is your toolbar. Right-click the name to trigger a context menu and toggle the bar off.

A window with a toolbar.

How can I get my toolbar back? Unfortunately, once you remove a toolbar, there is no longer any place to right-click to re-add it. So, as a general rule, you want to either keep one toolbar non-removable or provide an alternative interface in the menus to turn toolbars on and off.

We should make the toolbar a bit more interesting. We could just add a QPushButton widget, but there is a better approach in Qt that gets you some additional features: QAction. QAction is a class that provides a way to describe abstract user interfaces. In plain terms, you can define multiple interface elements within a single object, unified by the effect that interacting with that element has.

For example, it is common to have functions that are represented in the toolbar and also in the menu. Think of something like Edit->Cut, which is present in the Edit menu, on the toolbar as a pair of scissors, and through the keyboard shortcut Ctrl-X (Cmd-X on Mac).

Without QAction, you would have to define this in multiple places. But with QAction you can define a single QAction, defining the triggered action, and then add this action to both the menu and the toolbar. Each QAction has names, status messages, icons, and signals that you can connect to (and much more).

In the code below, you can see this first QAction added:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        self.addToolBar(toolbar)

        button_action = QAction("Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        toolbar.addAction(button_action)

    def onMyToolBarButtonClick(self, s):
        print("click", s)

To start with, we create the function that will accept the signal from the QAction so we can see if it is working. Next, we define the QAction itself. When creating the instance, we can pass a label for the action and/or an icon. You must also pass in any QObject to act as the parent for the action — here we're passing self as a reference to our main window. Strangely, for QAction the parent element is passed in as the final argument.

Next, we can opt to set a status tip — this text will be displayed on the status bar once we have one. Finally, we connect the triggered signal to the custom function. This signal will fire whenever the QAction is triggered (or activated).

Run it! You should see your button with the label that you have defined. Click on it, and then our custom method will print "click" and the status of the button.

Toolbar showing our QAction button.

Why is the signal always false? The signal passed indicates whether the button is checked, and since our button is not checkable — just clickable — it is always false. We'll show how to make it checkable shortly.

Next, we can add a status bar.

We create a status bar object by calling QStatusBar and then pass it into setStatusBar. Since we don't need to change the status bar settings, we can create it and pass it in on a single line:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        self.addToolBar(toolbar)

        button_action = QAction("Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        toolbar.addAction(button_action)

        self.setStatusBar(QStatusBar(self))

    def onMyToolBarButtonClick(self, s):
        print("click", s)

Run it! Hover your mouse over the toolbar button, and you will see the status text in the status bar.

Status bar text is updated as we hover over the action.

Next, we're going to make our QAction toggleable, so clicking will turn it on and clicking again will turn it off. To do this, we simply call setCheckable(True) on the QAction object:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        self.addToolBar(toolbar)

        button_action = QAction("Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        button_action.setCheckable(True)
        toolbar.addAction(button_action)

        self.setStatusBar(QStatusBar(self))

    def onMyToolBarButtonClick(self, s):
        print("click", s)

Run it! Click on the button to see it toggle from checked to unchecked state. Note that the custom slot method we create now alternates outputting True and False.

The toolbar button toggled on.

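There is also a toggled signal, which is emitted whenever the checked state changes, including when it is changed from code with setChecked(). For a button that is only ever toggled by clicking, the effect is identical to triggered, so it is mostly redundant here.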

Things look pretty shabby right now — so let's add an icon to our button. For this, I recommend you download the fugue icon set by designer Yusuke Kamiyamane. It's a great set of beautiful 16x16 icons that can give your apps a nice professional look. It is freely available with only attribution required when you distribute your application — although I am sure the designer would appreciate some cash too if you have some spare.

Fugue Icon Set by Yusuke Kamiyamane

Select an image from the set (in the examples here, I've selected the file bug.png) and copy it into the same folder as your source code.

We can create a QIcon object by passing the file name to the class, e.g. QIcon("bug.png") -- if you place the file in another folder, you will need a full relative or absolute path to it.

Finally, to add the icon to the QAction (and therefore the button), we simply pass it in as the first argument when creating the QAction.

You also need to let the toolbar know how large your icons are. Otherwise, your icon will be surrounded by a lot of padding. You can do this by calling setIconSize() with a QSize object:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        toolbar.setIconSize(QSize(16, 16))
        self.addToolBar(toolbar)

        button_action = QAction(QIcon("bug.png"), "Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        button_action.setCheckable(True)
        toolbar.addAction(button_action)

        self.setStatusBar(QStatusBar(self))

    def onMyToolBarButtonClick(self, s):
        print("click", s)

Run it! The QAction is now represented by an icon. Everything should work exactly as it did before.

Our action button with an icon.

Note that, according to the Qt documentation, toolbar buttons default to Qt.ToolButtonStyle.ToolButtonIconOnly, so an action with an icon shows only that icon. You can override this by using setToolButtonStyle(). This slot accepts any of the following flags from the Qt namespace:

Flag                                         Behavior
Qt.ToolButtonStyle.ToolButtonIconOnly        Icon only, no text
Qt.ToolButtonStyle.ToolButtonTextOnly        Text only, no icon
Qt.ToolButtonStyle.ToolButtonTextBesideIcon  Icon and text, with text beside the icon
Qt.ToolButtonStyle.ToolButtonTextUnderIcon   Icon and text, with text under the icon
Qt.ToolButtonStyle.ToolButtonFollowStyle     Follow the host desktop style

Setting Qt.ToolButtonStyle.ToolButtonFollowStyle makes your application follow the standard/global setting for the desktop on which the application runs. This is generally recommended to make your application feel as native as possible.

Finally, we can add a few more bits and bobs to the toolbar. We'll add a second button and a checkbox widget. You can put any widget in here, so feel free to go crazy:

python
from PyQt6.QtCore import QSize, Qt
from PyQt6.QtGui import QAction, QIcon
from PyQt6.QtWidgets import (
    QApplication,
    QCheckBox,
    QLabel,
    QMainWindow,
    QStatusBar,
    QToolBar,
)

class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        toolbar.setIconSize(QSize(16, 16))
        self.addToolBar(toolbar)

        button_action = QAction(QIcon("bug.png"), "&Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        button_action.setCheckable(True)
        toolbar.addAction(button_action)

        toolbar.addSeparator()

        button_action2 = QAction(QIcon("bug.png"), "Your &button2", self)
        button_action2.setStatusTip("This is your button2")
        button_action2.triggered.connect(self.onMyToolBarButtonClick)
        button_action2.setCheckable(True)
        toolbar.addAction(button_action2)

        toolbar.addWidget(QLabel("Hello"))
        toolbar.addWidget(QCheckBox())

        self.setStatusBar(QStatusBar(self))

    def onMyToolBarButtonClick(self, s):
        print("click", s)

app = QApplication([])
window = MainWindow()
window.show()
app.exec()

Run it! Now you see multiple buttons and a checkbox.

Toolbar with an action and two widgets.

Menus are another standard component of UIs. Typically, they are at the top of the window or the top of a screen on macOS. They allow you to access all standard application functions. A few standard menus exist — for example File, Edit, Help. Menus can be nested to create hierarchical trees of functions, and they often support and display keyboard shortcuts for fast access to their functions.

Standard GUI elements - Menus

Adding a Menu

To create a menu, we first get the menu bar by calling menuBar() on the QMainWindow. We add a menu to our menu bar by calling addMenu(), passing in the name of the menu. I've called it '&File'. The ampersand defines a quick key to jump to this menu when pressing Alt.

This won't be visible on macOS. Note that this is different from a keyboard shortcut — we'll cover that shortly.

This is where the power of actions comes into play. We can reuse the already existing QAction to add the same function to the menu. To add an action, you call addAction() passing in one of our defined actions:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        toolbar.setIconSize(QSize(16, 16))
        self.addToolBar(toolbar)

        button_action = QAction(QIcon("bug.png"), "&Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        button_action.setCheckable(True)
        toolbar.addAction(button_action)

        toolbar.addSeparator()

        button_action2 = QAction(QIcon("bug.png"), "Your &button2", self)
        button_action2.setStatusTip("This is your button2")
        button_action2.triggered.connect(self.onMyToolBarButtonClick)
        button_action2.setCheckable(True)
        toolbar.addAction(button_action2)

        toolbar.addWidget(QLabel("Hello"))
        toolbar.addWidget(QCheckBox())

        self.setStatusBar(QStatusBar(self))

        menu = self.menuBar()

        file_menu = menu.addMenu("&File")
        file_menu.addAction(button_action)

    def onMyToolBarButtonClick(self, s):
        print("click", s)

Run it! Click the item in the menu, and you will notice that it is toggleable — it inherits the features of the QAction.

Menu shown on the window -- on macOS this will be at the top of the screen.

Let's add some more things to the menu. Here, we'll add a separator to the menu, which will appear as a horizontal line in the menu, and then add the second QAction we created:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        toolbar.setIconSize(QSize(16, 16))
        self.addToolBar(toolbar)

        button_action = QAction(QIcon("bug.png"), "&Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        button_action.setCheckable(True)
        toolbar.addAction(button_action)

        toolbar.addSeparator()

        button_action2 = QAction(QIcon("bug.png"), "Your &button2", self)
        button_action2.setStatusTip("This is your button2")
        button_action2.triggered.connect(self.onMyToolBarButtonClick)
        button_action2.setCheckable(True)
        toolbar.addAction(button_action2)

        toolbar.addWidget(QLabel("Hello"))
        toolbar.addWidget(QCheckBox())

        self.setStatusBar(QStatusBar(self))

        menu = self.menuBar()

        file_menu = menu.addMenu("&File")
        file_menu.addAction(button_action)
        file_menu.addSeparator()
        file_menu.addAction(button_action2)

    def onMyToolBarButtonClick(self, s):
        print("click", s)

Run it! You should see two menu items with a line between them.

Our actions showing in the menu.

You can also use ampersand to add accelerator keys to the menu to allow a single key to be used to jump to a menu item when it is open. Again this doesn't work on macOS.

To add a submenu, you simply create a new menu by calling addMenu() on the parent menu. You can then add actions to it as usual. For example:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        toolbar.setIconSize(QSize(16, 16))
        self.addToolBar(toolbar)

        button_action = QAction(QIcon("bug.png"), "&Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        button_action.setCheckable(True)
        toolbar.addAction(button_action)

        toolbar.addSeparator()

        button_action2 = QAction(QIcon("bug.png"), "Your &button2", self)
        button_action2.setStatusTip("This is your button2")
        button_action2.triggered.connect(self.onMyToolBarButtonClick)
        button_action2.setCheckable(True)
        toolbar.addAction(button_action2)

        toolbar.addWidget(QLabel("Hello"))
        toolbar.addWidget(QCheckBox())

        self.setStatusBar(QStatusBar(self))

        menu = self.menuBar()

        file_menu = menu.addMenu("&File")
        file_menu.addAction(button_action)
        file_menu.addSeparator()

        file_submenu = file_menu.addMenu("Submenu")
        file_submenu.addAction(button_action2)

    def onMyToolBarButtonClick(self, s):
        print("click", s)

Run it! You will see a nested menu in the File menu.

Submenu nested in the File menu.

Finally, we'll add a keyboard shortcut to the QAction. You define a keyboard shortcut by calling setShortcut() and passing in the key sequence. Any defined key sequences will appear in the menu.

Note that the keyboard shortcut is associated with the QAction and will still work whether or not the QAction is added to a menu or a toolbar.

Key sequences can be defined in multiple ways: by passing them as text, by using key names from the Qt namespace, or by using the standard key sequences defined on QKeySequence. Use the latter wherever you can to ensure compliance with the operating system standards.

The completed code, showing the toolbar buttons and menus, is shown below:

python
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My App")

        label = QLabel("Hello!")

        # The `Qt` namespace has a lot of attributes to customize
        # widgets. See: http://doc.qt.io/qt-6/qt.html
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)

        # Set the central widget of the Window. Widget will expand
        # to take up all the space in the window by default.
        self.setCentralWidget(label)

        toolbar = QToolBar("My main toolbar")
        toolbar.setIconSize(QSize(16, 16))
        self.addToolBar(toolbar)

        button_action = QAction(QIcon("bug.png"), "&Your button", self)
        button_action.setStatusTip("This is your button")
        button_action.triggered.connect(self.onMyToolBarButtonClick)
        button_action.setCheckable(True)
        # You can enter keyboard shortcuts using key names (e.g. "Ctrl+p"),
        # Qt namespace identifiers (e.g. Qt.Modifier.CTRL combined with Qt.Key.Key_P),
        # or system-agnostic identifiers (e.g. QKeySequence.StandardKey.Print)
        button_action.setShortcut(QKeySequence("Ctrl+p"))
        toolbar.addAction(button_action)

        toolbar.addSeparator()

        button_action2 = QAction(QIcon("bug.png"), "Your &button2", self)
        button_action2.setStatusTip("This is your button2")
        button_action2.triggered.connect(self.onMyToolBarButtonClick)
        button_action2.setCheckable(True)
        toolbar.addAction(button_action2)

        toolbar.addWidget(QLabel("Hello"))
        toolbar.addWidget(QCheckBox())

        self.setStatusBar(QStatusBar(self))

        menu = self.menuBar()

        file_menu = menu.addMenu("&File")
        file_menu.addAction(button_action)

        file_menu.addSeparator()

        file_submenu = file_menu.addMenu("Submenu")

        file_submenu.addAction(button_action2)

    def onMyToolBarButtonClick(self, s):
        print("click", s)

Experiment with building your own menus using QAction and QMenu.

March 29, 2025 06:00 AM UTC

March 28, 2025


Robin Wilson

Learning resources for GIS in Python with cloud-native geospatial, PostGIS and more

I recently gave a careers talk to students at Solent University, and through that I got to know an MSc student there who had previous GIS experience and was now doing a Data Analytics and AI MSc course. Her GIS experience was mostly in the ESRI stack (ArcGIS and related tools) and she was keen to learn other tools and how to combine her new Python and data knowledge with her previous GIS knowledge. I wrote her a long email with links to loads of resources and, with her permission, I’m publishing it here as it may be useful to others. The general focus is on the tools I use, which are mostly Python-focused, but also on becoming familiar with a range of tools rather than using tools from just one ecosystem (like ESRI). I hope it is useful to you.

Tools to investigate:

Python libraries to investigate:

Cloud Native Geospatial
There’s a good ‘zine’ that explains the basics behind cloud-native geospatial – see https://zines.developmentseed.org/zines/cloud-native/. Understanding the basics of the topics in there would be good. There are loads of good tutorials online for using STAC catalogues, COG files and so on. See https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/ and https://planetarycomputer.microsoft.com/docs/tutorials/cloudless-mosaic-sentinel2/ and https://github.com/microsoft/PlanetaryComputerExamples/blob/main/tutorials/surface_analysis.ipynb

My Blog
You can subscribe via email on the left-hand side at the bottom of the sidebar
Relevant posts:

Conference talks
These can be a really good way to get a brief intro to a topic, to know where to delve in deeper later. I often put them on and half-listen while I’m doing something else, and then switch to focusing on them fully if they get particularly interesting. There are loads of links here, don’t feel like you have to look at them all!

PostGIS Day conference: https://www.crunchydata.com/blog/postgis-day-2024-summary
Particularly relevant talks:

FOSS4G UK conference last year in Bristol: https://uk.osgeo.org/foss4guk2024/bristol.html
Most relevant talks for you are the following (just the slides):

FOSS4G conference YouTube videos: https://www.youtube.com/@FOSS4G/videos – they have a load of ones from 2022 at the top for some reason, but if you scroll down a long way you can find 2023 and 2024 stuff. Actually, better is to use this playlist of talks from the 2023 global conference: https://www.youtube.com/playlist?list=PLqa06jy1NEM2Kna9Gt_LDKZHv1dl4xUoZ
Here’s a few talks that might be particularly interesting/relevant to you, in no particular order

Suggestions for learning projects/tasks
(These are quite closely related to the MSc project that this student might be doing, but are probably useful for people generally)
I know when you’re starting off it is hard to work out what sort of things to do to develop your skills. One thing that is really useful is to become a bit of a ‘tool polyglot’, so you can do the same task in various tools depending on what makes sense in the circumstances.

I’ve listed a couple of tasks below. I’d suggest trying to complete them in a few ways:

  1. Using QGIS and clicking around in the GUI
  2. Using Python libraries like geopandas, rasterio and so on
  3. Using PostGIS
  4. (Possibly – not essential) Using the QGIS command-line, or model builder or similar

Task 1 – Flood risk

  1. Download the ‘Flood Zone 2’ flood risk data from https://environment.data.gov.uk/explore/86ec354f-d465-11e4-b09e-f0def148f590?download=true for a particular area (maybe the whole of Southampton?)
  2. Download OS data on buildings from this page – https://automaticknowledge.org/gb/ – you can download it for a specific local authority area
  3. Find all buildings at risk of flooding, and provide a count of buildings at risk and a map of buildings at risk (static map or web map)
  4. Extension task: also provide a total ground area of buildings at risk
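Before reaching for a library, it can help to see the logic of Task 1 on toy data. The sketch below is pure Python and makes several simplifying assumptions that are not part of the task: the coordinates are made up, building footprints are rectangles, and a building counts as "at risk" when its centre falls inside the flood polygon. A real solution would instead test whether the full footprints intersect, for example with a spatial join in geopandas or ST_Intersects in PostGIS.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is (x, y) inside the polygon [(x, y), ...]?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up data in arbitrary units: one flood polygon and three
# rectangular building footprints as (xmin, ymin, xmax, ymax).
flood_zone = [(0, 0), (10, 0), (10, 5), (0, 5)]
buildings = {
    "A": (1, 1, 3, 3),
    "B": (6, 1, 9, 4),
    "C": (20, 20, 22, 22),
}

at_risk = []
total_area = 0
for name, (xmin, ymin, xmax, ymax) in buildings.items():
    centre = ((xmin + xmax) / 2, (ymin + ymax) / 2)
    if point_in_polygon(centre, flood_zone):
        at_risk.append(name)
        total_area += (xmax - xmin) * (ymax - ymin)

print(len(at_risk), at_risk, total_area)  # 2 ['A', 'B'] 13
```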

Task 2 – Elevation data
(Don’t do this with PostGIS as its raster functionality isn’t great, but you could probably do all of this with GDAL command-line tools if you wanted)

  1. Download Digital Terrain Model data from https://environment.data.gov.uk/survey – download multiple tiles
  2. Mosaic the tiles together into one large image file
  3. Do some basic processing on the DEM data. For example, try:
    a) Subtracting the minimum value, so the lowest elevation comes out as a value of zero
    b) Running a smoothing algorithm across the DEM to remove noise
  4. Produce a map – either static or web map
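The two processing steps in Task 2 can be illustrated on a tiny made-up grid. This is only a pure-Python sketch of what the operations do, with a simple 3x3 mean filter standing in for "a smoothing algorithm". For real DEM tiles you would use array-based tools such as rasterio or the GDAL command-line tools.

```python
def subtract_minimum(grid):
    """Shift elevations so the lowest cell becomes zero."""
    lowest = min(min(row) for row in grid)
    return [[value - lowest for value in row] for row in grid]


def mean_smooth(grid):
    """Replace each cell with the mean of its 3x3 neighbourhood.

    Cells on the edge use only the neighbours that exist.
    """
    rows, cols = len(grid), len(grid[0])
    smoothed = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            new_row.append(sum(window) / len(window))
        smoothed.append(new_row)
    return smoothed


# Made-up 3x3 "DEM" with a single noisy spike in the middle.
dem = [
    [12, 12, 12],
    [12, 21, 12],
    [12, 12, 12],
]

shifted = subtract_minimum(dem)
smoothed = mean_smooth(shifted)

print(shifted)   # [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(smoothed)  # [[2.25, 1.5, 2.25], [1.5, 1.0, 1.5], [2.25, 1.5, 2.25]]
```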

March 28, 2025 07:34 PM UTC

March 27, 2025


Test and Code

pytest-html - a plugin that generates HTML reports for test results

pytest-html has got to be one of my all-time favorite plugins.
pytest-html is a plugin for pytest that generates an HTML report for test results.
This episode digs into some of the super coolness of pytest-html.


pytest-html: https://pytest-html.readthedocs.io/
repo readme with screenshot: https://github.com/pytest-dev/pytest-html/blob/master/README.rst
enhancing reports: https://pytest-html.readthedocs.io/en/latest/user_guide.html#enhancing-reports
pytest-metadata: https://github.com/pytest-dev/pytest-metadata/tree/master

Sponsored by:
The Complete pytest course is now a bundle, with each part available separately.
pytest Primary Power (https://courses.pythontest.com/pytest-primary-power) teaches the super powers of pytest that you need to learn to use pytest effectively.
Using pytest with Projects (https://courses.pythontest.com/using-pytest-with-projects) has lots of "when you need it" sections like debugging failed tests, mocking, testing strategy, and CI.
Then pytest Booster Rockets (https://courses.pythontest.com/pytest-booster-rockets) can help with advanced parametrization and building plugins.
Whether you need to get started with pytest today, or want to power up your pytest skills, PythonTest (https://courses.pythontest.com) has a course for you.

★ Support this podcast on Patreon ★ (https://www.patreon.com/c/testpodcast)

March 27, 2025 06:16 PM UTC


Python Anywhere

innit: a new system image, with Python 3.13 and Ubuntu 22.04

If you signed up for an account on PythonAnywhere after 25 March 2025, you’ll have Python versions 3.11, 3.12 and 3.13 available. Additionally, the underlying operating system for your account will be Ubuntu 22.04, rather than the 20.04 used by older accounts.

If you signed up before that date, you’ll be on an older “system image” – essentially the version of the operating system and the set of installed packages that you have access to. You can switch to the new system image from the “Account” page, but you may need to make changes to your code and/or virtualenvs to make everything work – there’s more information on that page.

This post has more details on what’s new in the “innit” system image. There’s a lot!

March 27, 2025 01:00 PM UTC


Real Python

Quiz: Using Python's .__dict__ to Work With Attributes

In this quiz, you’ll test your understanding of Using Python’s .__dict__ to Work With Attributes.

By working through this quiz, you’ll revisit how .__dict__ holds an object’s writable attributes, allowing for dynamic manipulation and introspection. You’ll also review how both vars() and .__dict__ let you inspect an object’s attributes, and the common use cases of .__dict__ in Python applications.
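As a small refresher on those points, here is a sketch using a made-up Point class (not taken from the quiz or the tutorial) that shows instance attributes living in .__dict__, vars() returning that same dictionary, and attributes being added dynamically.

```python
class Point:
    """Made-up class for illustration."""

    scale = 1  # class attribute: stored on the class, not the instance

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(p.__dict__)              # {'x': 1, 'y': 2}
print(vars(p) is p.__dict__)   # True: vars() returns the same dict

# Writing to the dict creates or updates attributes dynamically.
p.__dict__["z"] = 3
print(p.z)                     # 3

# Class attributes are not in the instance's __dict__.
print("scale" in p.__dict__)   # False
print("scale" in vars(Point))  # True
```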


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

March 27, 2025 12:00 PM UTC


Eli Bendersky

Notes on implementing Attention

Some notes on implementing attention blocks in pure Python + Numpy. The focus here is on the exact implementation in code, explaining all the shapes throughout the process. The motivation for why attention works is not covered here - there are plenty of excellent online resources explaining it.

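Several papers are mentioned throughout the code; they are:

  • AIAYN: "Attention Is All You Need" (Vaswani et al.)
  • GPT-3: "Language Models are Few-Shot Learners" (Brown et al.)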

Basic scaled self-attention

We'll start with the most basic scaled dot product self-attention, working on a single sequence of tokens, without masking.

The input is a 2D array of shape (N, D). N is the length of the sequence (how many tokens it contains) and D is the embedding depth - the length of the embedding vector representing each token [1]. D could be something like 512, or more, depending on the model.

input array N by D

A self-attention module is parameterized with three weight matrices, Wk, Wq and Wv. Some variants also have accompanying bias vectors, but the AIAYN paper doesn't use them, so I'll skip them here. In the general case, the shape of each weight matrix is (D, HS), where HS is some fraction of D. HS stands for "head size" and we'll see what this means soon. This is a diagram of a self-attention module (the diagram assumes N=6, D is some large number and so is HS). In the diagram, @ stands for matrix multiplication (Python/Numpy syntax):

schematic of a single attention head

Here's a basic Numpy implementation of this:

import numpy as np

# self_attention the way it happens in the Transformer model. No bias.
# D = model dimension/depth (length of embedding)
# N = input sequence length
# HS = head size
#
# x is the input (N, D), each token in a row.
# Each of W* is a weight matrix of shape (D, HS)
# The result is (N, HS)
def self_attention(x, Wk, Wq, Wv):
    # Each of these is (N, D) @ (D, HS) = (N, HS)
    q = x @ Wq
    k = x @ Wk
    v = x @ Wv

    # kq: (N, N) matrix of dot products between each pair of q and k vectors.
    # The division by sqrt(HS) is the scaling.
    kq = q @ k.T / np.sqrt(k.shape[1])

    # att: (N, N) attention matrix. The rows become the weights that sum
    # to 1 for each output vector.
    att = softmax_lastdim(kq)
    return att @ v  # (N, HS)

The "scaled" part is just dividing kq by the square root of HS, which is done to keep the values of the dot products manageable (otherwise they would grow with the size of the contracted dimension).

The only dependency is a function for calculating Softmax across the last dimension of an input array:

def softmax_lastdim(x):
    """Compute softmax across last dimension of x.

    x is an arbitrary array with at least two dimensions. The returned array has
    the same shape as x, but its elements sum up to 1 across the last dimension.
    """
    # Subtract the max for numerical stability
    ex = np.exp(x - np.max(x, axis=-1, keepdims=True))
    # Divide by sums across last dimension
    return ex / np.sum(ex, axis=-1, keepdims=True)

When the input is 2D, the "last dimension" is the columns. Colloquially, this Softmax function acts on each row of x separately; it applies the Softmax formula to the elements (columns) of the row, ending up with a row of numbers in the range [0,1] that all sum up to 1.
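As a quick sanity check of this behavior, here is a small sketch. The input values are arbitrary and chosen only for illustration; the second row uses large numbers to show that subtracting the max keeps the exponentials from overflowing.

```python
import numpy as np


def softmax_lastdim(x):
    # Same function as above, repeated so this snippet runs on its own.
    ex = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return ex / np.sum(ex, axis=-1, keepdims=True)


# Arbitrary illustrative input: two rows, three columns.
x = np.array([[1.0, 2.0, 3.0],
              [1000.0, 1000.0, 1000.0]])
s = softmax_lastdim(x)

print(s.shape)         # same shape as x
print(s.sum(axis=-1))  # each row sums to 1, up to floating point error
print(s[1])            # equal inputs get equal weights
```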

Another note on the dimensions: it's possible for the Wv matrix to have a different second dimension from Wq and Wk. If you look at the diagram, you can see this will work out, since the softmax produces (N, N), and whatever the second dimension of V is, will be the second dimension of the output. The AIAYN paper designates these dimensions as d_k and d_v, but in practice d_k=d_v in all the variants it lists. I found that these dimensions are typically the same in other papers as well. Therefore, for simplicity I just made them all equal to HS in this post; if desired, a variant with different d_k and d_v is a fairly trivial modification to this code.
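For the curious, here's a rough sketch of what such a variant could look like. The function name self_attention_dk_dv and the sizes below are my own choices for illustration, not something taken from the papers; the only change from self_attention is that Wv may have a different second dimension.

```python
import numpy as np


def softmax_lastdim(x):
    # Repeated here so the snippet is self-contained.
    ex = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return ex / np.sum(ex, axis=-1, keepdims=True)


# Hypothetical variant of self_attention: Wq and Wk are (D, d_k),
# while Wv is (D, d_v), so the output is (N, d_v).
def self_attention_dk_dv(x, Wk, Wq, Wv):
    q = x @ Wq  # (N, d_k)
    k = x @ Wk  # (N, d_k)
    v = x @ Wv  # (N, d_v)
    kq = q @ k.T / np.sqrt(k.shape[-1])  # (N, N), scaled by sqrt(d_k)
    att = softmax_lastdim(kq)            # (N, N)
    return att @ v                       # (N, d_v)


rng = np.random.default_rng(0)
N, D, d_k, d_v = 6, 16, 4, 8  # illustrative sizes only
x = rng.standard_normal((N, D))
out = self_attention_dk_dv(
    x,
    rng.standard_normal((D, d_k)),  # Wk
    rng.standard_normal((D, d_k)),  # Wq
    rng.standard_normal((D, d_v)),  # Wv
)
print(out.shape)  # (6, 8)
```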

Batched self-attention

In the real world, the input array is unlikely to be 2D because models are trained on batches of input sequences. To leverage the parallelism of modern hardware, whole batches are typically processed in the same operation.

input array (B, N, D)

The batched version of scaled self-attention is very similar to the non-batched one, due to the magic of Numpy matrix multiplication and broadcasts. Now the input shape is (B, N, D), where B is the batch dimension. The W* matrices are still (D, HS); multiplying a (B, N, D) array by (D, HS) performs contraction between the last axis of the first array and the first axis of the second array, resulting in (B, N, HS). Here's the code, with the dimensions annotated for each operation:

# self_attention with inputs that have a batch dimension.
# x has shape (B, N, D)
# Each of W* has shape (D, HS)
def self_attention_batched(x, Wk, Wq, Wv):
    q = x @ Wq  # (B, N, HS)
    k = x @ Wk  # (B, N, HS)
    v = x @ Wv  # (B, N, HS)

    kq = q @ k.swapaxes(-2, -1) / np.sqrt(k.shape[-1])  # (B, N, N)

    att = softmax_lastdim(kq)  # (B, N, N)
    return att @ v  # (B, N, HS)

Note that the only difference between this and the non-batched version is the line calculating kq:

  • Since k is no longer 2D, the notion of "transpose" is ambiguous so we explicitly ask to swap the last and the penultimate axis, leaving the first axis (B) intact.
  • When calculating the scaling factor we use k.shape[-1] to select the last dimension of k, instead of k.shape[1] which only selects the last dimension for 2D arrays.

In fact, this function could also calculate the non-batched version! From now on, we'll assume that all inputs are batched, and all operations are implicitly batched. I'm not going to be using the "batched" prefix or suffix on functions any more.
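To back up the claim that the batched function also handles 2D inputs, here's a small check with random data (the sizes are arbitrary). It runs the same function once on the whole batch and once per sequence, and compares the results.

```python
import numpy as np


def softmax_lastdim(x):
    ex = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return ex / np.sum(ex, axis=-1, keepdims=True)


def self_attention_batched(x, Wk, Wq, Wv):
    q = x @ Wq
    k = x @ Wk
    v = x @ Wv
    kq = q @ k.swapaxes(-2, -1) / np.sqrt(k.shape[-1])
    att = softmax_lastdim(kq)
    return att @ v


rng = np.random.default_rng(0)
B, N, D, HS = 3, 5, 8, 4  # arbitrary sizes for the check
x = rng.standard_normal((B, N, D))
Wk, Wq, Wv = (rng.standard_normal((D, HS)) for _ in range(3))

# Whole batch at once: (B, N, HS)
batched = self_attention_batched(x, Wk, Wq, Wv)

# One 2D (N, D) sequence at a time, stacked back into (B, N, HS)
one_by_one = np.stack(
    [self_attention_batched(x[b], Wk, Wq, Wv) for b in range(B)]
)

print(batched.shape)
print(np.allclose(batched, one_by_one))
```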

The basic underlying idea of the attention module is to shift around the multi-dimensional representations of tokens in the sequence towards a better representation of the entire sequence. The tokens attend to each other. Specifically, the matrix produced by the Softmax operation is called the attention matrix. It's (N, N); for each token it specifies how much information from every other token in the sequence should be taken into account. For example, a higher number in cell (R, C) means that there's a stronger relation of token at index R in the sequence to the token at index C.

Here's a nice example from the AIAYN paper, showing a word sequence and the weights produced by two attention heads (purple and brown) for a given position in the input sequence:

attention paper screenshot showing learned attention

This shows how the model is learning to resolve what the word "its" refers to in the sentence. Let's take just the purple head as an example. The index of token "its" in the sequence is 8, and the index of "Law" is 1. In the attention matrix for this head, the value at index (8, 1) will be very high (close to 1), with other values in the same row much lower.

While this intuitive explanation isn't critical to understand how attention is implemented, it will become more important when we talk about masked self-attention later on.

Multi-head attention

The attention mechanism we've seen so far has a single set of K, Q and V matrices. This is called one "head" of attention. In today's models, there are typically multiple heads. Each head does its attention job separately, and in the end all these results are concatenated and fed through a linear layer.

In what follows, NH is the number of heads and HS is the head size. Typically, NH times HS would be D; for example, the AIAYN paper mentions several configurations for D=512: NH=8 and HS=64, NH=32 and HS=16, and so on [2]. However, the math works out even if this isn't the case, because the final linear ("projection") layer maps the output back to (N, D).

Assuming the previous diagram showing a self-attention module is a single head with input (N, D) and output (N, HS), this is how multiple heads are combined:

schematic of multiple attention heads

Each of the (NH) heads has its own parameter weights for Q, K and V. Each attention head outputs a (N, HS) matrix; these are concatenated along the last dimension to (N, NH * HS), which is passed through a final linear projection.

Here's a function implementing (batched) multi-head attention; for now, please ignore the code inside do_mask conditions:

# x has shape (B, N, D)
# In what follows:
#   NH = number of heads
#   HS = head size
# Each W*s is a list of NH weight matrices of shape (D, HS).
# Wp is a weight matrix for the final linear projection, of shape (NH * HS, D)
# The result is (B, N, D)
# If do_mask is True, each attention head is masked from attending to future
# tokens.
def multihead_attention_list(x, Wqs, Wks, Wvs, Wp, do_mask=False):
    # Check shapes.
    NH = len(Wks)
    HS = Wks[0].shape[1]
    assert len(Wks) == len(Wqs) == len(Wvs)
    for W in Wqs + Wks + Wvs:
        assert W.shape[1] == HS
    assert Wp.shape[0] == NH * HS

    # List of head outputs
    head_outs = []

    if do_mask:
        # mask is a lower-triangular (N, N) matrix, with zeros above
        # the diagonal and ones on the diagonal and below.
        N = x.shape[1]
        mask = np.tril(np.ones((N, N)))

    for Wk, Wq, Wv in zip(Wks, Wqs, Wvs):
        # Calculate self attention for each head separately
        q = x @ Wq  # (B, N, HS)
        k = x @ Wk  # (B, N, HS)
        v = x @ Wv  # (B, N, HS)

        kq = q @ k.swapaxes(-2, -1) / np.sqrt(k.shape[-1])  # (B, N, N)

        if do_mask:
            # Set the masked positions to -inf, to ensure that a token isn't
            # affected by tokens that come after it in the softmax.
            kq = np.where(mask == 0, -np.inf, kq)

        att = softmax_lastdim(kq)  # (B, N, N)
        head_outs.append(att @ v)  # (B, N, HS)

    # Concatenate the head outputs and apply the final linear projection
    all_heads = np.concatenate(head_outs, axis=-1)  # (B, N, NH * HS)
    return all_heads @ Wp  # (B, N, D)

It is possible to vectorize this code even further; you'll sometimes see the heads laid out in a separate (4th) dimension instead of being a list. See the Vectorizing across the heads dimension section.

Masked (or Causal) self-attention

Attention modules can be used in both encoder and decoder blocks. Encoder blocks are useful for things like language understanding or translation; for these, it makes sense for each token to attend to all the other tokens in the sequence.

However, for generative models this presents a problem: if during training a word attends to future words, the model will just "cheat" and not really learn how to generate the next word from only past words. Generation is done in a decoder block, and for this we need to add masking to attention.

Conceptually, masking is very simple. Consider the sentence:

People like watching funny cat videos

When our attention code generates the att matrix, it's a square (N, N) matrix with attention weights from each token to each other token in the sequence:

attention masking

What we want is for all the gray cells in this matrix to be zero, to ensure that a token doesn't attend to future tokens. The blue cells in the matrix add up to 1 in each row, after the softmax operation.

Now take a look at the previous code sample and see what happens when do_mask=True:

  1. First, a (N, N) lower-triangular array is prepared with zeros above the diagonal and ones on the diagonal and below.
  2. Then, before we pass the scaled QK^T to softmax, we set its values to -∞ wherever the mask matrix is 0. This ensures that the softmax function will assign zeros to outputs at these indices, while still producing the proper values in the rest of the row.

Another name for masked self-attention is causal self-attention. This is a very good name that comes from causal systems in control theory.
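The two masking steps can be seen in isolation on a tiny example. This is only a sketch: the kq values below are made-up stand-ins for real scaled scores, and N=4 is chosen to keep the output readable.

```python
import numpy as np


def softmax_lastdim(x):
    ex = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return ex / np.sum(ex, axis=-1, keepdims=True)


N = 4
# Made-up stand-in for the scaled scores.
kq = np.arange(N * N, dtype=float).reshape(N, N)

# Step 1: lower-triangular mask, ones on and below the diagonal.
mask = np.tril(np.ones((N, N)))

# Step 2: -inf wherever the mask is 0, then softmax over each row.
masked = np.where(mask == 0, -np.inf, kq)
att = softmax_lastdim(masked)

# Entries above the diagonal are exactly 0; each row still sums to 1.
print(np.round(att, 3))
```

Since the diagonal is never masked, every row keeps at least one finite value, so the max subtraction inside the softmax stays well defined.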

Cross-attention

So far we've been working with self-attention blocks, where the self suggests that elements in the input sequence attend to other elements in the same input sequence.

Another variant of attention is cross-attention, where elements of one sequence attend to elements in another sequence. This variant exists in the decoder block of the AIAYN paper. This is a single head of cross-attention:

cross-attention with different Nq, Nv

Here we have two sequences with potentially different lengths: xq and xv. xq is used for the query part of attention, while xv is used for the key and value parts. The rest of the dimensions remain as before. The output of such a block is shaped (Nq, HS).

This is an implementation of multi-head cross-attention; it doesn't include masking, since masking is not typically necessary in cross attention - it's OK for elements of xq to attend to all elements of xv [3]:

# Cross attention between two input sequences that can have different lengths.
# xq has shape (B, Nq, D)
# xv has shape (B, Nv, D)
# In what follows:
#   NH = number of heads
#   HS = head size
# Each W*s is a list of NH weight matrices of shape (D, HS).
# Wp is a weight matrix for the final linear projection, of shape (NH * HS, D)
# The result is (B, Nq, D)
def multihead_cross_attention_list(xq, xv, Wqs, Wks, Wvs, Wp):
    # Check shapes.
    NH = len(Wks)
    HS = Wks[0].shape[1]
    assert len(Wks) == len(Wqs) == len(Wvs)
    for W in Wqs + Wks + Wvs:
        assert W.shape[1] == HS
    assert Wp.shape[0] == NH * HS

    # List of head outputs
    head_outs = []

    for Wk, Wq, Wv in zip(Wks, Wqs, Wvs):
        q = xq @ Wq  # (B, Nq, HS)
        k = xv @ Wk  # (B, Nv, HS)
        v = xv @ Wv  # (B, Nv, HS)

        kq = q @ k.swapaxes(-2, -1) / np.sqrt(k.shape[-1])  # (B, Nq, Nv)

        att = softmax_lastdim(kq)  # (B, Nq, Nv)
        head_outs.append(att @ v)  # (B, Nq, HS)

    # Concatenate the head outputs and apply the final linear projection
    all_heads = np.concatenate(head_outs, axis=-1)  # (B, Nq, NH * HS)
    return all_heads @ Wp  # (B, Nq, D)

Vectorizing across the heads dimension

The multihead_attention_list implementation shown above uses lists of weight matrices as input. While this makes the code clearer, it's not a particularly friendly format for an optimized implementation - especially on accelerators like GPUs and TPUs. We can vectorize it further by creating a new dimension for attention heads.

To understand the trick being used, consider a basic matmul of (8, 6) by (6, 2):

basic matrix multiplication

Now suppose we want to multiply our LHS by another (6, 2) matrix. We can do it all in the same operation by concatenating the two RHS matrices along columns:

concatenated basic matrix multiplication

If the yellow RHS block in both diagrams is identical, the green block of the result will be as well. And the violet block is just the matmul of the LHS by the red block of the RHS. This stems from the semantics of matrix multiplication, and is easy to verify on paper.
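If you'd rather verify this in code than on paper, here's a short check using random matrices with the same shapes as in the diagrams. The variable names are mine.

```python
import numpy as np

rng = np.random.default_rng(0)
lhs = rng.standard_normal((8, 6))
rhs1 = rng.standard_normal((6, 2))
rhs2 = rng.standard_normal((6, 2))

# One matmul against the two RHS matrices concatenated along columns.
combined = lhs @ np.concatenate([rhs1, rhs2], axis=1)  # (8, 4)

# The left half of the result is lhs @ rhs1, the right half is lhs @ rhs2.
print(combined.shape)
print(np.allclose(combined[:, :2], lhs @ rhs1))
print(np.allclose(combined[:, 2:], lhs @ rhs2))
```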

Now back to our multi-head attention. Note that we multiply the input x by a whole list of weight matrices - in fact, by three lists (one list for Q, one for K, and another for V). We can use the same vectorization technique by concatenating all these weight matrices into a single one. Assuming that NH * HS = D, the shape of the combined matrix is (D, 3 * D). Here's the vectorized implementation:

# x has shape (B, N, D)
# In what follows:
#   NH = number of heads
#   HS = head size
#   NH * HS = D
# W is expected to have shape (D, 3 * D), with all the weight matrices for
# Qs, Ks, and Vs concatenated along the last dimension, in this order.
# Wp is a weight matrix for the final linear projection, of shape (D, D).
# The result is (B, N, D).
# If do_mask is True, each attention head is masked from attending to future
# tokens.
def multihead_attention_vec(x, W, NH, Wp, do_mask=False):
    B, N, D = x.shape
    assert W.shape == (D, 3 * D)
    qkv = x @ W  # (B, N, 3 * D)
    q, k, v = np.split(qkv, 3, axis=-1)  # (B, N, D) each

    if do_mask:
        # mask is a lower-triangular (N, N) matrix, with zeros above
        # the diagonal and ones on the diagonal and below.
        mask = np.tril(np.ones((N, N)))

    HS = D // NH
    q = q.reshape(B, N, NH, HS).transpose(0, 2, 1, 3)  # (B, NH, N, HS)
    k = k.reshape(B, N, NH, HS).transpose(0, 2, 1, 3)  # (B, NH, N, HS)
    v = v.reshape(B, N, NH, HS).transpose(0, 2, 1, 3)  # (B, NH, N, HS)

    kq = q @ k.swapaxes(-1, -2) / np.sqrt(k.shape[-1])  # (B, NH, N, N)

    if do_mask:
        # Set the masked positions to -inf, to ensure that a token isn't
        # affected by tokens that come after it in the softmax.
        kq = np.where(mask == 0, -np.inf, kq)

    att = softmax_lastdim(kq)  # (B, NH, N, N)
    out = att @ v  # (B, NH, N, HS)
    return out.transpose(0, 2, 1, 3).reshape(B, N, D) @ Wp  # (B, N, D)

This code computes Q, K and V in a single matmul, and then splits them into separate arrays (note that on accelerators these splits and later transposes may be very cheap or even free as they represent a different access pattern into the same data).

Each of Q, K and V is initially (B, N, D), so they are reshaped into a more convenient shape by first splitting the D into (NH, HS), and finally changing the order of dimensions to get (B, NH, N, HS). In this format, both B and NH are considered batch dimensions that are fully parallelizable. The QK^T computation can then proceed as before, and Numpy will automatically perform the matmul over all the batch dimensions.
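The reshape-then-transpose step is easy to get wrong, so here's a small sketch that checks what it does, with arbitrary sizes: head h ends up holding columns h*HS through (h+1)*HS-1 of the original array, and the inverse transpose and reshape restores the original layout.

```python
import numpy as np

rng = np.random.default_rng(0)
B, N, NH, HS = 2, 5, 4, 3  # arbitrary sizes for the check
D = NH * HS
q = rng.standard_normal((B, N, D))

# Split D into (NH, HS), then move the heads axis next to the batch axis.
q_heads = q.reshape(B, N, NH, HS).transpose(0, 2, 1, 3)  # (B, NH, N, HS)

# Head h sees columns [h*HS, (h+1)*HS) of the original array.
per_head_ok = all(
    np.array_equal(q_heads[:, h], q[:, :, h * HS:(h + 1) * HS])
    for h in range(NH)
)
print(per_head_ok)

# The same transpose followed by a reshape undoes the split.
restored = q_heads.transpose(0, 2, 1, 3).reshape(B, N, D)
print(np.array_equal(restored, q))
```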

Sometimes you'll see an alternative notation used in papers for these matrix multiplications: numpy.einsum. For example, in our last code sample the computation of kq could also be written as:

kq = np.einsum("bhqd,bhkd->bhqk", q, k) / np.sqrt(k.shape[-1])

See this post for my detailed notes on this notation.
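A quick check that the einsum form and the matmul form compute the same thing, using random arrays of arbitrary size in the (B, NH, N, HS) layout:

```python
import numpy as np

rng = np.random.default_rng(0)
B, NH, N, HS = 2, 3, 5, 4  # arbitrary sizes for the check
q = rng.standard_normal((B, NH, N, HS))
k = rng.standard_normal((B, NH, N, HS))

# Matmul form, as in multihead_attention_vec.
kq_matmul = q @ k.swapaxes(-1, -2) / np.sqrt(k.shape[-1])

# Einsum form: contract over d, keep b and h as batch dimensions.
kq_einsum = np.einsum("bhqd,bhkd->bhqk", q, k) / np.sqrt(k.shape[-1])

print(kq_matmul.shape)
print(np.allclose(kq_matmul, kq_einsum))
```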

Code

The full code for these samples, with tests, is available in this repository.


[1]In LLM papers, D is often called d_model.
[2]In the GPT-3 paper, this is also true for all model variants. For example, the largest 175B model has NH=96, HS=128 and D=12288.
[3]It's also not as easy to define mathematically: how do we make a non-square matrix triangular? And what does it mean when the lengths of the two inputs are different?

March 27, 2025 07:17 AM UTC


Armin Ronacher

Rust Any Part 3: Finally we have Upcasts

Three years ago I shared the As-Any Hack on this blog. That hack is a way to get upcasting to supertraits working on stable Rust. To refresh your memory, the goal was to make something like this work:

use std::any::Any;
use std::fmt::Debug;

#[derive(Debug)]
struct AnyBox(Box<dyn DebugAny>);

trait DebugAny: Any + Debug {}

impl<T: Any + Debug + 'static> DebugAny for T {}

The problem? Even though DebugAny inherits from Any, Rust wouldn't let you use methods from Any on a dyn DebugAny. So while you could call DebugAny methods just fine, trying to use downcast_ref from Any (the reason to use Any in the first place) would fail:

fn main() {
    let any_box = AnyBox(Box::new(42i32));
    dbg!(any_box.0.downcast_ref::<i32>());  // Compile error
}

The same would happen if we tried to cast it into an &dyn Any. A compile error again:

fn main() {
    let any_box = AnyBox(Box::new(42i32));
    let any = &*any_box.0 as &dyn Any;
    dbg!(any.downcast_ref::<i32>());
}

But there is good news! As of Rust 1.86, this is finally fixed. The cast now works:

[src/main.rs:14:5] any.downcast_ref::<i32>() = Some(
    42,
)

At the time of writing, this fix is in the beta channel, but stable release is just around the corner. That means a lot of old hacks can finally be retired. At least once your MSRV moves up.

Thank you so much to everyone who worked on this to make it work!

March 27, 2025 12:00 AM UTC


meejah.ca

Magic Wormhole is What?

Various levels of details regarding a secure peer connection technology

March 27, 2025 12:00 AM UTC

March 26, 2025


Python Morsels

Checking whether iterables are equal in Python

You can check whether iterables contain the same elements in Python with equality checks, type conversions, sets, Counter, or looping helpers.

Table of contents

  1. Simple equality checks
  2. Comparing different types of iterables
  3. Checking equality between large iterables
  4. Checking for near-equality
  5. Ignoring order when comparing iterables
  6. Comparing iterables isn't just about equality

Simple equality checks

If we have two lists and we want to know whether the items in these two lists are the same, we can use the equality operator (==):

>>> lines1 = ["Grains", "Kindred", "Zia"]
>>> lines2 = ["Grains", "Kindred", "Zia"]
>>> lines1 == lines2
True

The same thing works for comparing tuples:

>>> p = (3, 4, 8)
>>> q = (3, 5, 7)
>>> p == q
False

But what if we wanted to compare a list and a tuple?

We can't use a simple equality check for that:

>>> lines1 = ["Grains", "Kindred", "Zia"]
>>> lines2 = ("Grains", "Kindred", "Zia")
>>> lines1 == lines2
False
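One way to handle this, shown here as a minimal sketch rather than the only approach, is to convert both iterables to the same type before comparing them:

```python
lines1 = ["Grains", "Kindred", "Zia"]
lines2 = ("Grains", "Kindred", "Zia")

# A list never equals a tuple, even when the items match.
print(lines1 == lines2)         # False

# After converting to the same type, items are compared one by one.
print(lines1 == list(lines2))   # True
print(tuple(lines1) == lines2)  # True
```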

Comparing different types of iterables

To compare the items in …

Read the full article: https://www.pythonmorsels.com/iterable-equality/

March 26, 2025 11:00 PM UTC


Mirek Długosz

Interesting bugs: peculiar intermittent failure in testing pipeline

Over the years I have encountered my share of memorable problems. They were remarkably complex, hard to debug, completely obvious in retrospect, or plain funny. This is the story of one of them.

At the beginning, there was a suite of automated tests that I was maintaining. One day one of them failed. Not a big deal; the unfortunate reality is that some of them fail sometimes for various reasons. Usually they pass when run again and we can blame unreliable infrastructure, a transient networking issue or misalignment of the stars. But a few days later the same test failed again. And then again. It was clear that there was something going on and that this particular test was failing intermittently. I had to figure out what was happening and how I could make the test provide the same result reliably.

(Note the choice of words here. My goal was not to make the test pass, or be “green”. There might as well have been a bug in the test itself, or in the product. At this point nobody knew. The main goal was understanding the issue and making sure the test reliably provided the same result - whether it was pass or fail.)

Before we move on, there’s some relevant context that I need to share. That suite contained only UI tests. Running them all took about an hour. They were running against the staging environment a few times a day. The test that was failing was responsible for checking a chart which plots the data from the last 30 days. There were other tests verifying other charts, sometimes using different time spans. The website used the same generic chart component in all cases. These other tests never failed.

On a high level, the failing test consisted of three main steps: request the data from the last 30 days using the API, read the data from the graph on the website, and compare both. The test was considered failed if there was any difference between the data from these two sources. The Python deepdiff package was used for the comparison. To make it possible, data from the API was locally transformed to mimic the structure returned by the function responsible for reading the data from the UI.

The testing infrastructure had a few distinct pieces. There was a Jenkins server that triggered a test suite run at certain times of the day. Job executors were containers in a Kubernetes cluster. To facilitate UI testing, there was a Selenium Grid server with a few workers hosted as virtual machines on OpenStack. Tests were running against the staging environment of the product, which was also hosted on a Kubernetes cluster, but a different one than the one where the job executors were. I believe all that was scattered across two data centers, with most of the testing infrastructure being co-located, and the product under test being elsewhere.

Not necessarily accurate illustration of infrastructure.

Now, let’s get back to the story.

The very first thing I did was look into the test logs. Unfortunately, differences between objects as reported by deepdiff in this particular case are not easy to read (see below). The amount of data is overwhelming, and displaying everything in a single line contributes to the challenge. The log made it clear that the lists returned by the API and read from the UI were different, but it was not immediately obvious where exactly these differences were.

>       assert not deepdiff.DeepDiff(expected_graph_data, actual_graph_data)
E       assert not {'values_changed': {"root[0]['Date']": {'new_value': '1970-01-01', 'old_value': '1970-01-02'}, "root[0]['Foo']": {'new_value': 46, 'old_value': 23}, "root[0]['Bar']": {'new_value': 60, 'old_value': 99}, "root[0]['Total']": {'new_value': 106, 'old_value': 122}, "root[1]['Date']": {'new_value': '1970-01-02', 'old_value': '1970-01-03'}, "root[1]['Foo']": {'new_value': 23, 'old_value': 26}, "root[1]['Bar']": {'new_value': 99, 'old_value': 92}, "root[1]['Total']": {'new_value': 122, 'old_value': 118}, "root[2]['Date']": {'new_value': '1970-01-03', 'old_value': '1970-01-04'}, "root[2]['Foo']": {'new_value': 26, 'old_value': 49}, "root[2]['Bar']": {'new_value': 92, 'old_value': 86}, "root[2]['Total']": {'new_value': 118, 'old_value': 135}, "root[3]['Date']": {'new_value': '1970-01-04', 'old_value': '1970-01-05'}, "root[3]['Foo']": {'new_value': 49, 'old_value': 68}, "root[3]['Bar']": {'new_value': 86, 'old_value': 60}, "root[3]['Total']": {'new_value': 135, 'old_value': 128}, "root[4]['Date']": {'new_value': '1970-01-05', 'old_value': '1970-01-06'}, "root[4]['Foo']": {'new_value': 68, 'old_value': 33}, "root[4]['Bar']": {'new_value': 60, 'old_value': 14}, "root[4]['Total']": {'new_value...ue': 25}, "root[24]['Bar']": {'new_value': 29, 'old_value': 78}, "root[24]['Total']": {'new_value': 106, 'old_value': 103}, "root[25]['Date']": {'new_value': '1970-01-26', 'old_value': '1970-01-27'}, "root[25]['Foo']": {'new_value': 25, 'old_value': 57}, "root[25]['Bar']": {'new_value': 78, 'old_value': 84}, "root[25]['Total']": {'new_value': 103, 'old_value': 141}, "root[26]['Date']": {'new_value': '1970-01-27', 'old_value': '1970-01-28'}, "root[26]['Foo']": {'new_value': 57, 'old_value': 48}, "root[26]['Bar']": {'new_value': 84, 'old_value': 18}, "root[26]['Total']": {'new_value': 141, 'old_value': 66}, "root[27]['Date']": {'new_value': '1970-01-28', 'old_value': '1970-01-29'}, "root[27]['Foo']": {'new_value': 48, 'old_value': 89}, "root[27]['Bar']": {'new_value': 
18, 'old_value': 14}, "root[27]['Total']": {'new_value': 66, 'old_value': 103}, "root[28]['Date']": {'new_value': '1970-01-29', 'old_value': '1970-01-30'}, "root[28]['Foo']": {'new_value': 89, 'old_value': 61}, "root[28]['Bar']": {'new_value': 14, 'old_value': 66}, "root[28]['Total']": {'new_value': 103, 'old_value': 127}}, 'iterable_item_added': {'root[29]': {'Date': '1970-01-30', 'Foo': 61, 'Bar': 66, 'Total': 127}}}

Trying to understand this log felt daunting, so my next step was running the failing test locally, in isolation. Predictably, it passed. I didn’t have high hopes that I would be able to reproduce the problem right away, but that was a cheap thing to try, so I think it was worth giving it a shot.

At this point I decided there was no way around it and I had to better understand how the API and UI responses differed. I copied the log line into an editor and inserted a new line character after each },. A few more changes later I had a form that was a little easier to decipher.

Deepdiff shows the differences between elements under the same index in lists. But focusing on elements with the same date value revealed that they are fundamentally the same. Values appearing under “old_value” in one list appear as “new_value” in the other list, just under a different index. I have put a colored overlay on the screenshot below to make it easier to see. You can think of these lists as mostly the same, but one is shifted when compared to the other; or you can say that one list has an extra element added at the end, while the other has an extra element added at the very beginning. Specifically, the API read data from January 2nd to February 1st, but the UI displayed data from January 1st to January 31st. There’s a large overlap, but the deepdiff output obscured this key insight.
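To illustrate the idea of matching rows by date instead of by index, here is a minimal sketch in plain Python. It is not what the test suite did; the `by_date` helper is hypothetical, and the two lists are shortened to three rows each, with values copied from the first entries of the log above.

```python
# Shortened stand-ins for the two lists; values copied from the log above.
expected_graph_data = [  # transformed API response
    {"Date": "1970-01-02", "Foo": 23, "Bar": 99, "Total": 122},
    {"Date": "1970-01-03", "Foo": 26, "Bar": 92, "Total": 118},
    {"Date": "1970-01-04", "Foo": 49, "Bar": 86, "Total": 135},
]
actual_graph_data = [  # read from the UI
    {"Date": "1970-01-01", "Foo": 46, "Bar": 60, "Total": 106},
    {"Date": "1970-01-02", "Foo": 23, "Bar": 99, "Total": 122},
    {"Date": "1970-01-03", "Foo": 26, "Bar": 92, "Total": 118},
]


def by_date(rows):
    """Hypothetical helper: index rows by their "Date" value."""
    return {row["Date"]: row for row in rows}


expected = by_date(expected_graph_data)
actual = by_date(actual_graph_data)

# Dates present on one side only, and shared dates with different values.
only_in_api = sorted(expected.keys() - actual.keys())
only_in_ui = sorted(actual.keys() - expected.keys())
mismatched = sorted(
    date for date in expected.keys() & actual.keys()
    if expected[date] != actual[date]
)

print(only_in_api)  # ['1970-01-04']
print(only_in_ui)   # ['1970-01-01']
print(mismatched)   # []
```

Seen this way, the shared dates carry identical values and the only real difference is one date at each end.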

Deepdiff output after editing. Color overlays show that both lists have the same data, but in different places.

At this point I had an idea of what was wrong, but I had no clue why, and why it would affect only this one single test. So in the next step I wanted to see if there were any patterns to the failure. I grabbed test results from the last few weeks and put them in a spreadsheet. I added columns for basic things, like the result itself, how long it took for the test to finish, and the date and time when the test was run. To make failing tests visually distinct, I added a background color to highlight them. In a separate column I tagged all rows where the test was running for the first time on a given day. Then I added columns representing known issues that we had encountered in the previous few weeks, to see if all failures fell into one of them.

While there wasn’t a clear and predictable pattern, I did notice a curious thing - if the test failed, it would fail in the first run of a given day. Subsequent runs of any day never failed. And the first run in a day always started shortly after midnight UTC.

Test results in a spreadsheet

That allowed me to construct a working hypothesis: the issue is somehow related to time and there’s only a short window when it may occur, maybe up to a few hours. That window is located around midnight UTC. Such a hypothesis explains why subsequent pipeline runs never failed, and why I was never successful at reproducing the issue locally - I am located east of the UTC line and I would have to try running the test way outside of working hours. Of course I didn’t know if I was onto something or I was just creating a complex ad hoc hypothesis that fits the data. But it directed my next step.

To corroborate the hypothesis I needed some new information, things I didn’t have before. To gather it, I added further logging to the test suite. First, I used Selenium’s JavaScript execution capability to obtain the date and time as the browser “sees” it. Then I did the same from Python, which both drives Selenium and requests data from the API. The important part is that the Python code is executed directly on the test runner (a container in Kubernetes) and the JavaScript code is executed in the browser (a Selenium Grid VM on OpenStack).

diff --git package/tests/ui/test_failing.py package/tests/ui/test_failing.py
index 1234567..abcdef0 100644
--- package/tests/ui/test_failing.py
+++ package/tests/ui/test_failing.py
@@ -10,6 +10,13 @@ def test_failing_test(user_app, some_fixture):
     """
     view = navigate_to(user_app.some_app, "SomeViewName")
+    browser_time_string = view.browser.execute_script("return new Date().toTimeString()")
+    browser_utc_string = view.browser.execute_script("return new Date().toUTCString()")
+    view.logger.info(
+        "[JavaScript] Time right now: %s ; UTC time: %s",
+        browser_time_string,
+        browser_utc_string,
+    )
     expected_x_axis = get_xaxis_values()
     view.items.item_select(some_value)
     view.graph.wait_displayed()
diff --git package/utils/utils.py package/utils/utils.py
index 1234567..abcdef0 100644
--- package/utils/utils.py
+++ package/utils/utils.py
@@ -10,6 +10,14 @@ METRIC_MAP = {


 def _get_dates_range(some_param="Daily", date=None):
+    current_time = arrow.now()
+    log.info(
+        "[Python] Time right now: %s ; TZ name: %s ; TZ offset: %s ; UTC time: %s",
+        current_time,
+        current_time.strftime("%Z"),
+        current_time.strftime("%z"),
+        arrow.utcnow(),
+    )
     try:
         date = arrow.get(date)
     except TypeError:

With the above patch applied and deployed, all I needed to do was wait for the next failure. I hoped that the new logs would reveal more information once the test failed again.

That turned out to be true. JavaScript showed a date one day earlier than Python. In fact, the time in JavaScript was about 15 minutes earlier than in Python. So if the test suite ran around midnight, and we got to the offending test within 15 minutes of the suite start, then Python would request data through the API for some dates, but the website in the browser would think it is still the previous day and request a different set of dates. It means that the window where the issue occurs is extremely small - just around 15 minutes each day.

[JavaScript] Time right now: Thu Jan 01 1970 23:58:17 GMT+0000 (Coordinated Universal Time) ; UTC time: Thu, 01 Jan 1970 23:58:17 GMT
[Python] Time right now: 1970-01-02T00:14:36.042473+00:00 ; TZ name: UTC ; TZ offset: +0000 ; UTC time: 1970-01-02T00:14:36.042473+00:00
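
The effect of the skew can be reproduced in isolation. The sketch below is only an illustration: it uses the standard library's datetime module rather than arrow, the two timestamps are copied from the log lines above, and `last_days` is a hypothetical stand-in for whatever logic derives the expected dates on each side.

```python
from datetime import date, datetime, timedelta, timezone


def last_days(now: datetime, count: int = 3) -> list[date]:
    """Return the `count` most recent calendar dates ending at `now` (hypothetical helper)."""
    today = now.date()
    return [today - timedelta(days=offset) for offset in range(count - 1, -1, -1)]


# Timestamps copied from the log lines above (microseconds dropped).
python_now = datetime(1970, 1, 2, 0, 14, 36, tzinfo=timezone.utc)
browser_now = datetime(1970, 1, 1, 23, 58, 17, tzinfo=timezone.utc)

# Difference between the two logged timestamps.
skew = python_now - browser_now
print(skew)  # 0:16:19

# Each side derives its date range from its own clock, so the ranges differ.
python_dates = last_days(python_now)
browser_dates = last_days(browser_now)
print(python_dates[-1], browser_dates[-1])  # 1970-01-02 1970-01-01
print(python_dates == browser_dates)  # False
```

The ranges only disagree while the two clocks sit on opposite sides of midnight, which is why the failure window is as short as the skew itself.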

This concludes the main part of the debugging story - at this point we knew what was wrong, we knew that the failure was not caused by a bug in a test or the product, and it was clear that the solution was for all machines involved in testing to agree on the date and time. It also seemed like JavaScript showed the wrong date, which might mean that the issue was with the Selenium Grid machines or the OpenStack instance.

I connected to all Selenium Grid machines using SSH and checked their local time using the date command. They were about 15 minutes behind real wall-clock time. I assumed the difference was caused by various OpenStack and underlying infrastructure maintenance work, so I just used hwclock to force the OS clock to synchronize with the hardware clock and moved on with my day.

A couple of days later I connected to these machines again and noticed that the local time was behind again, but only by about 15 seconds. It looked like the local clock was drifting by about 5 seconds a day. It might not sound like much, but it also meant that it was only a matter of time before the original issue happened again. Clearly, someone logging in to these machines every once in a while and resetting the clock would not be a good long-term solution - we needed something that could automatically keep time synchronized.
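
To put the drift in perspective, here is a back-of-the-envelope calculation. It assumes that "a couple of days" means three days (which matches the quoted 5 seconds a day) and that the drift stays constant; both are assumptions, not measurements.

```python
from datetime import timedelta

observed_lag = timedelta(seconds=15)  # lag seen on the second visit
elapsed_days = 3                      # assumption: "a couple of days" = 3 days

# Average drift per day, assuming it is constant.
drift_per_day = observed_lag / elapsed_days
print(drift_per_day.total_seconds())  # 5.0

# Days of unattended drift needed to get back to a 15-minute lag.
original_lag = timedelta(minutes=15)
days_to_original_lag = original_lag / drift_per_day
print(days_to_original_lag)  # 180.0
```

Under these assumptions the machines would be back to the original 15-minute lag after roughly half a year, and the daily failure window grows by about 5 seconds every day until then.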

That something is called NTP, and all the machines already had chrony installed. However, it didn’t seem to work correctly. While the commands succeeded and the logs did not show any problems, the clock would just not change. After a few frustrating hours I thought I had ruled out all possible causes at the operating system level and came to the conclusion that perhaps the NTP traffic to public servers was blocked by the data center firewall. I reached out to the OpenStack administrators for help and they told me that there is a blessed NTP server instance inside the data center that I should use. Once I configured chrony to use it as a source, everything finally worked.

This way browsers started to consistently report the same time as Python executors. That fixed the original issue and we did not observe any test failures caused by it again.

March 26, 2025 06:22 PM UTC


Real Python

Introducing DuckDB

The DuckDB database provides a seamless way to handle large datasets in Python with Online Analytical Processing (OLAP) optimization. You can create databases, verify data imports, and perform efficient data queries using both SQL and DuckDB’s Python API.

By the end of this tutorial, you’ll understand that:

  • You can create a DuckDB database by reading data from files like Parquet, CSV, or JSON and saving it to a table.
  • You query a DuckDB database using standard SQL syntax within Python by executing queries through a DuckDB connection object.
  • You can also use DuckDB’s Python API, which uses method chaining for an object-oriented approach to database queries.
  • Concurrent access in DuckDB allows multiple reads but restricts concurrent writes to ensure data integrity.
  • DuckDB integrates with pandas and Polars by converting query results into DataFrames using the .df() or .pl() methods.

The tutorial will equip you with the practical knowledge necessary to get started with DuckDB, including its Online Analytical Processing (OLAP) features, which enable fast access to data through query optimization and buffering.

Ideally, you should already have a basic understanding of SQL, particularly how its SELECT keyword can be used to read data from a relational database. However, the SQL language is very user-friendly, and the examples used here are self-explanatory.

Now, it’s time for you to start learning why there’s a growing buzz surrounding DuckDB.

Get Your Code: Click here to download the free sample code that shows you how to use DuckDB in Python.

Take the Quiz: Test your knowledge with our interactive “Introducing DuckDB” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

Introducing DuckDB

This quiz will challenge your knowledge of working with DuckDB. You won't find all the answers in the tutorial, so you'll need to do some extra investigation. By finding all the answers, you're sure to learn some interesting things along the way.

Getting Started With DuckDB

To use DuckDB, you first need to install it. Fortunately, DuckDB is self-contained, meaning it won’t interfere with your existing Python environment.

You use python -m pip install duckdb to install it from the command prompt. If you’re working in a Jupyter Notebook, the command becomes !python -m pip install duckdb. The supporting downloadable code for this tutorial is also presented in a Jupyter Notebook.

Once the installation is complete, you can quickly test your installation with a query:

Python
>>> import duckdb

>>> duckdb.sql("SELECT 'whistling_duck' AS waterfowl, 'whistle' AS call")
┌────────────────┬─────────┐
│   waterfowl    │  call   │
│    varchar     │ varchar │
├────────────────┼─────────┤
│ whistling_duck │ whistle │
└────────────────┴─────────┘

To test that everything works, you first import the duckdb library before running a test SQL query. In SQL, a query is a command you use to interact with the data in your database. You commonly use queries to view, add, update, and delete your data.

In this example, you write a SQL SELECT statement to view some data defined by the query. By passing it to the sql() function, you run the query and produce the result shown.

Your query creates a table with two columns named waterfowl and call. These contain the data "whistling_duck" and "whistle", respectively. The data types of both columns are varchar, which is the data type DuckDB uses to store variable-length character strings. Running your query using duckdb.sql() uses the default in-memory database. This means that the data are temporary and will disappear when you end your Python session.

If you see the output shown above, your installation is working perfectly.

Note: DuckDB queries are not case-sensitive. However, writing reserved SQL keywords in uppercase is standard practice. Also, a terminating semicolon (;) is optional in SQL and isn’t used in this tutorial, though you may encounter it elsewhere.

Now that you know how to set things up, it’s time to dive into some of the features that make DuckDB easy to use. In the next section, you’ll create a database table using data imported from an existing file. You’ll also learn how to check that the data has been imported correctly.

Creating a Database From a Data Source

While it’s possible to create database tables using SQL, it’s more common to read data from an external file, perhaps one containing data you’ve extracted from another system, and allow DuckDB to create and populate the table.

DuckDB supports reading from and writing to a range of common file types such as Parquet, CSV, and JSON. In this example, you’ll use data stored in the presidents.parquet Parquet file included in your downloadable materials to create a table.

The presidents.parquet file contains the following six fields:

Heading     Meaning                              Data Type
sequence    Order of presidency                  int64
last_name   President’s last name                varchar
first_name  President’s first name               varchar
term_start  Start of presidency term             date
term_end    End of presidency term               date
party_id    Number representing political party  int64

When you import data, it gets placed into a DuckDBPyRelation object. In DuckDB, a relation stores a query definition but not its data. To see the data your relation represents, you must either view the relation interactively or run an SQL query against it to retrieve specific data.

Read the full article at https://realpython.com/python-duckdb/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

March 26, 2025 02:00 PM UTC


Python GUIs

Multithreading PySide6 applications with QThreadPool — Run background tasks concurrently without impacting your UI

A common problem when building Python GUI applications is the interface "locking up" when attempting to perform long-running background tasks. In this tutorial, we'll cover quick ways to achieve concurrent execution in PySide6.

If you'd like to run external programs (such as command-line utilities) from your applications, check out the Using QProcess to run external programs tutorial.

Background: The frozen GUI issue

Applications based on Qt (like most GUI applications) are based on events. This means that execution is driven in response to user interaction, signals, and timers. In an event-driven application, clicking a button creates an event that your application subsequently handles to produce some expected output. Events are pushed onto and taken off an event queue and processed sequentially.

In PySide6, we create an app with the following code:

python
app = QApplication([])
window = MainWindow()
app.exec()

The event loop starts when you call .exec() on the QApplication object and runs within the same thread as your Python code. The thread that runs this event loop — commonly referred to as the GUI thread — also handles all window communication with the host operating system.

By default, any execution triggered by the event loop will also run synchronously within this thread. In practice, this means that the time your PySide6 application spends doing something, the communication with the window and the interaction with the GUI are frozen.
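
The blocking behavior can be sketched with a toy event queue in plain Python. This is not how Qt implements its event loop, only a simplified model of "one queue, one thread, handlers run in turn"; the handler names are made up.

```python
import time
from collections import deque

event_queue = deque()
handled = []  # (event name, seconds since start) in processing order


def slow_click():
    time.sleep(0.2)  # stands in for a long-running task inside a handler


def timer_tick():
    pass  # a quick handler, like updating a label


# Two events arrive: a click first, then a timer tick.
event_queue.append(("click", slow_click))
event_queue.append(("tick", timer_tick))

# A single-threaded loop takes events off the queue one at a time.
start = time.monotonic()
while event_queue:
    name, handler = event_queue.popleft()
    handler()
    handled.append((name, time.monotonic() - start))

for name, elapsed in handled:
    print(f"{name} handled after {elapsed:.2f}s")
```

The tick handler itself is instant, but it cannot start until the slow click handler has returned, which is the same effect you see as a frozen window.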

If what you're doing is simple, and it returns control to the GUI loop quickly, the GUI freeze will be imperceptible to the user. However, if you need to perform longer-running tasks, for example, opening and writing a large file, downloading some data, or rendering a high-resolution image, there are going to be problems.

To your user, the application will appear to be unresponsive (because it is). Because your app is no longer communicating with the OS, on macOS, if you click on your app, you will see the spinning wheel of death. And, nobody wants that.

The solution is to move your long-running tasks out of the GUI thread into another thread. PySide6 provides a straightforward interface for this.

Preparation: A minimal stub app

To demonstrate multi-threaded execution, we need an application to work with. Below is a minimal stub application for PySide6 that will allow us to demonstrate multithreading and see the outcome in action. Simply copy and paste this into a new file and save it with an appropriate filename, like multithread.py. The remainder of the code will be added to this file. There is also a complete working example at the bottom if you're impatient:

python
import time

from PySide6.QtCore import (
    QTimer,
)
from PySide6.QtWidgets import (
    QApplication,
    QLabel,
    QMainWindow,
    QPushButton,
    QVBoxLayout,
    QWidget,
)

class MainWindow(QMainWindow):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.counter = 0

        layout = QVBoxLayout()

        self.label = QLabel("Start")
        button = QPushButton("DANGER!")
        button.pressed.connect(self.oh_no)

        layout.addWidget(self.label)
        layout.addWidget(button)

        w = QWidget()
        w.setLayout(layout)
        self.setCentralWidget(w)

        self.show()

        self.timer = QTimer()
        self.timer.setInterval(1000)
        self.timer.timeout.connect(self.recurring_timer)
        self.timer.start()

    def oh_no(self):
        time.sleep(5)

    def recurring_timer(self):
        self.counter += 1
        self.label.setText(f"Counter: {self.counter}")

app = QApplication([])
window = MainWindow()
app.exec()

Run the app as for any other Python application:

sh
$ python multithread.py

You will see a demonstration window with a number counting upwards. This count is generated by a simple recurring timer, firing once per second. Think of this as our event loop indicator (or GUI thread indicator), a simple way to let us know that our application is ticking over normally. There is also a button with the word "DANGER!". Push it.

You'll notice that each time you push the button, the counter stops ticking, and your application freezes entirely. On Windows, you may see the window turn pale, indicating it is not responding, while on macOS, you'll get the spinning wheel of death.

The wrong approach

Avoid doing this in your code.

What appears as a frozen interface is the main Qt event loop being blocked from processing (and responding to) window events. Your clicks on the window are still registered by the host OS and sent to your application, but because it's sat in your big ol' lump of code (calling time.sleep()), it can't accept or react to them. They have to wait until your code passes control back to Qt.

The quickest and perhaps most logical way to get around this issue is to accept events from within your code. This allows Qt to continue to respond to the host OS and your application will stay responsive. You can do this easily by using the static processEvents() method on the QApplication class.

For example, our long-running time.sleep() call could be broken down into five 1-second sleeps, with a processEvents() call before each one. The code for this would be:

python
def oh_no(self):
    for n in range(5):
        QApplication.processEvents()
        time.sleep(1)

Now, when you push the DANGER! button, your app runs as before. However, now QApplication.processEvents() intermittently passes control back to Qt, and allows it to respond to events as normal. Qt will then accept events and handle them before returning to run the remainder of your code.

This approach works, but it's horrible for a few reasons, including the following:

  1. When you pass control back to Qt, your code is no longer running. This means that whatever long-running task you're trying to do will take longer. That is definitely not what you want.

  2. When you have multiple long-running tasks within your application, with each calling QApplication.processEvents() to keep things ticking, your application's behavior can be unpredictable.

  3. Processing events outside the main event loop (app.exec()) causes your application to branch off into handling code (e.g. for triggered slots or events) while within your loop. If your code depends on or responds to an external state, then this can cause undefined behavior.

The code below demonstrates the last point in action:

python
import time

from PySide6.QtCore import (
    QTimer,
)
from PySide6.QtWidgets import (
    QApplication,
    QLabel,
    QMainWindow,
    QPushButton,
    QVBoxLayout,
    QWidget,
)

class MainWindow(QMainWindow):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.counter = 0

        layout = QVBoxLayout()

        self.label = QLabel("Start")
        button = QPushButton("DANGER!")
        button.pressed.connect(self.oh_no)

        c = QPushButton("?")
        c.pressed.connect(self.change_message)

        layout.addWidget(self.label)
        layout.addWidget(button)
        layout.addWidget(c)

        w = QWidget()
        w.setLayout(layout)
        self.setCentralWidget(w)

        self.show()

        self.timer = QTimer()
        self.timer.setInterval(1000)
        self.timer.timeout.connect(self.recurring_timer)
        self.timer.start()

    def change_message(self):
        self.message = "OH NO"

    def oh_no(self):
        self.message = "Pressed"

        for n in range(100):
            time.sleep(0.1)
            self.label.setText(self.message)
            QApplication.processEvents()

    def recurring_timer(self):
        self.counter += 1
        self.label.setText(f"Counter: {self.counter}")

app = QApplication([])
window = MainWindow()
app.exec()

If you run this code you'll see the counter as before. Pressing DANGER! will change the displayed text to "Pressed", as defined at the entry point to the oh_no() method. However, if you press the "?" button while oh_no() is still running, you'll see that the message changes. The state is being changed from outside your event loop.

Use threads and processes

If you take a step back and think about what you want to happen in your application, then you can probably sum it up with "stuff to happen at the same time as other stuff happens".

There are two main approaches to running independent tasks within a PySide6 application:

  1. Threads
  2. Processes

Threads share the same memory space, so they are quick to start up and consume minimal resources. The shared memory makes it trivial to pass data between threads. However, reading or writing memory from different threads can lead to race conditions or segfaults.
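
As a small illustration of guarding shared memory, the sketch below uses the standard library's threading module rather than Qt's classes. The Counter class is hypothetical; the point is only that every update to the shared value happens while holding a lock.

```python
import threading


class Counter:
    """Hypothetical shared counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write step.
        with self._lock:
            self.value += 1


def work(counter, times):
    for _ in range(times):
        counter.increment()


counter = Counter()
threads = [
    threading.Thread(target=work, args=(counter, 10_000)) for _ in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counter.value)  # 40000
```

Without the lock, the increments from different threads could interleave and some updates might be lost, depending on the interpreter and timing.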

In Python, there is the added issue that multiple threads are bound by the Global Interpreter Lock (GIL) — meaning non-GIL-releasing Python code can only execute in one thread at a time. However, this is not a major issue with PySide6, where most of the time is spent outside of Python.

Processes use separate memory space and an entirely separate Python interpreter. They sidestep any potential problems with Python's GIL but at the cost of slower start-up times, larger memory overhead, and complexity in sending and receiving data.

Processes in Qt are well suited to running and communicating with external programs. However, for simplicity's sake, threads are usually the best choice unless you have a good reason to use processes (see caveats later).

There is nothing stopping you from using pure Python threading or process-based approaches within your PySide6 application. In the following sections, though, you'll rely on Qt's threading classes.

QRunnable and the QThreadPool

Favor this approach in your code.

Qt provides a straightforward interface for running jobs or tasks in other threads, which is nicely supported in PySide6. This interface is built around two classes:

  1. QRunnable: The container for the work you want to perform.
  2. QThreadPool: The method by which you pass that work to alternate threads.

The neat thing about using QThreadPool is that it handles queuing and executing workers for you. Other than queuing up jobs and retrieving the results, there is not much to do.

To define a custom QRunnable, you can subclass the base QRunnable class. Then, place the code you wish to execute within the run() method. The following is an implementation of our long-running time.sleep() job as a QRunnable.

Go ahead and add the following code to multithread.py, above the MainWindow class definition, and don't forget to import QRunnable and Slot from PySide6.QtCore:

python
class Worker(QRunnable):
    """Worker thread."""

    @Slot()
    def run(self):
        """Your long-running job goes in this method."""
        print("Thread start")
        time.sleep(5)
        print("Thread complete")

Executing our long-running job in another thread is simply a matter of creating an instance of the Worker and passing it to our QThreadPool instance. It will be executed automatically.

Next, import QThreadPool from PySide6.QtCore and add the following code to the __init__() method to set up our thread pool:

python
self.threadpool = QThreadPool()
thread_count = self.threadpool.maxThreadCount()
print(f"Multithreading with maximum {thread_count} threads")

Finally, update the oh_no() method as follows:

python
def oh_no(self):
    worker = Worker()
    self.threadpool.start(worker)

Now, clicking the DANGER! button will create a worker to handle the (long-running) job and spin that off into another thread via the thread pool. If there are not enough threads available to process incoming workers, they'll be queued and executed in order at a later time.

Try it out, and you'll see that your application now handles you bashing the button with no problems.

Check what happens if you hit the button multiple times. You should see your threads executed immediately up to the number reported by maxThreadCount(). If you press the button again after there are already this number of active workers, then the subsequent workers will be queued until a thread becomes available.

Improved QRunnable

If you want to pass custom data into the runner function, you can do so via __init__(), and then have access to the data via self from within the run() slot:

python
class Worker(QRunnable):
    """Worker thread.

    :param args: Arguments to make available to the run code
    :param kwargs: Keyword arguments to make available to the run code
    """

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.args = args
        self.kwargs = kwargs

    @Slot()
    def run(self):
        """Initialise the runner function with passed self.args, self.kwargs."""
        print(self.args, self.kwargs)

We can take advantage of the fact that Python functions are objects and pass in the function to execute rather than subclassing QRunnable for each runner function. In the following construction, we only require a single Worker class to handle all of our jobs:

python
class Worker(QRunnable):
    """Worker thread.

    Inherits from QRunnable to handle worker thread setup, signals and wrap-up.

    :param fn: The function callback to run on this worker thread.
               Supplied args and kwargs will be passed through to the runner.
    :type fn: function
    :param args: Arguments to pass to the callback function
    :param kwargs: Keywords to pass to the callback function
    """

    def __init__(self, fn, *args, **kwargs):
        super().__init__()
        self.fn = fn
        self.args = args
        self.kwargs = kwargs

    @Slot()
    def run(self):
        """Initialise the runner function with passed args, kwargs."""
        self.fn(*self.args, **self.kwargs)

You can now pass in any Python function and have it executed in a separate thread. Go ahead and update MainWindow with the following code:

python
def execute_this_fn(self):
    print("Hello!")

def oh_no(self):
    # Pass the function to execute
    worker = Worker(
        self.execute_this_fn
    )  # Any other args, kwargs are passed to the run function
    # Execute
    self.threadpool.start(worker)

Now, when you click DANGER!, the app will print Hello! to your terminal without affecting the counter.

Thread Input/Output

Sometimes, it's helpful to be able to pass back state and data from running workers. This could include the outcome of calculations, raised exceptions, or ongoing progress (maybe for progress bars). Qt provides the signals and slots framework to allow you to do just that. Qt's signals and slots are thread-safe, allowing safe communication directly from running threads to your GUI thread.

Signals allow you to emit values, which are then picked up elsewhere in your code by slot functions that have been linked with the connect() method.

Below is a custom WorkerSignals class defined to contain a number of example signals. Note that custom signals can only be defined on objects derived from QObject. Since QRunnable is not derived from QObject we can't define the signals there directly. A custom QObject to hold the signals is a quick solution (remember to import QObject and Signal from PySide6.QtCore):

python
class WorkerSignals(QObject):
    """Signals from a running worker thread.

    finished
        No data

    error
        tuple (exctype, value, traceback.format_exc())

    result
        object data returned from processing, anything
    """

    finished = Signal()
    error = Signal(tuple)
    result = Signal(object)

In this code, we've defined three custom signals:

  1. finished, which receives no data and is aimed to indicate when the task is complete.
  2. error, which receives a tuple of Exception type, Exception value, and formatted traceback.
  3. result, which receives any object type from the executed function.

You may not find a need for all of these signals, but they are included to give an indication of what is possible. In the following code, we're going to implement a long-running task that makes use of these signals to provide useful information to the user. The error handling relies on the standard library's sys and traceback modules, so import them at the top of the file:

python
class Worker(QRunnable):
    """Worker thread.

    Inherits from QRunnable to handle worker thread setup, signals and wrap-up.

    :param fn: The function callback to run on this worker thread.
               Supplied args and kwargs will be passed through to the runner.
    :type fn: function
    :param args: Arguments to pass to the callback function
    :param kwargs: Keywords to pass to the callback function
    """

    def __init__(self, fn, *args, **kwargs):
        super().__init__()
        self.fn = fn
        self.args = args
        self.kwargs = kwargs
        self.signals = WorkerSignals()

    @Slot()
    def run(self):
        """Initialise the runner function with passed args, kwargs."""

        # Retrieve args/kwargs here; and fire processing using them
        try:
            result = self.fn(*self.args, **self.kwargs)
        except Exception:
            traceback.print_exc()
            exctype, value = sys.exc_info()[:2]
            self.signals.error.emit((exctype, value, traceback.format_exc()))
        else:
            self.signals.result.emit(result)  # Return the result of the processing
        finally:
            self.signals.finished.emit()  # Done

You can connect your own handler functions to the signals to receive notification of completion (or the result) of threads:

python
def execute_this_fn(self):
    for n in range(0, 5):
        time.sleep(1)
    return "Done."

def print_output(self, s):
    print(s)

def thread_complete(self):
    print("THREAD COMPLETE!")

def oh_no(self):
    # Pass the function to execute
    worker = Worker(
        self.execute_this_fn
    )  # Any other args, kwargs are passed to the run function
    worker.signals.result.connect(self.print_output)
    worker.signals.finished.connect(self.thread_complete)
    # Execute
    self.threadpool.start(worker)

You also often want to receive status information from long-running threads. This can be done by passing in callbacks to which your running code can send the information. You have two options here:

  1. Define new signals, allowing the handling to be performed using the event loop
  2. Use a regular Python function

In both cases, you'll need to pass these callbacks into your target function to be able to use them. The signal-based approach is used in the completed code below, where we pass a float back as an indicator of the thread's % progress.
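
For comparison, the second option (a regular Python function) might look like the sketch below. It uses a plain threading.Thread so that it runs without Qt, and long_task and record_progress are made-up names.

```python
import threading
import time


def long_task(progress_callback, steps=5):
    """Hypothetical long-running job that reports progress after each step."""
    for n in range(steps):
        time.sleep(0.01)  # stands in for real work
        progress_callback((n + 1) * 100 / steps)
    return "Done."


reported = []


def record_progress(percent):
    # A plain function is called directly, so this runs in the worker thread.
    reported.append(percent)


thread = threading.Thread(target=long_task, args=(record_progress,))
thread.start()
thread.join()

print(reported)  # [20.0, 40.0, 60.0, 80.0, 100.0]
```

Because a plain callback executes in the worker thread, it should not touch widgets. Emitting a signal instead lets the connected slot run in the GUI thread via the event loop.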

The complete code

A complete working example is given below, showcasing the custom QRunnable worker together with the worker and progress signals. You should be able to easily adapt this code to any multithreaded application you develop:

python
import sys
import time
import traceback

from PySide6.QtCore import (
    QObject,
    QRunnable,
    QThreadPool,
    QTimer,
    Signal,
    Slot,
)
from PySide6.QtWidgets import (
    QApplication,
    QLabel,
    QMainWindow,
    QPushButton,
    QVBoxLayout,
    QWidget,
)

class WorkerSignals(QObject):
    """Signals from a running worker thread.

    finished
        No data

    error
        tuple (exctype, value, traceback.format_exc())

    result
        object data returned from processing, anything

    progress
        float indicating % progress
    """

    finished = Signal()
    error = Signal(tuple)
    result = Signal(object)
    progress = Signal(float)

class Worker(QRunnable):
    """Worker thread.

    Inherits from QRunnable to handle worker thread setup, signals and wrap-up.

    :param fn: The function callback to run on this worker thread.
               Supplied args and kwargs will be passed through to the runner.
    :type fn: function
    :param args: Arguments to pass to the callback function
    :param kwargs: Keywords to pass to the callback function
    """

    def __init__(self, fn, *args, **kwargs):
        super().__init__()
        self.fn = fn
        self.args = args
        self.kwargs = kwargs
        self.signals = WorkerSignals()
        # Add the callback to our kwargs
        self.kwargs["progress_callback"] = self.signals.progress

    @Slot()
    def run(self):
        try:
            result = self.fn(*self.args, **self.kwargs)
        except Exception:
            traceback.print_exc()
            exctype, value = sys.exc_info()[:2]
            self.signals.error.emit((exctype, value, traceback.format_exc()))
        else:
            self.signals.result.emit(result)
        finally:
            self.signals.finished.emit()

class MainWindow(QMainWindow):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.counter = 0

        layout = QVBoxLayout()

        self.label = QLabel("Start")
        button = QPushButton("DANGER!")
        button.pressed.connect(self.oh_no)

        layout.addWidget(self.label)
        layout.addWidget(button)

        w = QWidget()
        w.setLayout(layout)
        self.setCentralWidget(w)

        self.show()

        self.threadpool = QThreadPool()
        thread_count = self.threadpool.maxThreadCount()
        print(f"Multithreading with maximum {thread_count} threads")

        self.timer = QTimer()
        self.timer.setInterval(1000)
        self.timer.timeout.connect(self.recurring_timer)
        self.timer.start()

    def progress_fn(self, n):
        print(f"{n:.1f}% done")

    def execute_this_fn(self, progress_callback):
        for n in range(0, 5):
            time.sleep(1)
            progress_callback.emit(n * 100 / 4)

        return "Done."

    def print_output(self, s):
        print(s)

    def thread_complete(self):
        print("THREAD COMPLETE!")

    def oh_no(self):
        # Pass the function to execute
        worker = Worker(
            self.execute_this_fn
        )  # Any other args, kwargs are passed to the run function
        worker.signals.result.connect(self.print_output)
        worker.signals.finished.connect(self.thread_complete)
        worker.signals.progress.connect(self.progress_fn)
        # Execute
        self.threadpool.start(worker)

    def recurring_timer(self):
        self.counter += 1
        self.label.setText(f"Counter: {self.counter}")

app = QApplication([])
window = MainWindow()
app.exec()

Caveats

You may have spotted a slight flaw in this master plan—we are still using the event loop (and the GUI thread) to process our workers' output.

This isn't a problem when we're simply tracking progress, completion, or returning metadata. However, if you have workers that return large amounts of data — e.g. loading large files, performing complex analysis and needing (large) results, or querying databases — passing this data back through the GUI thread may cause performance problems and is best avoided.

Similarly, if your application uses a large number of threads and Python result handlers, you may come up against the limitations of the GIL. As mentioned previously, when using threads, execution of Python code is limited to a single thread at a time. The Python code that handles signals from your threads can be blocked by your workers, and vice versa. Since blocking your slot functions blocks the event loop, this can directly impact GUI responsiveness.

In these cases, it is often better to investigate using a pure Python thread pool (e.g. concurrent.futures) to keep your processing and thread-event handling further isolated from your GUI. However, note that any Python GUI code can block other Python code unless it's in a separate process.
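
As a rough illustration of the pure Python thread pool idea, here is a minimal sketch using the standard library's concurrent.futures module. The names slow_square and run_jobs are hypothetical stand-ins and not part of the tutorial's code. The sketch deliberately leaves out Qt, so how results are handed back to the GUI is not shown.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(n: int) -> int:
    """Hypothetical stand-in for a long-running task."""
    time.sleep(0.1)
    return n * n


def run_jobs(values: list[int], max_workers: int = 4) -> list[int]:
    """Run slow_square over values in a thread pool and collect the results."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its input so results can be re-ordered.
        futures = {pool.submit(slow_square, v): v for v in values}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    # Return results in the same order as the inputs.
    return [results[v] for v in values]


if __name__ == "__main__":
    print(run_jobs([1, 2, 3, 4]))
```

Threads in this pool are still subject to the GIL, so this only isolates the bookkeeping from the Qt event loop. CPU-bound work would need a separate process to avoid blocking other Python code.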

March 26, 2025 06:00 AM UTC

March 25, 2025


PyCoder’s Weekly

Issue #674: LangGraph, Marimo, Django Template Components, and More (March 25, 2025)

#674 – MARCH 25, 2025


LangGraph: Build Stateful AI Agents in Python

LangGraph is a versatile Python library designed for stateful, cyclic, and multi-actor Large Language Model (LLM) applications. This tutorial will give you an overview of LangGraph fundamentals through hands-on examples, and the tools needed to build your own LLM workflows and agents in LangGraph.
REAL PYTHON

Quiz: LangGraph: Build Stateful AI Agents in Python

REAL PYTHON

Reinventing Notebooks as Reusable Python Programs

Marimo is a Jupyter replacement that uses Python as its source instead of JSON, solving a lot of issues with notebooks. This article shows you why you might switch to marimo.
AKSHAY, MYLES, & MADISETTI

How to Build AI Agents With Python & Temporal

Join us on April 3 at 9am PST/12pm EST to learn how Temporal’s Python SDK powers agentic AI workflow creation. We’ll start by covering how Temporal lets you orchestrate agentic AI, then transition to a live demo →
TEMPORAL sponsor

Django Template Components Are Slowly Coming

Django 5.2 brings the Simple Block tag which is very similar to React children, allowing templated components. This post shows several examples from Andy’s own code.
ANDREW MILLER

PEP 758: Allow Except and Except* Expressions Without Parentheses (Accepted)

PYTHON.ORG

IPython 9 Released

IPYTHON.READTHEDOCS.IO

Python Release Python 3.14.0a6

PYTHON.ORG

Django 5.2 Release Candidate 1 Released

DJANGO SOFTWARE FOUNDATION

Quiz: Build an LLM RAG Chatbot With LangChain

REAL PYTHON

Articles & Tutorials

A Decade of Automating the Boring Stuff With Python

What goes into updating one of the most popular books about working with Python? After a decade of changes in the Python landscape, what projects, libraries, and skills are relevant to an office worker? This week on the show, we speak with previous guest Al Sweigart about the third edition of “Automate the Boring Stuff With Python.”
REAL PYTHON podcast

PyCon US: Travel Grants & Refund Policy

PyCon US offers travel grants to visitors. This post explains how they are decided. With changing border requirements in the US, you may also be interested in the Refund Policy for International Attendees.
PYCON.BLOGSPOT.COM

Using Structural Pattern Matching in Python

In this video course, you’ll learn how to harness the power of structural pattern matching in Python. You’ll explore the new syntax, delve into various pattern types, and find appropriate applications for pattern matching, all while identifying common pitfalls.
REAL PYTHON course

Smoke Test Your Django Admin Site

When writing code that uses the Django Admin, sometimes you forget to match things up. Since it is the Admin, who tests that? That doesn’t mean it won’t fail. This post shows you a general pytest function for checking that empty Admin pages work correctly.
JUSTIN DUKE

Python’s Instance, Class, and Static Methods Demystified

In this tutorial, you’ll compare Python’s instance methods, class methods, and static methods. You’ll gain an understanding of when and how to use each method type to write clear and maintainable object-oriented code.
REAL PYTHON

I Fear for the Unauthenticated Web

In this short opinion post, Seth comments on how companies scraping the web to build LLMs are causing real costs to users, and suggests you implement billing limits on your services.
SETH M LARSON

Django Query Optimization: Defer, Only, and Exclude

Database queries are usually the bottlenecks of most web apps. To minimize the amount of data fetched, you can leverage Django’s defer(), only(), and exclude() methods.
TESTDRIVEN.IO • Shared by Michael Herman

How to Use Async Agnostic Decorators in Python

Using decorators in a codebase that has both synchronous and asynchronous functions poses many challenges. One solution is to use generators. This post shows you how.
PATREON • Shared by Patreon Engineering

PEP 779: Criteria for Supported Status for Free-Threaded Python

PEP 703 (Making the Global Interpreter Lock Optional in CPython) described three phases of development. This PEP outlines the criteria to move between phases.
PYTHON.ORG

uv overtakes Poetry

Wagtail, the Django-based CMS, tracks download statistics, including which installation tool was used. Recently, uv overtook Poetry. This post shows the stats.
THIBAUD COLAS

Using Pyinstrument to Profile FastHTML Apps

A quick post with instructions on how to add profiling to your FastHTML app with pyinstrument.
DANIEL ROY GREENFIELD

Projects & Code

compress_json: Read and Write Compressed JSON

GITHUB.COM/LUCACAPPELLETTI94

pysqlscribe: A SQL Query Builder in Python

GITHUB.COM/DANIELENRICOCAHALL

shorts_maker: YouTube Shorts Automation

GITHUB.COM/RAJATHJN

pydoll: Automate Chromium Browsers Without a WebDriver

GITHUB.COM/THALISSONVS

faststream: Event Streams Library

GITHUB.COM/AIRTAI

Events

Weekly Real Python Office Hours Q&A (Virtual)

March 26, 2025
REALPYTHON.COM

SPb Python Drinkup

March 27, 2025
MEETUP.COM

Python Leiden User Group

March 27, 2025
PYTHONLEIDEN.NL

PyLadies Amsterdam: Introduction to BDD in Python

March 31, 2025
MEETUP.COM


Happy Pythoning!
This was PyCoder’s Weekly Issue #674.

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

March 25, 2025 07:30 PM UTC


TechBeamers Python

10 Viral Tips to Learn Python Instantly 🚀

Python is one of the most in-demand programming languages in 2025, powering AI, web development, automation, and more. Whether you’re a complete beginner or looking to sharpen your skills, you don’t need months to get started. If you want to learn Python instantly, these 10 viral, fast-track methods will help you grasp the fundamentals quickly […]

Source

March 25, 2025 05:30 PM UTC


PyCon

Call for Volunteers: PyCon US Code of Conduct Team

Help us make PyCon US welcoming, fun, and safe!

We are looking for volunteers to join the Code of Conduct Team for PyCon US 2025 in Pittsburgh, PA. The Code of Conduct Team supports the PyCon US community by taking reports should anyone violate the PyCon US Code of Conduct and, when appropriate, participating in deciding how PyCon US should respond.


Code of Conduct Team shifts are 3-4 hours long. We are looking for volunteers for the tutorials (May 14 - 15), the main conference (May 16 - 18), and the first 2 days of sprints (May 19 - 20), and ask that you be prepared to do a minimum of two shifts.


As a member of the Code of Conduct Team, you will:

  • Take reports of incidents that occur at PyCon US (there’s a handy form for this)

  • Keep track of your email and/or Slack while on shift

  • Participate in discussions about how to respond to incidents (as needed)

As a member of the Code of Conduct Team, you will not:

  • Have to make any tough decisions on your own

  • Have to approach anyone you are uncomfortable approaching

  • Have to address incidents that involve your friends

In addition to the minimum two shifts, you will need to attend a 90-minute training session virtually before PyCon US or in-person on May 14th or 16th.


We are looking for people with a wide range of experiences in and outside of the Python community! We are especially interested in talking with you if you:

  • Have experience with code of conduct response, moderation, etc

  • Speak multiple languages (especially Spanish)

  • Are from outside the United States

If you are interested in joining us on the Code of Conduct Team, either fill out this form or email molly.deblanc@pyfound.org.

March 25, 2025 03:18 PM UTC


Real Python

What Can You Do With Python?

You’ve finished a course or finally made it to the end of a book that teaches you the basics of programming with Python. You’ve learned about variables, lists, tuples, dictionaries, for and while loops, conditional statements, object-oriented concepts, and more. So, what’s next? What can you do with Python nowadays?

Python is a versatile programming language with many use cases in a variety of different fields. If you’ve grasped the basics of Python and are itching to build something with the language, then it’s time to figure out what your next step should be.

In this video course, you’ll see how you can use Python for:


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

March 25, 2025 02:00 PM UTC

Quiz: GitHub Actions for Python

In this quiz, you’ll test your understanding of Continuous Integration and Deployment for Python With GitHub Actions.

By working through this quiz, you’ll revisit how to use GitHub Actions and workflows to automate linting, testing, and deployment of a Python project. You’ll also review how to secure your credentials and automate security and dependency updates.


March 25, 2025 12:00 PM UTC

Quiz: Python Set Comprehensions: How and When to Use Them

In this quiz, you’ll test your understanding of Python Set Comprehensions: How and When to Use Them.

Set comprehensions are a concise and quick way to create, transform, and filter sets in Python. They can significantly enhance your code’s conciseness and readability compared to using regular for loops to process your sets.
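
To make "create, transform, and filter" concrete, here is a small sketch. The word list is made-up sample data, not taken from the quiz or the tutorial.

```python
words = ["Apple", "banana", "apple", "Cherry", "BANANA", "kiwi"]

# Create: the set of distinct lowercase words.
unique_words = {word.lower() for word in words}

# Transform: the set of distinct word lengths.
lengths = {len(word) for word in unique_words}

# Filter: only the words longer than four characters.
long_words = {word for word in unique_words if len(word) > 4}

print(sorted(unique_words))
print(sorted(lengths))
print(sorted(long_words))
```

Sets are unordered, so the sketch sorts them before printing to get stable output.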


March 25, 2025 12:00 PM UTC

Quiz: Python's Instance, Class, and Static Methods Demystified

In this quiz, you’ll test your understanding of Instance, Class, and Static Methods in Python. By working through this quiz, you’ll revisit the differences between these methods and how to use them effectively in your Python code.


March 25, 2025 12:00 PM UTC

Quiz: Dictionaries in Python

In this quiz, you’ll test your understanding of Dictionaries in Python.

By working through this quiz, you’ll revisit how to create dictionaries using literals and the dict() constructor, how to use Python’s operators and built-in functions to manipulate them, and how they’re implemented as a hash map for fast key lookups.


March 25, 2025 12:00 PM UTC


Hugo van Kemenade

Free-threaded Python on GitHub Actions

GitHub Actions now supports experimental free-threaded CPython!

There are three ways to add it to your test matrix:

actions/setup-python: t suffix

Using actions/setup-python, you can add the t suffix for Python versions 3.13 and higher: 3.13t and 3.14t.

This is my preferred method: we can clearly see which versions are free-threaded, and it’s straightforward to test both regular and free-threaded builds.

on: [push, pull_request, workflow_dispatch]

jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        python-version: [
          "3.13",
          "3.13t", # add this!
          "3.14",
          "3.14t", # add this!
        ]
        os: ["windows-latest", "macos-latest", "ubuntu-latest"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          allow-prereleases: true # needed for 3.14

      - run: |
          python --version --version
          python -c "import sys; print('sys._is_gil_enabled:', sys._is_gil_enabled())"
          python -c "import sysconfig; print('Py_GIL_DISABLED:', sysconfig.get_config_var('Py_GIL_DISABLED'))"

Regular builds will output something like:

Python 3.14.0a6 (main, Mar 17 2025, 02:44:29) [GCC 13.3.0]
sys._is_gil_enabled: True
Py_GIL_DISABLED: 0

And free-threaded builds will output something like:

Python 3.14.0a6 experimental free-threading build (main, Mar 17 2025, 02:44:30) [GCC 13.3.0]
sys._is_gil_enabled: False
Py_GIL_DISABLED: 1

For example: hugovk/test/actions/runs/14057185035
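
The two checks from the run step can also be folded into a small Python helper, for instance to log the build type at the start of a test session. This is only a sketch: gil_status is a hypothetical name, and the fallback for interpreters older than 3.13 (where sys._is_gil_enabled() doesn't exist) is an assumption on my part.

```python
import sys
import sysconfig


def gil_status() -> dict[str, bool]:
    """Report whether this is a free-threaded build and whether the GIL is on."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() only exists on 3.13+; assume the GIL is on if absent.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    print(gil_status())
```

Note that a free-threaded build can still report the GIL as enabled, for example after importing an extension module that doesn't support running without it.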

astral-sh/setup-uv: t suffix

Similarly, you can install uv with astral-sh/setup-uv and use that to set up free-threaded Python using the t suffix.

on: [push, pull_request, workflow_dispatch]

jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        python-version: [
          "3.13",
          "3.13t", # add this!
          "3.14",
          "3.14t", # add this!
        ]
        os: ["windows-latest", "macos-latest", "ubuntu-latest"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: astral-sh/setup-uv@v5 # change this!
        with:
          python-version: ${{ matrix.python-version }}
          enable-cache: false # only needed for this example with no dependencies

      - run: |
          python --version --version
          python -c "import sys; print('sys._is_gil_enabled:', sys._is_gil_enabled())"
          python -c "import sysconfig; print('Py_GIL_DISABLED:', sysconfig.get_config_var('Py_GIL_DISABLED'))"

For example: hugovk/test/actions/runs/13967959519

actions/setup-python: freethreaded variable

Back to actions/setup-python, you can also set the freethreaded variable for 3.13 and higher.

on: [push, pull_request, workflow_dispatch]

jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.13", "3.14"]
        os: ["windows-latest", "macos-latest", "ubuntu-latest"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          allow-prereleases: true # needed for 3.14
          freethreaded: true # add this!

      - run: |
          python --version --version
          python -c "import sys; print('sys._is_gil_enabled:', sys._is_gil_enabled())"
          python -c "import sysconfig; print('Py_GIL_DISABLED:', sysconfig.get_config_var('Py_GIL_DISABLED'))"

For example: hugovk/test/actions/runs/39359291708

PYTHON_GIL=0

And you may want to set PYTHON_GIL=0 to force Python to keep the GIL disabled, even after importing a module that doesn’t support running without it.

See Running Python with the GIL Disabled for more info.

With the t suffix:

- name: Set PYTHON_GIL
  if: endsWith(matrix.python-version, 't')
  run: |
    echo "PYTHON_GIL=0" >> "$GITHUB_ENV"

With the freethreaded variable:

- name: Set PYTHON_GIL
  if: "${{ matrix.freethreaded }}"
  run: |
    echo "PYTHON_GIL=0" >> "$GITHUB_ENV"

Please test!

For free-threaded Python to succeed and become the default, it’s essential there is ecosystem and community support. Library maintainers: please test it and where needed, adapt your code, and publish free-threaded wheels so others can test their code that depends on yours. Everyone else: please test your code too!

See also


Header photo: “Spinning Room, Winding Bobbins with Woolen Yarn for Weaving, Philadelphia, PA” by Library Company of Philadelphia, with no known copyright restrictions.

March 25, 2025 10:25 AM UTC


Seth Michael Larson

Don't bring slop to a slop fight

Whenever I talk about generative AI slop being sent into every conceivable communication platform, I see a common suggestion on how to stop the slop from reaching human eyes:

“Just use AI to detect the AI”

We're already seeing companies offer this arrangement as a service. Just a few days ago Cloudflare announced they would use generative AI to create an infinite "labyrinth" for trapping AI crawlers in pages of content and links.

This suggestion is flawed because doing so props up the real problem: generative AI is heavily subsidized. In reality, generative AI is so expensive that we're talking about restarting nuclear and coal power plants and reopening copper mines, people. There is no universe in which this service should allow users to run queries without even a credit card on file.

Today this subsidization is mostly done by venture capitalists who want to see the technology integrated into as many verticals as possible. The same strategy was used for Uber and WeWork, where venture capital allowed those companies to undercut the competition, gain wider adoption, and put competitors out of business.

So using AI to detect and filter AI content just means that there'll be even more generative AI in use, not less. This isn't the signal we want to send to the venture capitalists who are deciding whether to offer these companies more investment money. We want that "monthly active user" (MAU) graph to be flattening or decreasing.

We got a sneak peek at the real price of generative AI from OpenAI where a future top-tier model (as of March 5th, 2025) is supposedly going to be $20,000 USD per month.

That sounds more like it. The sooner we get to unsubsidized generative AI pricing, the better off we'll all be, including the planet. So let's hold out for that future and think asymmetrically, not symmetrically, about methods to make generative AI slop not viable until we get there.

March 25, 2025 12:00 AM UTC

March 24, 2025


Real Python

Python Code Quality: Best Practices and Tools

Producing high-quality Python code involves using appropriate tools and consistently applying best practices. High-quality code is functional, readable, maintainable, efficient, and secure. It adheres to established standards and has excellent documentation.

You can achieve these qualities by following best practices such as descriptive naming, consistent coding style, modular design, and robust error handling. To help you with all this, you can use tools such as linters, formatters, and profilers.

By the end of this tutorial, you’ll understand that:

  • Checking the quality of Python code involves using tools like linters and static type checkers to ensure adherence to coding standards and detect potential errors.
  • Writing quality code in Python requires following best practices, such as clear naming conventions, modular design, and comprehensive testing.
  • Good Python code is characterized by readability, maintainability, efficiency, and adherence to standards like PEP 8.
  • Making Python code look good involves using formatters to ensure consistent styling and readability, aligning with established coding styles.
  • Making Python code readable means using descriptive names for variables, functions, classes, modules, and packages.

Read on to learn more about the strategies, tools, and best practices that will help you write high-quality Python code.

Get Your Code: Click here to download the free sample code that you’ll use to learn about Python code quality best practices and tools.

Take the Quiz: Test your knowledge with our interactive “Python Code Quality: Best Practices and Tools” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

Python Code Quality: Best Practices and Tools

In this quiz, you'll test your understanding of Python code quality, tools, and best practices. By working through this quiz, you'll revisit the importance of producing high-quality Python code that's functional, readable, maintainable, efficient, and secure.

Defining Code Quality

Of course you want quality code. Who wouldn’t? But what is code quality? It turns out that the term can mean different things to different people.

One way to approach code quality is to look at the two ends of the quality spectrum:

  • Low-quality code: It has the minimal required characteristics to be functional.
  • High-quality code: It has all the necessary characteristics that make it work reliably, efficiently, and effectively, while also being straightforward to maintain.

In the following sections, you’ll learn about these two quality classifications and their defining characteristics in more detail.

Low-Quality Code

Low-quality code typically has only the minimal required characteristics to be functional. It may not be elegant, efficient, or easy to maintain, but at the very least, it meets the following basic criteria:

  • It does what it’s supposed to do. If the code doesn’t meet its requirements, then it isn’t quality code. You build software to perform a task. If it fails to do so, then it can’t be considered quality code.
  • It doesn’t contain critical errors. If the code has issues and errors or causes you problems, then you probably wouldn’t call it quality code. If it’s too low-quality and becomes unusable, then it falls below even basic quality standards and you may stop using it altogether.

While simplistic, these two characteristics are generally accepted as the baseline of functional but low-quality code. Low-quality code may work, but it often lacks readability, maintainability, and efficiency, making it difficult to scale or improve.

High-Quality Code

Now, here’s an extended list of the key characteristics that define high-quality code:

  • Functionality: Works as expected and fulfills its intended purpose.
  • Readability: Is easy for humans to understand.
  • Documentation: Clearly explains its purpose and usage.
  • Standards Compliance: Adheres to conventions and guidelines, such as PEP 8.
  • Reusability: Can be used in different contexts without modification.
  • Maintainability: Allows for modifications and extensions without introducing bugs.
  • Robustness: Handles errors and unexpected inputs effectively.
  • Testability: Can be easily verified for correctness.
  • Efficiency: Optimizes time and resource usage.
  • Scalability: Handles increased data loads or complexity without degradation.
  • Security: Protects against vulnerabilities and malicious inputs.

In short, high-quality code is functional, readable, maintainable, and robust. It follows best practices, including clear naming, consistent coding style, modular design, proper error handling, and adherence to coding standards. It’s also well-documented and easy to test and scale. Finally, high-quality code is efficient and secure, ensuring reliability and safe use.

All the characteristics above allow developers to understand, modify, and extend a Python codebase with minimal effort.
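
As a small illustration of the two ends of the spectrum, the sketch below contrasts a function that is merely functional with one that adds descriptive naming, documentation, and error handling. Both functions are hypothetical examples written for this comparison and aren't taken from the tutorial's sample code.

```python
def f(l):
    # Low-quality: works for the happy path, but the names say nothing
    # and an empty list raises ZeroDivisionError.
    return sum(l) / len(l)


def average(values: list[float]) -> float:
    """Return the arithmetic mean of values.

    Raises ValueError if values is empty.
    """
    if not values:
        raise ValueError("average() requires at least one value")
    return sum(values) / len(values)


if __name__ == "__main__":
    print(average([1, 2, 3]))
```

Both versions compute the same result for valid input. The second one is easier to read, documents its contract, and fails with a clear message on bad input.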

The Importance of Code Quality

To understand why code quality matters, you’ll revisit the characteristics of high-quality code from the previous section and examine their impact:

  • Functional code: Ensures correct behavior and expected outcomes.
  • Readable code: Makes understanding and maintaining code easier.
  • Documented code: Clarifies the correct and recommended way for others to use it.
  • Compliant code: Promotes consistency and allows collaboration.
  • Reusable code: Saves time by allowing code reuse.
  • Maintainable code: Supports updates, improvements, and extensions with ease.
  • Robust code: Minimizes crashes and produces fewer edge-case issues.
  • Testable code: Simplifies verification of correctness through code testing.
  • Efficient code: Runs faster and conserves system resources.
  • Scalable code: Supports growing projects and increasing data loads.
  • Secure code: Provides safeguards against system loopholes and compromised inputs.

Code quality matters because high-quality code is easier to understand, modify, and extend over time. It leads to faster debugging, smoother feature development, reduced costs, and better user satisfaction while ensuring security and scalability.

Read the full article at https://realpython.com/python-code-quality/ »


March 24, 2025 02:00 PM UTC

Quiz: Python Code Quality: Best Practices and Tools

In this quiz, you’ll test your understanding of Python Code Quality: Tools & Best Practices.

By working through this quiz, you’ll revisit the importance of producing high-quality Python code that’s functional, readable, maintainable, efficient, and secure. You’ll also review how to use tools such as linters, formatters, and profilers to help achieve these qualities.


March 24, 2025 12:00 PM UTC