Quick Tip: Controlling Windows with Python

    Stuart Langridge

    In this quick tip, excerpted from Useful Python, Stuart looks at ways to control the Windows OS with Python.

    The Windows Registry

    Windows is entirely controllable from code using the Win32 API, and Microsoft provides extensive documentation at Microsoft Docs for everything that Windows can programmatically do. All of this is accessible from Python as well, although it can seem a little impenetrable if we’re not already accustomed to the Win32 API’s particular way of working. Fortunately, there are various wrappers for these low-level APIs to make code easier to write for Python programmers.

    A simple example is interacting with the Windows Registry. Python includes the winreg module for this in its standard library on Windows, so no extra installation is required. As an example, let’s check where the Program Files folder actually lives:

    >>> import winreg
    >>> hive = winreg.ConnectRegistry(None, winreg.HKEY_LOCAL_MACHINE)
    >>> key = winreg.OpenKey(hive, r"SOFTWARE\Microsoft\Windows\CurrentVersion")
    >>> value, value_type = winreg.QueryValueEx(key, "ProgramFilesDir")
    >>> value
    'C:\\Program Files'
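
    The same lookup can be wrapped in a small function for use in a script. This is a minimal sketch rather than the book’s code: the function name program_files_dir is our own, and the platform check is an assumption added so that the module still imports on non-Windows systems, where winreg doesn’t exist.

```python
import sys


def program_files_dir():
    """Return the Program Files path from the registry, or None off Windows."""
    if sys.platform != "win32":
        # winreg only exists on Windows, so bail out early elsewhere.
        return None

    import winreg

    # Predefined hives such as HKEY_LOCAL_MACHINE can be passed straight to
    # OpenKey; using the key as a context manager closes it when we're done.
    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE,
        r"SOFTWARE\Microsoft\Windows\CurrentVersion",
    ) as key:
        value, _value_type = winreg.QueryValueEx(key, "ProgramFilesDir")
    return value


if __name__ == "__main__":
    print(program_files_dir())
```

    Opening the key in a with block means the handle is released even if the value lookup fails, which matters more in a long-running script than in a quick REPL session.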
    

    Raw Strings

    In the code above, we’re using “raw strings” to specify the key name:

    r"SOFTWARE\Microsoft\Windows\CurrentVersion"
    

    Strings passed to the Win32 API often include the backslash character (\), because Windows uses it in file paths and registry paths to divide one directory from the next.

    However, Python uses a backslash as an escape character to allow adding special, untypeable characters to a string. For example, the Python string "first line\nsecond line" is a string with a newline character in it, so that the text is spread over two lines. This would conflict with the Windows path character: a file path such as "C:\newdir\myfile.txt" would have the \n interpreted as a newline.

    Raw strings avert this: prefixing a Python string with r removes the special meaning of a backslash, so that r"C:\newdir\myfile.txt" is interpreted as intended. We can see that backslashes are treated specially in the value we get back for the folder location: the REPL displays it as 'C:\\Program Files', with the backslash doubled to remove its special meaning, but that’s only how Python displays the string (its repr). The actual value contains a single backslash, as print(value) would show. Python could equally have displayed it as r'C:\Program Files'.
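
    A short experiment makes the difference concrete. The variable names below are purely illustrative, and the path is the made-up one from the text above:

```python
# Three ways of writing (or mis-writing) a Windows path.
escaped = "C:\\newdir\\myfile.txt"   # backslashes doubled by hand
raw = r"C:\newdir\myfile.txt"        # raw string: backslashes left alone
broken = "C:\newdir"                 # \n silently becomes a newline

print(raw == escaped)     # True: both spell the same path
print("\n" in broken)     # True: the path now contains a newline
print(len(r"C:\newdir"), len(broken))  # 9 8: "\n" collapsed into one character
print(repr(raw))          # the doubled-backslash form the REPL shows
print(raw)                # the actual characters: C:\newdir\myfile.txt
```

    One caveat worth knowing: a raw string can’t end with a single backslash, so r"C:\" is a syntax error. For a path with a trailing backslash, the doubled form "C:\\" is needed.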

    The Windows API

    Reading the registry (and even more so, writing to it) is the source of a thousand hacks on web pages (many of which are old, shouldn’t be linked to, and use the ancient REGEDT32.EXE), but it’s better to actually use the API for this. (Raymond Chen has written many long sad stories about why we should use the API and not the registry.) How would we use the Win32 API from Python to work this out?

    The Win32 API is available to Python through the pywin32 package, which can be installed with python -m pip install pywin32. The documentation for the package is rather sparse, but the core idea is that most of the Windows Shell API (which is concerned with how the Windows OS is set up) is available in the win32com.shell package. To find the location of the Program Files folder, MSDN shows that we need the SHGetKnownFolderPath function, which takes a KNOWNFOLDERID constant and a flags value that we can set to 0. Shell constants are available to Python in win32com.shell.shellcon (for “shell constants”), which means that, after the import, finding the Program Files folder requires just one (admittedly complex) line:

    >>> from win32com.shell import shell, shellcon
    >>> shell.SHGetKnownFolderPath(shellcon.FOLDERID_ProgramFiles, 0)
    'C:\\Program Files'
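
    In a script, that call might be wrapped as follows. This is a sketch under stated assumptions: known_folder is a hypothetical helper name, and only FOLDERID_ProgramFiles comes from the example above, so any other constant name passed in would need checking against what shellcon actually exposes.

```python
def known_folder(folder_id_name="FOLDERID_ProgramFiles"):
    """Look up a known folder by the name of its shellcon constant.

    Returns None when pywin32 isn't available (for example, off Windows).
    """
    try:
        from win32com.shell import shell, shellcon
    except ImportError:
        # pywin32 isn't installed, or we're not on Windows.
        return None

    # Raises AttributeError if shellcon has no constant with this name.
    folder_id = getattr(shellcon, folder_id_name)
    # The second argument is the flags value; 0 asks for default behaviour.
    return shell.SHGetKnownFolderPath(folder_id, 0)


if __name__ == "__main__":
    print(known_folder())
```

    Importing inside the function keeps the rest of a program usable on machines without pywin32, at the cost of hiding the dependency from anyone skimming the top of the file.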
    

    Digging around in the depths of the Win32 API gives us access to anything we may want to access in Windows (including windows!), but as we’ve seen, it can be quite complicated to find out how to do what we need to, and then to translate that need into Python. Fortunately, there are wrapper libraries for many of the functions commonly used. One good example is PyGetWindow, which allows us to enumerate and control on-screen windows. (It claims to be cross-platform, but it actually only works on Windows. But that’s all we need here.)

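    We can install PyGetWindow with python -m pip install pygetwindow, and then find on-screen windows by title and manipulate them. Here we look up every window with “edge” in its title (the match isn’t case-sensitive, as the output shows):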

    >>> import pygetwindow as gw
    >>> allMSEdgeWindows = gw.getWindowsWithTitle("edge")
    >>> allMSEdgeWindows
    [Win32Window(hWnd=197414), Win32Window(hWnd=524986)]
    >>> allMSEdgeWindows[0].title
    'pywin32 · PyPI - Microsoft Edge'
    >>> allMSEdgeWindows[1].title
    'Welcome to Python.org - Microsoft Edge'
    

    Those windows can be controlled. A window object can be minimized and restored, resized, moved around the screen, and focused to bring it to the front:

    >>> pythonEdgeWindow = allMSEdgeWindows[1]
    >>> pythonEdgeWindow.minimize()
    >>> pythonEdgeWindow.restore()
    >>> pythonEdgeWindow.size
    Size(width=1050, height=708)
    >>> pythonEdgeWindow.topleft
    Point(x=218, y=5)
    >>> pythonEdgeWindow.resizeTo(800, 600)
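
    The same steps can be gathered into a couple of reusable functions. This is only a sketch: find_windows and place_window are hypothetical names of our own, and the default position and size are arbitrary. It relies on the restore and resizeTo calls shown above plus moveTo, which PyGetWindow provides for repositioning a window.

```python
def find_windows(title_fragment):
    """Return windows whose title contains title_fragment, or [] if
    PyGetWindow is missing or doesn't support this platform."""
    try:
        import pygetwindow as gw
    except (ImportError, NotImplementedError):
        return []
    return gw.getWindowsWithTitle(title_fragment)


def place_window(window, left=0, top=0, width=800, height=600):
    """Un-minimize a window, then give it a known size and position."""
    window.restore()                # no effect unless minimized or maximized
    window.resizeTo(width, height)  # set the size in pixels
    window.moveTo(left, top)        # put the top-left corner at (left, top)


if __name__ == "__main__":
    for window in find_windows("edge"):
        place_window(window)
        print(window.title, window.size, window.topleft)
```

    Because place_window only calls methods on whatever object it’s given, it can be exercised with a stand-in object on machines where no real windows are available.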
    

    It’s always worth looking on PyPI for wrapper modules that provide a more convenient API for whatever we’re trying to do with windows or with Windows. But if need be, we have access to the whole Win32 API from Python, and that will let us do anything we can think of.

    This article is excerpted from Useful Python, available on SitePoint Premium and from ebook retailers.