Bulk editing posts for Jekyll

This website is generated by Jekyll, a static site generator. Static sites do not need complex backend engines, such as databases or software that rebuilds the site content for each visitor; they are just static files. This translates into a) free or very cheap hosting (static files cost little to serve), b) super-fast websites (no software grinding away in the background) and c) lower energy consumption. But it also means the maintainer does much of the work that is normally delegated to machines.

Jekyll is for geeks, whatever you may read elsewhere on the net: there are no buttons or check boxes here. Posts are just text files sitting on the disk, so managing the content can easily become a pain, especially as the number of articles grows. A typical problem is adding tags to, or removing them from, existing articles: as a site evolves, we introduce new tags and reorganise the old ones.

A solution for managing Jekyll posts comes in the form of a small Python library called frontmatter (published on PyPI as python-frontmatter). To install it, run:

pip install python-frontmatter

This library can read and edit the data fields in the post heading (the front matter, in YAML format) and read the post contents. That’s perfect for our problem of bulk editing tag fields: we just need to loop through the posts and add or remove tags. This is the script I came up with:

"""
Bulk edit function for tags in Jekyll posts.

The script will attempt to edit and OVERWRITE ALL FILES
in the specified folder:
you should place selected files in a temporary folder.
"""

import frontmatter
from os import listdir
from os.path import isfile, join

# -------  INPUT PARAMETERS --------------

# TEMPORARY folder with selected files
# (do not specify the main _posts folder!)
folder = '__path__to__my__temporary__folder__'

add_tag = 'qgis'  # tag to be inserted (leave empty to skip)

remove_tag = ''  # tag to be removed (leave empty to skip)

# ----------- ENGINE --------------------

for f in listdir(folder):

    path = join(folder, f)

    # skip sub-folders and anything else that is not a regular file
    if not isfile(path):
        continue

    print(path)

    post = frontmatter.load(path)

    # print(post['title'])

    # current tags as a list; empty if the post has no tags yet
    tags = post.metadata.get('tags') or []
    if isinstance(tags, str):
        # Jekyll also accepts tags as a space-separated string
        tags = tags.split()
    else:
        tags = list(tags)

    if remove_tag and remove_tag in tags:
        tags.remove(remove_tag)

    if add_tag and add_tag not in tags:
        tags.append(add_tag)

    # do not create an empty tags field where there was none
    if tags or 'tags' in post.metadata:
        post.metadata['tags'] = tags

    # this is the output in text format
    out = frontmatter.dumps(post)

    # write back as UTF-8; the file is closed even if the write fails
    # https://stackoverflow.com/questions/934160/write-to-utf-8-file-in-python/934203#934203
    with open(path, 'w', encoding='utf-8') as o:
        o.write(out)

IMPORTANT: the script will attempt to edit every file in the folder; it applies no filter of any kind. Never run it on the main folder containing the original posts. Copy the selected posts to a temporary folder, edit them, check the result, and then copy them back to the main _posts folder (or wherever your posts are kept).
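The copying step can itself be scripted. What follows is only a rough sketch using the Python standard library, not part of the script above: the helper name stage_posts, the glob pattern and the demo file names are all made up for illustration, and it assumes your posts can be selected by file name.

```python
import shutil
import tempfile
from pathlib import Path


def stage_posts(posts_dir, pattern="*.md", staging_dir=None):
    """Copy posts matching `pattern` into a staging folder and return its path.

    Hypothetical helper: the originals are only read, never modified.
    """
    if staging_dir:
        staging = Path(staging_dir)
    else:
        staging = Path(tempfile.mkdtemp(prefix="jekyll_bulk_"))
    staging.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(posts_dir).glob(pattern)):
        if src.is_file():
            shutil.copy2(src, staging / src.name)
    return staging


# Demo on throwaway files, so that no real post is touched
demo_posts = Path(tempfile.mkdtemp()) / "_posts"
demo_posts.mkdir()
(demo_posts / "2020-01-01-qgis-intro.md").write_text(
    "---\ntitle: QGIS intro\n---\nHello\n", encoding="utf-8")
(demo_posts / "2020-02-01-other.md").write_text(
    "---\ntitle: Other\n---\nHi\n", encoding="utf-8")

staged = stage_posts(demo_posts, pattern="*qgis*.md")
print(sorted(p.name for p in staged.iterdir()))
```

The returned folder is what you would then put in the folder variable of the bulk edit script. Copying the checked files back is deliberately left as a manual step.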

Note that data fields are exposed as keys in a dictionary, which is handy: print(post['title']), for example, prints the title. The script can be extended to other data fields (date, author name, title, etc.).
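As a rough illustration of such an extension, here is a small sketch that works directly on the metadata dictionary. The function name update_fields and the sample values are invented for this example; a plain dict stands in for post.metadata so that the snippet runs without any posts on disk.

```python
def update_fields(metadata, set_fields=None, drop_fields=()):
    """Set and drop front matter fields in place.

    `metadata` is a plain dict, such as post.metadata in the script above.
    Returns True if anything was changed, so unchanged posts can be skipped.
    """
    changed = False
    for key, value in (set_fields or {}).items():
        if key not in metadata or metadata[key] != value:
            metadata[key] = value
            changed = True
    for key in drop_fields:
        if key in metadata:
            del metadata[key]
            changed = True
    return changed


# Stand-in for post.metadata of a loaded post (made-up values)
metadata = {"title": "My first map", "tags": ["qgis"], "layout": "old"}
changed = update_fields(
    metadata,
    set_fields={"author": "Jane Doe"},
    drop_fields=["layout"],
)
print(changed, metadata)
```

Inside the loop of the script, the call would be update_fields(post.metadata, ...), and the file would only need rewriting when the function returns True.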