The public sector could be better at managing knowledge ‘data’: what can we do?

economics
productivity
public sector
Author

Arthur Turrell

Published

February 8, 2023

Who thinks the public sector is good enough at managing its stock of knowledge: the ideas, strategies, processes, and decisions that go into the efficient provision of public goods and services? Not many, I’d wager. Which is odd, given the reputation for bureaucracy! In this post, I look at how good knowledge management could make public sector organisations more efficient and how that change might be effected–at least in the case of knowledge that is digitally recorded (aka knowledge data).

There are lots of reasons for poor knowledge management. The churn rate of staff in the public sector is high. This is anecdotal, but it seems like the typical public sector job in the UK turns over every 18–24 months or so (tenures may be longer outside central government), with each departing holder taking a huge amount of knowledge about how to do the job with them. And what knowledge has been built up in that time is very rarely transmitted by a handover note. There are also frequent enough re-organisations that roles change, meaning a new role may be a combination of several previous ones–one that, ideally, a newly hired member of staff would learn about by drawing on the experiences of multiple former staff.

Even if everyone writes down every little piece of information about their role, it can be hard for subsequent staff doing similar jobs to actually find that information. Some public sector organisations have no way of storing the stock of knowledge as data–they work entirely on flow, with emails carrying files. Others use a shared file system (aka a network drive) to store documents and, usually, it’s hard to properly search these for relevant documents–anyone who has used Windows’ file search function over a network will know exactly what I mean. If you’re lucky enough to have a solution in place, that solution may be very limited too: a few organisations use Microsoft’s SharePoint, but the filtering and search options are byzantine.

Perhaps most worryingly of all, there just isn’t always the bandwidth or culture behind good record management. The public sector organisations of many countries have been under pressure to do more with less for a long time, and it’s very easy for “flow” to crowd out “stock”: that is, keeping good records and managing the stock of knowledge suffers because everyone is fighting the latest crisis or otherwise putting out 1001 small fires. Culturally, meetings–which are by their very nature ephemeral–are the primary unit of decision-making, idea discussion, and strategy-making.

This is not to say that public sector organisations do not come up with lengthy strategies–there are numerous examples of those. But they tend to be outward looking and paint on a wide canvas. It’s the smaller, internal workings and ideas that don’t get recorded sufficiently to be later searchable and indeed (re-)usable.

Why failing to manage the stock of knowledge data makes the public sector less effective

There’s always a balance, but failing to manage the stock of knowledge data well is likely to lead to organisations being far less effective. To steal a phrase from history, those who fail to learn from the past are doomed to repeat its mistakes.

Many of the problems facing public sector organisations are structural and not easily remedied. Someone arriving in a new role might wonder why such and such a thing has not been tried, and plunge into trying out solutions. But, because of the stubbornness of the problem and the lack of record-keeping, it’s extremely likely that similar solutions have been tried before–so, at best, the new person isn’t able to build on where their predecessors got to and, at worst, what they are pursuing is a complete waste of time.

It’s not just about steering away from what’s been tried (and has failed) before though; by having an easily searchable record of what was thought, reasoned, and decided, the possibility that someone can come along and synthesise a better solution is greatly raised. And even for the times when everyone has agreed a way forward, a new hire who can easily see what has gone before is going to be more effective more quickly. No-one should be having to start from scratch.

You might think that it would be unusual for someone to start from scratch. It isn’t. There’s a great story I’ve heard about a Civil Servant who had spent many years in a single large and important department. Every time there was a sudden desire for a policy that did this or that, instead of working up something new, he would simply walk over to his filing cabinet, flick through to the right section, and pull out all of the documents detailing the last time the policy had been discussed or even tried. We don’t use filing cabinets anymore, but we do still benefit when we can avoid repeating effort, so we must create digital filing cabinets from which we can pull out ideas whose time is right.

I also believe that simply knowing that every word you put down in a note is going to be searchable and available for posterity will encourage clarity of thought too. Any writing that’s going to a wider audience forces you to think more about how it will be read, and what to make clear. It may make you question whether what you’re doing is even the right priority.

There’s another nice efficiency that can be had from seeing the sweep of ideas that, say, an individual has laid out before: lancing BS. I’m sorry to say it, but, in large organisations, you do sometimes come across those who talk a lot of gibberish-filled nonsense while also delivering very little and, worst of all, wasting everyone else’s time. The vast majority of people are not like this, but there are some. They obfuscate and complicate, slowing down delivery. You have to work so hard to understand them, most sane people give up and move on–which is also how these particular individuals get away with it. Because we’re all talking in jargon a lot of the time, these tactics can be hard to spot straight away–but being able to search through the recorded content that a person has created is a quick way to find out if they are simply a convincing waffle-generator or someone trying to make progress happen.

Finally, there are often times when the wider public need to peek into the internal workings of the state, to understand how a particular decision was taken (or not taken), who was involved, and whether the risks were known. The UK’s Covid-19 Inquiry is a good example. Decent record-keeping can be a huge boon for such public scrutiny when it happens–everyone will get clearer answers faster. (Of course, in the UK, constructive ambiguity has long been used to help fudge a way to a good outcome, but I have no doubt that this can continue whether good records are kept and searchable or not.) The public will rightly expect that good records are being kept.

What can we do about it?

Changing culture is always about leadership and setting norms–that one is obvious. And creating the bandwidth for record-keeping may also be partly cultural. We probably all need to argue that the benefits outweigh the costs too (if we agree that they do). You could write entire blog posts about these two issues.

But I’m going to focus on the challenges around knowledge data that can be solved with technology and infrastructure.

So, here’s what I think might help.

Make sure the flow of information is good

“Garbage in, garbage out,” goes the saying. If we’re not putting the right records in, we’re unlikely to benefit from them when we look at them again later.

Overall, I’m less worried about this because (as noted right at the start), large organisations do tend to be bureaucratic and are in general good at creating boards, taking minutes, and having all of the other accoutrements that come with a good secretariat. But it’s important that this infrastructure does exist everywhere that a large group (say a board) are coming together to hear information and make decisions. Papers presented should be informative and have clear recommendations, and of course any decisions and salient comments should be recorded.

Clear records can also help show which meetings are effective and which are not worth the time they take.

“Notes” should be the default, with (Microsoft PowerPoint) presentations only by exception

Notes–usually documents of up to 6 pages that can have figures in but are written in full sentences–should be the default way to capture ideas, strategies, processes, and decisions.

At the risk of over-generalising, presentations tend to hide woolly thinking. Writing notes in sentences and paragraphs (bullets allowed) forces more clarity. Of course one can obfuscate in prose too, but it’s harder to hide that obfuscation in full sentences than it is in a shiny presentation (or, let’s be honest, a bad presentation, which most of them are). I know of one large, important public sector organisation in particular that is absolutely addicted to Microsoft PowerPoint slides–but fills them to the brim with text.

Sometimes elected policymakers will prefer a snazzy slide deck. That’s fine; if you’re in the public sector, you’re there to serve elected officials. But there’s a whole ton of work that goes on that doesn’t go to elected representatives that could be better articulated as a note. And, even if you’re ultimately going to put a slide deck together, I bet that it’s a lot better for having been born of a note first.

Amazon has done some interesting thinking on this that the public sector (and any large organisation) could potentially learn from. Amazon banned slide decks altogether–Jeff Bezos described it as the smartest thing they ever did at Amazon. This goes too far, for a few reasons, but it shows just how seriously they take the time-wasting PowerPoint problem. They replaced slide decks with “memos”, essentially what I’ve called “notes”.

Perhaps the most interesting addition that the Amazon model brings to what I’ve already suggested here is that meetings begin with a quiet period of around 30 minutes in which participants read the memo before engaging in discussion. I think this is a terrific idea, with big potential benefits. If you’re a senior leader in the public sector then, well, your diary is going to be full. Somehow, you’re expected to squeeze in reading important documents around a bulging-at-the-seams schedule–on most days, it’s going to be gone 11pm before you get to it. I can’t tell you how many meetings I’ve been to where many people, including myself, simply haven’t had time to read the paper, note, or memo ahead of time. That can lead to a poorer discussion, and poorer outcomes. Creating time within meetings to read memos ensures–quite literally–that everyone is on the same page, which will likely lead to better outcomes.

Another feature of Amazon-like memo models is that the participants in the decision that was reached co-sign the document and record why a particular decision was reached. This is really important for accountability and moving forward with clear agreement. (Verbal agreements are not as binding as you might hope, especially if people haven’t actually read the paper–encouraging people to put their name to a decision gives them more skin in the game and gives them incentives to ensure that it is a good decision.)

Knowledge data should be findable, accessible, interoperable, and re-usable

An organisation’s stock of knowledge should follow the FAIR principles: findable, accessible, interoperable, and re-usable. We’ll look at each of these and see how they suggest a database of Markdown documents as a likely back-end solution (a solution that happens to be free).

FAIR requirements

All of those benefits that come with standing on the shoulders of giants will only be available if knowledge is findable. In practice, this means ensuring staff have powerful search capabilities on hand. SharePoint, with its very limited search capabilities (and its lack of full support for file types not covered by Microsoft Office products), will not do. Staff must be able to look for notes or documents within a certain date range. They need to be able to find all documents with specific words or phrases in them. They should be able to browse documents using (preferably automatically generated) tags.
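To make this concrete, here is a toy sketch in Python of the three kinds of filter described above running over a couple of Markdown-style notes. The front-matter field names (`date`, `tags`) and the note contents are inventions for illustration, not a standard this post prescribes:

```python
# Toy "findable" sketch: filter notes by phrase, date range, and tag.
import re
from datetime import date

# In-memory stand-ins for .md files with simple YAML-style front matter.
NOTES = {
    "budget-2022.md": "---\ndate: 2022-11-03\ntags: [finance, planning]\n---\nOptions for the budget round.",
    "hiring-note.md": "---\ndate: 2023-01-19\ntags: [people]\n---\nProposal for the hiring process.",
}

def parse_note(text):
    """Split naive front matter from the body; return (metadata, body)."""
    header, body = re.match(r"---\n(.*?)\n---\n(.*)", text, re.DOTALL).groups()
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body

def find(notes, phrase=None, after=None, tag=None):
    """Return filenames of notes matching every filter that was given."""
    hits = []
    for name, text in notes.items():
        meta, body = parse_note(text)
        if after and date.fromisoformat(meta["date"]) < after:
            continue
        if phrase and phrase.lower() not in body.lower():
            continue
        if tag and tag not in meta.get("tags", ""):
            continue
        hits.append(name)
    return hits

print(find(NOTES, phrase="hiring"))         # ['hiring-note.md']
print(find(NOTES, after=date(2023, 1, 1)))  # ['hiring-note.md']
print(find(NOTES, tag="finance"))           # ['budget-2022.md']
```

A real system would hand this job to a proper search index, but the point stands: when notes are plain text with a little structured metadata, all three filters are a few lines of code.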

Of course, these documents must be accessible, both today and in the future. This means that they should be in a database that can be accessed easily from their computer, and which can be queried in milliseconds. There should be backups and system redundancy.

Interoperability is more important than it first appears. One public sector organisation I know of had terrible trouble because many of its documents were written in a proprietary file format for a piece of software that has fallen out of favour. For new staff to use them, either the documents would have had to be converted (with potential information loss on the way), or the software contract renewed at considerable cost–even though no-one would be using it to write new documents! I put the Microsoft Office suite of file formats in this bucket as, to get the most from those types of files (eg .docx, .pptx), you really need to purchase Microsoft’s proprietary software. As an example of this lack of interoperability, there is no Microsoft Office support for Linux, the popular free and open source operating system. Interoperability prevents vendor lock-in too.

Re-usability is about people being able to dive into the historical archive, grab what they need, and put it to work on a contemporary project. This rules out anything that doesn’t allow for easy copy and paste. So PDFs and similar are out. PDFs do have their uses because they do not change once created, but we don’t want to tap into that here. Re-usability once again pushes us toward a solution that looks a lot like plain text files because–no matter what whizzy developments there are in the future–it’s extremely likely that people will still be able to copy and paste from plain text files.1 You might think this causes a problem for slide decks, in the cases where they are warranted: it doesn’t. You can write slide decks in plain text files too, using Markdown. To make images re-usable, you will want either to provide reproducible code to recreate a particular image or figure, or to include the image as a separate asset. And, if your stock of knowledge is just plain text files and some image files, moving your entire stock of knowledge to a new system is as easy as copying and pasting everything.

  • 1 Though you do need to be careful about something called encoding. The TL;DR is that all text should be encoded as UTF-8.
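The footnote’s point about encoding is easy to make concrete. A minimal Python sketch of a round-trip that never relies on the platform default encoding (the file name and note text are invented):

```python
# Write and read a note with an explicit encoding. Relying on the platform
# default is what bites you years later; always say utf-8 explicitly.
import tempfile
from pathlib import Path

note = "Café strategy – détails à suivre."  # non-ASCII must survive the round-trip
path = Path(tempfile.mkdtemp()) / "note.md"

path.write_text(note, encoding="utf-8")
assert path.read_text(encoding="utf-8") == note
```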

It would also be nice if the technology behind all of this was free and open source. Note that free software does not mean totally free–you always need someone to maintain the software and database. But you’d need that with proprietary software too, so there’s still potentially a big pecuniary cost saving here if the tech is free.

Solutions to the knowledge data management problem

Taking all of these needs together, the solution looks a lot like a searchable database of plain text files with a friendly front-end. The most obvious candidate file format is Markdown. Markdown is written in plain text, which will never go out of style. It supports the inclusion of tables (written in plain text) and images provided as separate files, which helps with re-usability. Plain text Markdown files can also be used to generate slide decks, so this approach has that output type covered too. And of course Markdown is completely free, there are plenty of free editors for it, and almost all of the tooling you might need around it is free too. A slight variant on Markdown, Quarto Markdown, can support executable code chunks too–but don’t worry, it’s still all written in plain text.
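For the avoidance of abstraction, here is what a minimal note might look like as a Markdown file; the title, front-matter fields, table contents, and image path are all invented for illustration:

```markdown
---
title: Options for renewing the records system
date: 2023-02-01
tags: [records, infrastructure]
---

## Recommendation

Adopt option B. The table below summarises the trade-offs.

| Option | Cost | Vendor lock-in |
|--------|------|----------------|
| A      | High | Yes            |
| B      | Low  | No             |

![System diagram](figures/system-diagram.png)
```

Everything, including the table, is plain text; the image lives alongside as a separate asset; and the YAML header carries the date and tags that make the note searchable later.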

Note that the solution doesn’t look like Google Docs or Microsoft Office. These mix images and text. Their formats may change over time. They are proprietary. It’s not easy to throw them into a really flexible database (at least not in their current forms). And there is vendor lock-in, as it’s not easy to move them to a new system.

Markdown seems great, then, but there are some challenges with it that we should be aware of. We’ll examine the major ones:

1. editing Markdown documents will be alien to many, as will the fact that what you put in doesn’t look like what you get out (ie it is not a WYSIWYG approach to text editing). Today, one of the best Markdown editors is Visual Studio Code, but it is going to be overwhelming for staff unfamiliar with coding because it does a lot more than just edit Markdown and is really geared to coders. There needs to be a very friendly way to edit Markdown for people unfamiliar with coding.
2. one of the most useful aspects of the Microsoft Office suite, particularly the PowerPoint and Word products relevant to this blog post, is that you can collaborate on the same document (including with tracked changes). Git is one option for sharing and collaboration–and coders would be fine with this, but git from the command line is going to be too complex for any staff not au fait with coding. Furthermore, when editing a document with colleagues, the ability to provide comments (not in the doc itself) is incredibly useful. There needs to be a way to collaboratively edit documents and track changes in Markdown, possibly in real time, and ideally with the ability to provide comments. Ideally this should come with a way to set granular permissions.
3. there needs to be a searchable database of the existing stock of Markdown files and, preferably, a way to launch complex queries on them.

Under 1., there are a variety of paid and free Markdown editors available. Perhaps I’m too pessimistic about people using Visual Studio Code to write Markdown. Ghostwriter is a cross-platform, free and open source alternative that is solely focused on Markdown, so it may be more user-friendly. Other free and open source options include Remarkable and Abricotine. There are a couple of more snazzy-looking paid options, including Typora and Obsidian (personal use is free, but commercial use is not). One of these solutions seems like it would roundly knock out 1.

It’s likely that 2. and 3. could be solved together with a subscription to a proprietary service. HackMD is a service that provides “real-time collaboration on markdown documents” and includes “granular note permission settings and private image storage”. It includes an editor (which would also help with 1.) and provides an ability to comment on docs too. It also supports tags that can be added via YAML header data–helping with long-term usability. It looks like a really good solution to 1. and 2., but it seems to do a bit less than would be ideal for 3.: there is a free text search, but it’s a “prime” feature, and it seems like the other filters might be limited. It also seems like all the notes live in the vendor’s cloud, which makes building a custom search solution, and running compilation to other derivative file types (eg Markdown to PDF, Quarto Markdown to slides), difficult–though there’s an ability to sync with GitHub. An alternative to HackMD is Obsidian, but it seems less feature-rich: it doesn’t have real-time collaboration (or so it seems) and it introduces non-standard syntax, which is a threat to interoperability. Overall, HackMD seems like the best solution to 2.

While a service like HackMD can help with 3., it’s interesting to ask what else is out there that would work on a big ol’ pile of Markdown (and image) data. Ideally, we would want flexible, fast, comprehensive search over a bunch of Markdown files and images. The free, open source version of ElasticSearch would be one potential solution–though note that the vendors are very keen for you to use their paid hosting service. An alternative that is also geared toward text files is Apache Lucene. Both of these would have to be hosted somewhere. One nice aspect of these purpose-built search tools is that they can store logs of what people searched for, itself of use to the organisation. The query functionality of tools like Apache Lucene and ElasticSearch looks to be pretty good too: for example, ElasticSearch supports a SQL-like API and other complex query types. The most important aspects of 3. would be addressed by both of these open source solutions.
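Under the hood, tools like Lucene and ElasticSearch are built around an inverted index: a map from each term to the set of documents containing it. A toy Python version (the filenames and note text are invented; real engines add ranking, stemming, persistence, and speed at scale):

```python
# Toy inverted index: the core data structure behind full-text search engines.
import re
from collections import defaultdict

docs = {
    "note-001.md": "Budget options for the coming fiscal year",
    "note-002.md": "Minutes of the board meeting on budget risks",
    "note-003.md": "Hiring process for the analysis team",
}

# Map each lower-cased term to the set of documents it appears in.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in re.findall(r"\w+", text.lower()):
        index[term].add(doc_id)

def search(*terms):
    """Documents containing every query term (a boolean AND query)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []

print(search("budget"))           # ['note-001.md', 'note-002.md']
print(search("budget", "board"))  # ['note-002.md']
```

Queries run against the index rather than re-scanning every document, which is why full-text search over a large stock of notes can come back in milliseconds.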

As an aside, advanced users could also pull out data more systematically from knowledge stored in a series of Markdown files. In that case, advanced users might pop everything into a tabular structure (eg a parquet file) and then query all rows with a high performance SQL query engine like DuckDB. Although working with text is always going to be tricky, DuckDB is astonishingly fast (check out polars too though). There is an extension for full text search for DuckDB.
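A sketch of that tabular workflow is below. It uses Python’s built-in sqlite3 purely as a stand-in for DuckDB so the example is self-contained; the shape of the approach (rows of path, date, and text, then SQL over them) is the same, and the paths and contents are invented:

```python
# "Pop notes into a table, then query with SQL" – sketched with the stdlib
# sqlite3 module standing in for DuckDB, so this runs with no dependencies.
import sqlite3

rows = [
    ("notes/budget.md", "2022-11-03", "Options for the budget round."),
    ("notes/hiring.md", "2023-01-19", "Proposal for the hiring process."),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE notes (path TEXT, note_date TEXT, body TEXT)")
con.executemany("INSERT INTO notes VALUES (?, ?, ?)", rows)

# Find recent notes mentioning hiring.
hits = con.execute(
    "SELECT path FROM notes WHERE body LIKE ? AND note_date >= ?",
    ("%hiring%", "2023-01-01"),
).fetchall()
print(hits)  # [('notes/hiring.md',)]
```

With DuckDB the table would typically live in a parquet file and the `LIKE` clause could be swapped for its full-text search extension, but the idea carries over directly.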

If you want to get really fancy, you could also run something that would find connections between documents, displaying them as a graph. This could be useful to find inconsistencies, or links, in the way that topics are being dealt with across the organisation.
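A crude version of that graph-building idea: score each pair of documents by the overlap of their word sets (Jaccard similarity) and keep an edge wherever the score clears a threshold. Real systems would use TF-IDF or embeddings, and the documents and threshold here are invented, but the shape is the same:

```python
# Build edges between documents whose word overlap clears a threshold.
import re

docs = {
    "a.md": "budget planning for the fiscal year",
    "b.md": "fiscal year budget risks and planning",
    "c.md": "onboarding checklist for new analysts",
}

def words(text):
    return set(re.findall(r"\w+", text.lower()))

edges = []
names = sorted(docs)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        wa, wb = words(docs[a]), words(docs[b])
        similarity = len(wa & wb) / len(wa | wb)  # Jaccard similarity
        if similarity > 0.3:
            edges.append((a, b, round(similarity, 2)))

print(edges)  # [('a.md', 'b.md', 0.5)]
```

The resulting edge list is exactly what a graph-visualisation tool needs to draw a map of which notes, and which parts of the organisation, are talking about the same things.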

Summary

Good management of knowledge data is important to the success and efficiency of public sector organisations. The ideal is that all of the ideas, strategies, processes, and decisions relevant to an organisation and generated by its staff are available to search and to re-use in perpetuity. Although there are doubtless pros and cons to every approach, using “notes” (and not slide decks) as the unit of account for an organisation’s recorded knowledge is a very strong option. And storing those notes in a cloud-hosted database of Markdown files (plus assets, like images, that are used by those Markdown files) will have benefits such as avoiding vendor lock-in, ensuring content is re-usable far into the future, and ensuring that knowledge is easily searchable.