I’ll be honest, writing a literature review wasn’t what I expected.
Equipped with an understanding of both the purpose and the structure, I first needed to identify the specific academic foundations the Under Cloud was built upon (at this point, I knew nothing about Crossref, OpenLibrary, or Semantic Scholar).
So, I asked the AI (in the application I use to write the code) to list academic papers that mapped to the notable aspects of the Under Cloud.
In seconds, substantial portions of the literature review appeared, complete with claims, citations, and their corresponding references. I couldn’t help but feel it was cheating, albeit by accident.
But, then I remembered what the AI had access to:
- First, the entire code base, with its own extensive documentation (explanations of purpose attached to structures such as functions and so on);
- Second, a mix of working code and documentation in Markdown files, several of which were the product of in-depth conversations about ontological structures (domains, entities, semantic search, and so on), the findings from the market research, the product-market fit, customer personas (the list goes on);
- Third, the MCP server, allowing the AI to interrogate both the schema of the application and some of the data (restricted to the development environment); a rough sketch of what such a tool might look like follows this list.
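To make that last point concrete, here’s a minimal sketch of the kind of MCP tool I mean, written against the official Python MCP SDK. The database, table names, and tool names are hypothetical stand-ins, not the Under Cloud’s actual server.

```python
# Hypothetical sketch: an MCP server exposing a development schema to the AI.
import sqlite3
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("under-cloud-dev")

DEV_DB = "under_cloud_dev.sqlite"  # hypothetical development database


def _table_names(conn: sqlite3.Connection) -> list[str]:
    """Read the table names from the SQLite schema."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return [name for (name,) in rows]


@mcp.tool()
def list_tables() -> list[str]:
    """Return the table names in the development schema."""
    with sqlite3.connect(DEV_DB) as conn:
        return _table_names(conn)


@mcp.tool()
def sample_rows(table: str, limit: int = 5) -> list[dict]:
    """Return a few rows so the AI can inspect (development-only) data."""
    with sqlite3.connect(DEV_DB) as conn:
        if table not in _table_names(conn):
            raise ValueError(f"unknown table: {table}")
        conn.row_factory = sqlite3.Row
        rows = conn.execute(f"SELECT * FROM {table} LIMIT ?", (limit,)).fetchall()
    return [dict(row) for row in rows]


if __name__ == "__main__":
    mcp.run()
```

With tools like these registered, the AI can ask “what tables exist?” and “show me a few rows” instead of guessing at the data model.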
Let’s face it, this wasn’t a response plucked out of the magical beyond, because I’d written large parts of it!
Once I moved beyond the awkwardness of what had happened, I began to see notable gaps, such as the serendipitous adjacent search results.
The serendipitous what now? The Under Cloud is a knowledge graph, and the semantic search doesn’t always surface the exact asset I need, but it does sometimes find assets adjacent to it. Following such an adjacent asset, I would find it was linked to the asset in question, and the annotated link would provide the essential context I’d long since forgotten.
It turns out this is known as information encountering, a phrase coined by Dr Sanda Erdelez in 1999.
Also relevant is the term berrypicking, coined by Marcia J. Bates a decade earlier:
Bates (1989), in her “berrypicking” model of information seeking, argued that real search behaviour is rarely a single query followed by a single result set. Instead, the researcher’s understanding evolves with each result encountered: each source found modifies the next query, and the path to the target is a branching walk rather than a straight line.
Worth mentioning here is the entity analysis and how it supplements the semantic search.
Imagine searching “Elon Musk Peter Thiel” and finding 50 assets; the task then is to sift through them to find those that are relevant. A chore, no doubt, but there is an alternative.
Having extracted entities from the pertinent assets:
- First, choose the entity type “People” and then select the aforementioned men;
- Second, choose the entity relationship type “worked with” and search…
… instead of 50 assets, it’s possible you’d see fewer than 5 (depending on the entities and their corresponding relationships). As an aside, there are dozens of different types of entity relationship.
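To illustrate the idea (and only the idea: the data model and function below are hypothetical, not the Under Cloud’s actual implementation), here’s a minimal sketch of that entity-and-relationship filter.

```python
# Hypothetical sketch: narrowing assets by entity pair and relationship type.
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityLink:
    source: str        # entity name, e.g. "Elon Musk"
    target: str        # entity name, e.g. "Peter Thiel"
    relationship: str  # e.g. "worked with", "invested in"
    asset_id: str      # the asset the relationship was extracted from


def assets_linking(links: list[EntityLink], people: set[str], relationship: str) -> set[str]:
    """Return the assets in which the given people share the given relationship."""
    return {
        link.asset_id
        for link in links
        if link.relationship == relationship
        and {link.source, link.target} <= people
    }


# Example: reduce 50 keyword matches to the few assets where the two men
# are recorded as having worked together.
links = [
    EntityLink("Elon Musk", "Peter Thiel", "worked with", "asset-017"),
    EntityLink("Elon Musk", "Peter Thiel", "mentioned with", "asset-032"),
    EntityLink("Peter Thiel", "Elon Musk", "worked with", "asset-041"),
]
print(assets_linking(links, {"Elon Musk", "Peter Thiel"}, "worked with"))
# {'asset-017', 'asset-041'} (set order may vary)
```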
Returning to the literature review, the AI surfaced a matching source, and I then followed up to make sure it was relevant and accurate.
By this stage, I’d made substantial changes, but I was also using the Under Cloud as an assistant, as intended, linking both source material (derived from the appropriate DOI resources) and my own notes to their associated claims.
Of importance here are the links between assets, each of which has a relationship type (17 at the last count). These types are crucial: “supports”, “references”, and “contradicts” are used by the Synthesis Document Framework when exporting the final knowledge product, while anything else is ignored.
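A minimal sketch of that filtering step, assuming hypothetical names rather than the framework’s real API: only links typed “supports”, “references”, or “contradicts” make it into the exported knowledge product.

```python
# Hypothetical sketch: keep only the link types the synthesis export uses.
from dataclasses import dataclass

EXPORTABLE_TYPES = {"supports", "references", "contradicts"}


@dataclass(frozen=True)
class AssetLink:
    claim_id: str    # the claim in the draft
    asset_id: str    # the linked source or note
    link_type: str   # one of the relationship types
    annotation: str  # the context recorded on the link


def links_for_export(links: list[AssetLink]) -> list[AssetLink]:
    """Return only the links the synthesis export cares about."""
    return [link for link in links if link.link_type in EXPORTABLE_TYPES]


links = [
    AssetLink("claim-03", "asset-source-12", "supports", "Erdelez on information encountering"),
    AssetLink("claim-03", "asset-note-07", "relates to", "early brainstorm on serendipity"),
    AssetLink("claim-08", "asset-source-19", "contradicts", "counter-evidence to discuss"),
]
print([f"{link.claim_id} -> {link.asset_id} ({link.link_type})" for link in links_for_export(links)])
# ['claim-03 -> asset-source-12 (supports)', 'claim-08 -> asset-source-19 (contradicts)']
```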
Perhaps this arrangement seems complicated; it makes much more sense when seen in action. But I digress…
With the able assistance of the AI, I’d written a literature review!
I was at this point entertaining the idea of publishing to an official journal … until the AI pointed out it would need a search methodology section, and that changed everything…
