In this article, we build on our previous piece, “Determinism II: Prompt Templates and the Impact on Output”, to understand how to increase determinism and control the output in relation to the input.
If you are enjoying the series so far, do follow along on the Neuralgap website, where we explore further engineering challenges at the intersection of generative AI and big data.
Determinism in LLMs is crucial for applications where precision and consistency are paramount. Consider a use case like Neuralgap’s Forager, our focused tool for intelligent data analytics. Here the model’s output needs to be highly reliable and consistent: a single inaccurate output can cause an entire sequence of data-mining or analytics operations to fail.
However, it's important to recognize that the need for determinism varies greatly by use case. In contrast to precision-critical scenarios like data analytics, there are domains where the inherent variability of LLMs is beneficial. Creative writing and email composition are prime examples: the element of creativity and the ability to generate varied responses add value. In creative writing, for instance, an LLM's ability to produce diverse ideas, styles, and narrative structures enriches the creative process.
This dichotomy highlights the need for tailored approaches in the development and deployment of LLMs, where the level of determinism is adjusted based on the specific requirements of the application.
The prompt is the primary external input that determines the quality of the model's output, and there is already a great deal of literature on prompt design. Rather than repeating it, we group the common techniques into a few high-level categories that are similar in nature, to give you an idea of how different LLMs can be steered. The list below is highly generalized, but it captures a few of the methods in common use.
Guided Prompting: Essentially chained prompting, where complex tasks are broken down into sequential sub-tasks, and Least-to-Most Prompting, which progressively addresses parts of a problem to build up to the solution. It is about leading a model through a task in a structured manner. For example, asking an LLM to first outline an essay and then expand each point into a paragraph.
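As a minimal sketch of such a chain (assuming the OpenAI Python client and an illustrative model name; any chat-completion API works the same way), the output of the outlining step simply becomes the input of the expansion step:

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

def ask(prompt: str) -> str:
    # One chat-completion call; temperature=0 keeps the chain as repeatable as possible.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name; substitute your own
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Step 1: ask only for the outline.
outline = ask("Outline a five-point essay on the impact of LLMs on data analytics. "
              "Return the points as a numbered list, one per line.")

# Step 2: expand each outline point in its own, narrowly scoped call.
paragraphs = [ask(f"Expand the following essay point into a single paragraph:\n{point}")
              for point in outline.splitlines() if point.strip()]

print("\n\n".join(paragraphs))
```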
Iterative and Adaptive Prompting: This is about refining prompts and strategies based on the model’s responses. It adapts to the model's strengths and weaknesses to improve output iteratively. For example, continuously tweaking a prompt about generating a news article until the tone, style, and content align with editorial standards.
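A similarly hedged sketch of iterative refinement: the editorial check below is a deliberately crude placeholder, and in practice it might be a style rubric or a second model acting as a critic.

```python
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini", temperature=0,
        messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content

def meets_standards(draft: str) -> bool:
    # Deliberately crude placeholder check; a real pipeline might use a style
    # rubric, a battery of regexes, or a second LLM acting as a critic.
    return 150 <= len(draft.split()) <= 250 and "breaking news" not in draft.lower()

prompt = "Write a 200-word news brief on the launch of a new analytics platform."
draft = ask(prompt)

for _ in range(3):  # cap the number of refinement rounds
    if meets_standards(draft):
        break
    # Fold the observed shortcoming back into the prompt and try again.
    prompt += ("\nRevise: keep it between 150 and 250 words, use a neutral tone, "
               "and avoid sensational phrasing.")
    draft = ask(prompt)

print(draft)
```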
Role-Based and Contextual Prompting: This technique involves setting a context or persona for the LLM to follow, providing clarity on the expected output. It is used to give the model a clear identity or set of instructions that shape the response. For example, asking the model to respond as a historian explaining a significant event, ensuring the language and content are appropriate for that role.
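A brief sketch of role-based prompting, again assuming the OpenAI Python client: the persona lives in the system message, while the task itself goes in the user message.

```python
from openai import OpenAI

client = OpenAI()

# The system message fixes the persona; the user message carries the actual task.
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    temperature=0,
    messages=[
        {"role": "system",
         "content": ("You are a historian specialising in 20th-century economics. "
                     "Answer formally, anchor each claim to a decade, and avoid speculation.")},
        {"role": "user",
         "content": "Explain the significance of the Bretton Woods agreement."},
    ],
)
print(resp.choices[0].message.content)
```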
Strategic and Planning-Oriented Prompting: This category encapsulates Reasoning via Planning (RAP) and the Tree of Thoughts methods, where the model is prompted to plan or explore various reasoning paths to arrive at a solution. It’s about employing strategic thinking and decision-making in the model's responses. For example, instructing an LLM to devise a multi-step strategy for a marketing campaign, considering various customer touchpoints and expected outcomes.
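A heavily simplified sketch in the spirit of Tree of Thoughts (not the full published search algorithm): several candidate plans are sampled at a higher temperature, scored by the model itself, and only the best branch is expanded.

```python
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, temperature: float = 0.0) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini", temperature=temperature,
        messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content

task = "Devise a three-step marketing campaign for a B2B analytics product."

# Branch: sample diverse candidate plans (higher temperature = more variety).
candidates = [ask(f"{task}\nPropose one concise plan.", temperature=0.9) for _ in range(3)]

def score(plan: str) -> int:
    # Evaluate: ask the model to rate each branch on a fixed scale.
    reply = ask("Rate this marketing plan from 1 (weak) to 10 (strong). "
                f"Reply with the number only.\n\n{plan}")
    digits = "".join(ch for ch in reply if ch.isdigit())
    return int(digits) if digits else 0

# Select and expand only the highest-scoring branch.
best = max(candidates, key=score)
print(ask(f"Expand this plan into a detailed, step-by-step campaign:\n{best}"))
```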
| Zero-shot | Few-shot |
|---|---|
| Uses the LLM on a new task without any labeled examples, drawing on the probability distribution learned during training. | Includes a few labeled examples in the prompt to quickly adapt the model, significantly influencing its output distribution towards the desired task while maintaining some flexibility. |
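To make the distinction concrete (the sentiment task, labels, and reviews below are invented for illustration), the only structural difference is whether labeled examples are packed into the prompt:

```python
review = "The dashboard is fast, but the export feature keeps failing."

# Zero-shot: the task description alone; the model relies on its training distribution.
zero_shot_prompt = (
    "Classify the sentiment of the following product review as "
    "Positive, Negative, or Mixed.\n\n"
    f"Review: {review}\nSentiment:"
)

# Few-shot: a handful of labeled examples steer the output distribution
# towards the exact labels and format we want.
examples = [
    ("Setup took five minutes and everything just worked.", "Positive"),
    ("Support never replied and the app crashes daily.", "Negative"),
    ("Great charts, terrible documentation.", "Mixed"),
]
few_shot_prompt = (
    "Classify the sentiment of each product review as Positive, Negative, or Mixed.\n\n"
    + "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
    + f"\n\nReview: {review}\nSentiment:"
)
```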
Sequencing
Sequencing covers how the model remembers, recalls, and acts on the examples it is given. For the few-shot examples presented within prompts, order matters: changing the order or distribution of these examples can drastically alter LLM performance. The distribution of labels in few-shot examples should reflect the actual label distribution in the real world; interestingly, the correctness of the labels is less important than their distribution. LLMs also exhibit a recency bias, tending to repeat the last few examples. Optimal data sampling for few-shot learning therefore involves selecting diverse, randomly ordered examples related to the test case.
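A small sketch of that sampling step (the labeled pool and target distribution are invented for illustration): examples are drawn so their label mix roughly matches the expected real-world distribution, then shuffled so no single label sits at the recency-biased tail of the prompt.

```python
import random
from collections import defaultdict

# Hypothetical labeled pool and the label distribution we expect in production.
pool = [
    ("Revenue grew 12% quarter over quarter.", "finance"),
    ("Checkout latency spiked after the deploy.", "engineering"),
    ("Churn is concentrated in the SMB segment.", "finance"),
    ("The ETL job failed on malformed rows.", "engineering"),
    ("Upsell conversions doubled after the price change.", "finance"),
    ("Cache hit rate dropped below 60%.", "engineering"),
]
target_distribution = {"finance": 0.5, "engineering": 0.5}
k = 4  # number of few-shot examples to include in the prompt

# Group the pool by label, then sample per label in proportion to the target mix.
by_label = defaultdict(list)
for text, label in pool:
    by_label[label].append((text, label))

selected = []
for label, share in target_distribution.items():
    selected += random.sample(by_label[label], round(share * k))

# Shuffle so no label dominates the tail of the prompt (recency bias).
random.shuffle(selected)

few_shot_block = "\n".join(f"Text: {t}\nLabel: {l}" for t, l in selected)
print(few_shot_block)
```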
In our next article, we will look at how to optimize the model's internal parameters and build on the concepts here to better manage the attention of LLMs.
At Neuralgap, we deal daily with the challenges of implementing, running, and mining data for insight. Neuralgap is focused on enabling transformative AI-assisted data-analytics mining, so that insight generation can ramp up or down with the data-ingestion requirements of our clients.
Our flagship product, Forager, is an intelligent big-data analytics platform that democratizes the analysis of corporate big data, enabling users of any experience level to unearth actionable insights from large datasets. Equipped with an intelligent UI that takes cues from mind maps and decision trees, Forager facilitates a seamless interaction between the user and the machine, combining the advanced capabilities of modern LLMs with highly optimized mining modules. This allows not only the interpretation of complex data queries but also the anticipation of analytical needs, evolving iteratively with each user interaction.
If you are interested in seeing how you could use Neuralgap Forager, or even for a custom project related to very high-end AI and Analytics deployment, visit us at https://neuralgap.io/