Implicit vs Explicit Knowledge models

Sangeetha Venkatesan
LMAO Learning Group
8 min readJan 25, 2023


Moving from data augmentation methods to the broader scope of knowledge augmentation

The technical term for this: Retrieval Augmented Language Models

Apart from the problem of fitting a model with the data, there is a problem with fitting the model with the use case!


There is an area of question answering called “long-form question answering” that involves both retrieval and generation. Though there are open questions about how much knowledge the text-generation side has actually condensed, there is understandable hesitance to productionize these models without a proof of concept for controlled monitoring and evaluation. Even as “instruct and command” models emerge, dialogue interaction remains the heart of NLP: a companion that interacts within the client persona.

There is always a business aspect that must be thought through well before the one-yard line: being comfortable with resurfacing the whole pipeline as and when needed, and with admitting that our answers to certain questions might be wrong!

In one podcast, Noam Shazeer, founder of character.ai, mentioned the flexibility startups have to “get something out there, let people use it however they want.” There is always a quality of interaction and feedback that people expect from NLU systems.

Domain expertise is getting denser in every sector, and companies want the model to be as close as possible to that domain knowledge, and in particular to the client's database knowledge.

Consider the utterance “Wire transfer with our organization is easy and the fastest way,” generated by a text-generation model fine-tuned on a client's FAQ. A ground-truth confidence needs to be established here, because the relevant “implicit knowledge” is that wire transfers are expensive and cumbersome.

The ground truth for text generation differs across clients and their sensibilities about content writing. There are different metrics to evaluate both the retrieval (precision and recall) and the generation quality (the ROUGE score, which is the recall-oriented variant of the BLEU score, Bilingual Evaluation Understudy). Good human reference text is always needed to evaluate the models from time to time, and even these metrics may not be a perfect fit for comparing model-generated answers with human-reference answers.
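To make the recall-oriented idea concrete, here is a minimal sketch of ROUGE-1 recall as plain unigram overlap. (This is illustrative only; real evaluations typically use a maintained package such as `rouge-score`, which adds stemming and ROUGE-L.)

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    """ROUGE-1 recall: fraction of reference unigrams also found in the candidate."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each reference token counts at most as often as it
    # appears in the candidate.
    overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0
```

For example, `rouge1_recall("the cat sat on the mat", "the cat is on the mat")` scores 5 of the 6 reference tokens as recalled. The "penalty of random n-gram overlap" mentioned below is visible here: unrelated answers that happen to share common words still earn credit.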

There is always some “ambiguity in acceptability,” even within the organization or among stakeholders, and these metrics carry a penalty of their own: random n-gram overlap can reward or punish answers arbitrarily.

Let's look at how external knowledge can be composed with large language models. Two of the most important components in conversational AI are handling conversation sessions (long-context memory and history) and sourcing external knowledge (a data store) without having it be learned as implicit knowledge.

Chain of conversation thoughts

I really like the idea of using a reasoning chain as a logging system for NLP applications, in addition to standard software-engineering loggers.

This also opens the possibility of maintaining a single conversational AI engine that serves multiple clients' data (without cross-pollination) and plugs in actions from different tools depending on the customer's context. This “unified conversational AI” approach has a long way to go, though, because we are striking a balance between generalization and specialization for each client.

The analogy of brain regions and models is a very nice way to think about plug-and-play actions. In the scope of conversational AI, that covers reasoning, reading and comprehension, understanding, language, memory, and learning (keeping it to text data for now).

Source: LifeArchitect.ai/brain
Figure: a chain of conversation thoughts that needs external data-store knowledge
Figure: preserving the context of the previous question's topic (Source: https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain)

Exploration of different conversational dialogue models:

An added component to make the interaction less biased is RLHF (Reinforcement Learning from Human Feedback), along with good-quality hate-speech detection! Another important aspect is keeping the conversation thread in memory, and overcoming the token limits on prompts and context so the entire array of conversations can be processed through the LLM.
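The token-limit problem above usually forces some truncation policy. A minimal sketch (whitespace tokens stand in for a real tokenizer; production systems would count tokens with the model's own tokenizer):

```python
def fit_history(turns: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent turns whose combined token count fits the
    budget; older turns are dropped first."""
    kept, used = [], 0
    for turn in reversed(turns):          # walk newest -> oldest
        n = len(turn.split())             # crude proxy for a real tokenizer
        if used + n > max_tokens:
            break
        kept.append(turn)
        used += n
    return list(reversed(kept))           # restore chronological order
```

Smarter policies (summarizing dropped turns, keeping the first turn for persona) build on the same budget idea.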

  1. Cohere (command models, generate models)
  2. DeepMind (Sparrow, Chinchilla)
  3. Google (LaMDA)
  4. Meta (BlenderBot 3)
  5. OpenAI (InstructGPT, GPT-3, ChatGPT)

Some of them, at different levels, are augmented with instruction fine-tuning, safety enforcement, and supervised fine-tuning.

Chain-of-thought prompting: annotations of how humans cascade their reasoning, modeled via a chain of thought in the prompt. It's a good choice for giving the conversation a little extra acknowledgment and consideration from the bot when a customer panics with “my credit card is lost,” rather than returning a plain FAQ response!
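A sketch of what such a prompt might look like for the lost-card case. The demonstration text and helper name are illustrative, not any vendor's API; the idea is simply to show the model a worked reasoning step before the answer:

```python
def build_cot_prompt(user_message: str) -> str:
    """Prepend a worked example whose 'Reasoning:' line models the
    acknowledge-first behavior we want (chain-of-thought style)."""
    example = (
        "Customer: My credit card is lost!\n"
        "Reasoning: The customer is anxious; acknowledge first, then give "
        "the blocking steps, then reassure about liability.\n"
        "Agent: I'm so sorry to hear that. Let's secure your account right "
        "away: first we'll block the card, then order a replacement.\n\n"
    )
    # The model is asked to produce its own Reasoning line before the answer.
    return example + f"Customer: {user_message}\nReasoning:"
```

The completion then continues from `Reasoning:`, so the empathy step happens before the factual answer.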

Right now, the prevalent source of computational knowledge for LLMs is mostly fine-tuning the models on domain-knowledge datasets. For client-specific conversational-AI services this works pretty well, since the extra knowledge we have augmented the model with is the same knowledge we condition the text generation on!

There is a trade-off between doing natural-language search (embeddings search) with a fine-tuned LLM and generating the answer conditioned on the retrieved contexts, versus translating the query into a form conducive to a third-party knowledge tool like WolframAlpha, getting the response back, and using text generation to make it more conversational. It totally depends on the use case. Consider a use case where we want the model to give a factual answer on what an SSN is, and also to generate answers to card queries from the fine-tuned LLM. Toggling between the LLM and a factual tool on certain conditions in the conversational-AI business is worth researching!
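The toggle could be as simple as a router in front of the two backends. This sketch uses hypothetical trigger phrases; a production router would more likely be a small intent classifier:

```python
# Illustrative surface cues for "factual lookup" queries (an assumption,
# not a production heuristic).
FACTUAL_TRIGGERS = ("what is", "define", "when was", "how many")

def route(query: str) -> str:
    """Decide whether a query goes to a factual knowledge tool
    (WolframAlpha-style) or to the fine-tuned LLM."""
    q = query.lower()
    if any(q.startswith(t) for t in FACTUAL_TRIGGERS):
        return "factual_tool"     # e.g. "What is an SSN?"
    return "finetuned_llm"        # e.g. card queries on client data
```

Usage: `route("What is an SSN?")` selects the factual tool, while a card-limit question falls through to the fine-tuned model.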

There is also a paradigm called meta-learning (zero-, one-, or few-shot) that revolves around giving the model an instruction plus an input (and optionally a few demonstrations) to generate the answer for the particular task we would like it to perform! Scaling this while keeping it restricted and fenced within the business domain is tedious.
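The instruction + demonstrations + input pattern can be sketched as a single prompt builder (format details are my own convention, not a standard):

```python
def few_shot_prompt(instruction: str,
                    examples: list[tuple[str, str]],
                    query: str) -> str:
    """Zero-shot when `examples` is empty, few-shot otherwise:
    instruction, then (input, output) demonstrations, then the new input."""
    parts = [instruction]
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    parts.append(f"Input: {query}\nOutput:")   # model completes after Output:
    return "\n\n".join(parts)
```

With `examples=[]` this is exactly the zero-shot "instruction + input" case mentioned above.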

Abstraction is a much-needed component when working with LLMs, which drills down to composability in applications. Two new abstractions I got introduced to this week are LangChain and co.chat from Cohere. We are talking about three aspects here: implicit knowledge or memory, explicit or factual knowledge of the current world, and contextual memory, which is the powerhouse of the entire conversation within a session!

  1. LangChain
  2. co.chat
  3. Long context memory
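The third item, contextual memory scoped to a session, can be sketched as a tiny store that keeps each session's turns separate from anything the model has learned implicitly. (This is a toy stand-in for what libraries like LangChain provide as memory abstractions, not their actual API.)

```python
class SessionMemory:
    """Per-session conversation store: contextual memory for the current
    session, kept apart from the model's implicit (weights) knowledge."""

    def __init__(self):
        self._sessions: dict[str, list[tuple[str, str]]] = {}

    def add_turn(self, session_id: str, role: str, text: str) -> None:
        self._sessions.setdefault(session_id, []).append((role, text))

    def context(self, session_id: str, last_n: int = 5) -> str:
        """Render the last N turns as a prompt-ready transcript."""
        turns = self._sessions.get(session_id, [])[-last_n:]
        return "\n".join(f"{role}: {text}" for role, text in turns)
```

Keeping sessions keyed by ID is also what prevents cross-pollination between clients, as discussed earlier.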

A quick recap on dialogue systems: most chatbot applications boil down to data-driven open-domain question answering. Usually, chatbots are categorized into task-oriented and non-task-oriented use cases. But with the introduction of a third-party engine alongside the NLU engine, the categorization shifts toward which queries need a context switch and which don't. Systems were mostly built on a scalable backbone of intent detection and slot filling, as models were expected to perform better with more data (which might not be true*). This suffers from a major shortcoming, the credit-assignment problem: each piece of user feedback is hard to propagate back to the fine-tuned model. Global optimization is needed to connect the changed components.

In terms of response generation, perplexity is an important factor to consider so that responses stay simple enough for humans to understand! There is also a possibility that incoming customer conversations don't share the distribution of the training data the chatbot was built to support. Having endpoints to capture these distribution shifts in conversational AI is very important; otherwise, business needs and the product strategy drift apart without anyone realizing it!
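For reference, perplexity falls directly out of the per-token log-probabilities that most generation APIs can return:

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood) over the tokens;
    lower means the text was more predictable to the model."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)
```

For example, a sequence where every token had probability 0.5 has perplexity exactly 2. Tracking this on incoming customer utterances is one cheap signal for the distribution shift mentioned above.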

User query → NLU → maintain the context state in memory → surface actions prompting the needed knowledge (implicit or explicit) based on the additional metadata inferred → generation → user query with collected feedback → cycle continues.
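One turn of that cycle can be sketched as a function whose collaborators are injected (all the callables here are placeholders for real NLU, retrieval, and generation components):

```python
def handle_turn(query, memory, nlu, retrieve, generate):
    """One cycle of the pipeline: NLU -> update context -> source the
    needed knowledge -> conditioned generation -> store the bot turn."""
    intent = nlu(query)                          # infer metadata
    memory.append(("user", query))               # maintain context state
    knowledge = retrieve(query, intent)          # implicit or explicit source
    answer = generate(query, memory, knowledge)  # conditioned generation
    memory.append(("bot", answer))               # ready for feedback loop
    return answer
```

Plugging in trivial lambdas is enough to see the state accumulate across turns, which is the whole point of the cycle.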

There are generative dialogue methods as well as very neat retrieval-augmented methods. Hybrid methods matter most in the business side of conversational AI, since customers benefit from both precision and natural discourse.

🚀 co.chat: an experimental release by Cohere to its Discord community. The main focus is on making the calls stateful so the conversation context stays coherent with the generation. Beyond the chatbot's knowledge, statefulness and continuity are important factors. It's a neural generative-model approach that conditions the response on stateful conversations.

Dialogue management is a crucial aspect; with LLMs, keeping it contextual is the added key idea for better, more relevant text generation. We should also experiment with the usability of chat logs: letting them influence generation too much might lead to random output. Statefulness provides the ability to keep conversations active and engaging. In the question-answering task, it effectively tracks context-relevance pairs.

The current endpoint is set to the baseline generative model, so the test is mostly of how it handles conversation continuity, not the generations themselves.

Cohere's initial project in this space was conversant-lib, which creates personas via prompt engineering with Cohere generation models while maintaining sessions and question context. It's fed with rounds of conversation to create the chatbot's persona. I am more interested in contextualizing the question and having long-term memory of sessions and conversations. It's also worth experimenting with how much chat history (how many conversation rounds) is needed to serve customers efficiently. It's an open-ended conversant library that needs some fine-tuning with prompts to restrict it to specific use cases.

The advantage of a stateful conversation is that response selection is no longer based on a single turn. In the preliminary release, it supports multi-turn response matching, combining the previous and current utterances as input to ensure conversation consistency.
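Multi-turn response matching can be illustrated with a toy scorer: instead of matching candidates against the last utterance only, match them against the whole concatenated thread. (Word overlap here is a stand-in for a learned matching model.)

```python
def select_response(history: list[str], current: str,
                    candidates: list[str]) -> str:
    """Pick the candidate with the highest word overlap against the
    full conversation context (previous + current utterances)."""
    context = set(" ".join(history + [current]).lower().split())

    def score(candidate: str) -> int:
        return len(context & set(candidate.lower().split()))

    return max(candidates, key=score)
```

With history `["i lost my card"]` and current turn `"can you block it"`, a card-blocking reply outscores an off-topic one even though the last turn alone never mentions the card.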

First look! 🚀

  1. I liked the fact that a big chunk of the dialogue tracking, the history of each user's conversations, is associated with some kind of abstracted store. But I would love to see how context is encoded and maintained in order to plug in relevant utterances and remove irrelevant ones.
  2. This would also remove the worry of maintaining the entire dialogue inside a prompt in order to condition the next action.
  3. Collating all conversations under a session ID and inferring the continuity of responses makes it easy to analyze each user's conversation with the bot.
  4. Keeping it simple is key!
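The collation idea in point 3 amounts to grouping a flat chat log by session ID into per-user transcripts (field names here are my own illustration, not co.chat's schema):

```python
from collections import defaultdict

def collate_by_session(log: list[dict]) -> dict[str, list[str]]:
    """Group a flat chat log into per-session transcripts so each
    user's thread with the bot can be analyzed as one unit."""
    sessions = defaultdict(list)
    for entry in log:
        sessions[entry["session_id"]].append(
            f'{entry["role"]}: {entry["text"]}'
        )
    return dict(sessions)
```

Once collated, per-session analysis (turn counts, drop-off points, continuity of responses) becomes a simple iteration over the dictionary.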
Response from co.chat
co.chat in streamlit

Scope: ✍🏻

  1. Persona: I really like that there is scope for a starter conversation ID that can set the ground for conversations based on the user profile. This is highly valuable in business settings, where the bot can chat about business credit cards or consumer credit cards depending on the persona.
  2. A signal to context-switch between external and internal knowledge. This is very helpful for surfacing good answers matched to the level of explanation the user needs.
  3. Prompting and generating in the domain by plugging in the context memory with domain knowledge data.

I am looking forward to exploring and combining the above different aspects of conversational AI to experiment as singular components.

Until then,

Keep learning — Sangeetha Venkatesan


NLP Engineer. Language actors taking the stage with large language models!