
As an artificial intelligence enthusiast who has spent my entire career in the technology and media industries, I have watched the rise of personal AI assistants and their increasing popularity. In May 2023, I developed and released Ragbot.AI, an open-source AI assistant powered by large language models.
Like other AI assistants powered by large language models (LLMs), Ragbot.AI uses Retrieval-Augmented Generation (RAG), a technique that combines the power of pre-trained dense retrieval and sequence-to-sequence models to generate more factual and informative text.
Note: I have updated this blog post to include some counterpoints to my views made by a colleague.
Let me describe what I mean by a prompt decorator. The term is partly inspired by the similar concept of LangChain Decorators; an alternative term I could have used is Prompt Context.
Prompt Decorator: Additional context provided along with a prompt to aid an AI system in responding. The context may include related text, images, audio or other data types that help the system better interpret the intent and meaning behind the prompt. The aim of a prompt decorator is to augment the information available to the AI, enabling it to generate a more relevant, personalized response. In the case of a personalized AI assistant, prompt decorators could include parts of a user’s personal communications, online data or other information unique to that individual which helps the assistant gain a deeper understanding of the user’s experiences, interests and way of communicating.
A prompt decorator helps ensure an AI system has sufficient context to respond insightfully. It decorates or enhances the base prompt by providing extra information the system can use in shaping its response. For truly personalized systems, prompt decorators give models access to details about a specific user which they can learn from over time to become increasingly tailored to that individual. As AI continues its rapid development, prompt decorators serve as a key mechanism for enabling models to respond with understanding that matches human depth and nuance.
However, many of these assistants have limitations in their ability to answer insightful questions. This is partly because they often rely primarily on vector databases and embeddings to represent and process text data, relevant portions of which are extracted and used as prompt decorators given to the large language model along with the user’s prompt.
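To make that mechanism concrete, here is a minimal sketch of the retrieve-and-decorate flow. Everything in it is illustrative rather than Ragbot.AI’s actual code, and the `retrieve_related` function uses naive keyword overlap purely as a stand-in for the embedding-based retrieval described below:

```python
def retrieve_related(question: str, corpus: list[str], k: int = 3) -> list[str]:
    """Stand-in retrieval step using naive keyword overlap. A real system
    would instead query a vector database using embedding similarity."""
    words = set(question.lower().split())
    ranked = sorted(corpus,
                    key=lambda s: len(words & set(s.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_decorated_prompt(question: str, personal_corpus: list[str]) -> str:
    """Prepend related personal snippets (the prompt decorator) to the user's prompt."""
    snippets = retrieve_related(question, personal_corpus)
    decorator = "\n".join(f"- {s}" for s in snippets)
    return (
        "Use the following personal context when answering.\n"
        f"Context:\n{decorator}\n\n"
        f"Question: {question}"
    )
```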
To be clear, I am not saying that vector databases and embeddings are not a great solution. I’m a fan of using vector databases and embeddings. What I’m arguing is that the current methods for matching related text in them, to use as prompt decorators with an LLM, are not a satisfactory solution for personalized AI assistant use cases.
Personalized AI assistants have the potential to transform our lives by providing insights and guidance based on our personal experiences. Imagine a tool that could sift through decades of your personal journals, emails, text messages, blog posts, and social media posts.
This AI assistant could then answer profound questions about your life, such as:
- What are some life lessons I have gleaned from my interactions with family, friends, and colleagues?
- What practical advice have I given to others that has proven effective?
- What are the biggest mistakes I’ve made, what lessons did I learn from them, and what else should I learn from them?
- What achievements am I most proud of, and what lessons can I draw from them to guide my future actions?
- Given my strengths, weaknesses, goals, and aspirations, what kinds of roles, jobs, and work environments would be the best fit for me and allow me to maximize my impact?
- What do my learning styles and preferences suggest about the ways I can most effectively acquire new knowledge and skills?
- How does my communication style influence my relationships and effectiveness in collaborative efforts?
- How do my values and beliefs align or differ from the prevailing cultural attitudes around me?
- What do patterns in my emotional states and reactions reveal about my psychological and mental well-being, and what steps could I take to cultivate healthier habits of thinking and feeling?
These are some examples of the types of questions that can be answered by a personalized AI assistant. By integrating text data with a large language model, these assistants can provide users with insights that they would not be able to obtain on their own.
While this sounds like a promising application of AI technology, it is not something that can be effectively achieved using prompt decorators extracted from vector databases and embeddings, a common approach in natural language processing (NLP). This blog post will explain why these methods fall short and explore how other techniques, such as fine-tuning large language models (LLMs), offer a more promising route towards the creation of truly personalized AI assistants.
Understanding Vector Databases and Embeddings
Vector databases represent text data as vectors of numbers. These vectors aim to capture some aspects of the semantic meaning of the text and can be used to measure the similarity between different pieces of text. Embeddings are a type of learned vector representation trained on large datasets of text. Embeddings are commonly used for natural language processing tasks like machine translation or question answering.
In the world of NLP, vector databases and embeddings are common tools. They work by mapping words, sentences, or entire documents into a high-dimensional vector space, where the semantic meaning of the text is represented by the position of the vectors. These vector representations can be used to measure the similarity between different pieces of text, making them useful for tasks such as text classification or semantic search.
Text embeddings are lists of floating-point numbers (vectors) that represent the semantic meaning of text strings. Two strings are considered closely related if the distance between their vectors is small, and largely unrelated if it is large. Vector databases store these embeddings as numerical representations and arrange similar ones into clusters.
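As a minimal sketch of that distance computation, consider cosine similarity over embedding vectors. The `embed` function below is a toy stand-in; a real system would call an actual embedding model:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for an embedding model: a deterministic pseudo-random
    vector. A real system would call an actual embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=64)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Values near 1.0 mean the strings are treated as closely related;
    values near 0 or below mean they are treated as unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = embed("lessons I learned from my mistakes")
v2 = embed("what my errors taught me")
print(cosine_similarity(v1, v2))  # with real embeddings this would score high
```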
However, while these techniques can capture some semantic meaning, they are often inadequate for the type of deep understanding required for a personalized AI assistant.
Limitations of Vector Databases and Embeddings
In a recent blog post titled Why AI Will Make Traditional Databases Obsolete, I argued that vector databases are paving the way for AI-driven data management.
Vector databases and embeddings are effective for certain tasks, such as finding similar documents or predicting the next word in a sentence. However, they are not well-suited for answering insightful questions. This is because they do not have the ability to understand the context of the text. For example, a vector database might be able to identify that the words “mistake” and “learn” are semantically similar. However, it would not be able to understand that the question “What are some of the biggest mistakes I have made?” is asking for a specific type of information.
There are several reasons why vector databases and embeddings are insufficient for creating a personalized AI assistant capable of answering the types of insightful questions outlined above.
- Lack of Contextual Understanding: While embeddings can capture some semantic relationships, they struggle with understanding context. This is especially true for word embeddings, where each word has the same representation regardless of its context. Even more advanced techniques like sentence or document embeddings have limitations in capturing complex, nuanced meanings that depend on the larger context.
- Difficulty Handling Long-Term Dependencies: Text embeddings are not well-suited to handle long-term dependencies or to connect ideas across large blocks of text. This makes them ill-equipped to sift through decades of personal data and draw meaningful conclusions about life lessons, personal growth, or recurring themes.
- Inadequate Representation of Personal Experiences: Embeddings are typically trained on large, generic corpora of text, which means they may not adequately capture the unique language or experiences contained in personal texts. They can struggle to understand personal idioms, references, or unusual uses of language that may be prevalent in personal journals or emails.
Solutions
Solutions for creating personalized AI assistants with a deeper understanding of context and personal experiences could include:
- Fine-tuning: This involves training the LLM on a dataset of text that is relevant to the specific user, allowing the LLM to learn the user’s vocabulary and style of writing (see the sketch after this list).
- Customized training: Instead of using pre-trained embeddings, train the AI model on the user’s specific dataset from scratch. This would allow the model to learn the unique language use, idioms, and personal experiences of the user more effectively. However, this approach may require more computational resources and time.
- Transfer learning: Start with a base model trained on general text data and then fine-tune it on the user’s specific dataset. This approach combines the advantages of using a pre-trained model with the benefits of customization. This way, the AI assistant learns general language understanding from the base model and adapts it to the user’s personal context.
- Multi-modal approaches: Incorporate other types of data, such as images, audio, or video, into the AI assistant’s understanding. Using various data formats could provide a richer understanding of the user’s experiences and allow the AI assistant to provide more accurate and meaningful insights.
- Hierarchical models: Design AI models that can analyze text data at different levels of granularity, such as words, phrases, sentences, paragraphs, and documents. This can help the AI assistant better understand the relationships between different parts of the text and derive more meaningful insights.
- Graph-based approaches: Use graph-based techniques to connect and analyze entities, concepts, and relationships within the user’s personal data. This can help the AI assistant identify patterns, trends, and recurring themes in the user’s experiences.
- Continuous learning and feedback: Implement a feedback loop that allows the user to provide feedback on the AI assistant’s responses, helping the AI assistant to learn from its mistakes and improve its understanding of the user’s personal context over time.
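As a rough illustration of the first option above, here is a minimal fine-tuning sketch using the Hugging Face transformers and datasets libraries. The model name, hyperparameters, and sample texts are placeholders; a real system would need far more data and careful evaluation:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # placeholder; any causal LM checkpoint could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder personal corpus: journal entries, emails, posts, etc.
user_texts = ["Today I learned that ...", "Note to self: ..."]
dataset = Dataset.from_dict({"text": user_texts})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="user-finetune",
                           num_train_epochs=3,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal) language modeling labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```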
By exploring these alternative solutions, AI developers can work towards creating personalized AI assistants capable of providing deeper insights and a more meaningful understanding of the user’s life experiences.
The Promise of Fine-Tuned Large Language Models
A better solution for personal AI assistants is to integrate the text data with a large language model (LLM). LLMs are trained on massive datasets of text and can understand the context of that text, which allows them to answer insightful questions in a comprehensive and informative way.
One way to integrate text data with an LLM is to fine-tune it on the text data: training the LLM on a dataset relevant to the specific user, as sketched earlier, so that it learns the user’s vocabulary and style of writing.
In contrast to vector databases and embeddings, fine-tuned large language models (LLMs) offer a more promising approach to creating personalized AI assistants. LLMs are neural network models trained to predict the next word in a sentence, learning a rich understanding of language in the process. Fine-tuning refers to the process of further training these models on a specific dataset, allowing them to adapt their learned language understanding to a new context.
When it comes to creating a personalized AI assistant, there are several reasons why fine-tuning LLMs is a superior approach:
- Improved Contextual Understanding: LLMs excel at understanding context. They can interpret the meaning of a word based on the words around it, and can understand complex, nuanced meanings that depend on larger contextual cues. This allows them to understand and generate text that is more coherent and contextually relevant.
- Handling Long-Term Dependencies: LLMs are designed to handle long sequences of text, making them well-equipped to understand and draw connections across large amounts of personal data. This enables them to answer questions about life lessons, personal growth, or recurring themes that span across many years or decades.
- Ability to Learn Personalized Language Use: Fine-tuning allows LLMs to adapt to the specific language use and experiences contained in personal texts. They can learn to understand personal idioms, references, or unusual uses of language, making them a better fit for interpreting and generating responses based on personal data.
- Iterative Learning and Improvement: LLMs can continue to learn and improve over time. As they are exposed to more of your personal data, they can refine their understanding and generate more accurate and insightful responses.
Addressing Counterarguments
Some may argue that vector databases and embeddings could still be sufficient for personalized AI assistants if engineered properly. It is true that continued progress in these techniques may lead to improved performance on complex NLP tasks. However, there are several reasons why LLMs are inherently better suited for this application:
- Vector databases and embeddings rely on static vector representations, while LLMs employ a neural network architecture that can adapt and learn over time. This ability to dynamically change based on new data gives LLMs a fundamental advantage for modeling users and personal experiences.
- LLMs are designed to handle long-term dependencies in language, while embeddings struggle with connecting ideas across large spans of text. The types of meaningful insights a personalized AI assistant should provide depend heavily on identifying relationships and patterns across a user’s lifetime of experiences – something LLMs are uniquely suited for.
- Fine-tuning allows LLMs to adapt to highly specific, personalized language use that would not be adequately captured using generic embeddings. An AI system aimed at understanding an individual must be tailored to that individual, rather than relying on general representations of language.
- LLMs have been shown to outperform embeddings and static vector representations on a variety of NLP tasks, especially those requiring deeper, more nuanced language understanding. If the goal is an AI system that can provide truly insightful responses, LLMs offer clear benefits over alternative techniques.
While continued progress in vector representations and embeddings will undoubtedly lead to improvements, LLMs are fundamentally better aligned with the capabilities needed for personalized AI assistants. They will likely continue to surpass alternative techniques as our models and datasets become more advanced. For these reasons, LLMs and fine-tuning are the most promising approach to building AI systems with a deep, personalized understanding of users and their experiences.
A Superior Approach for Personalized AI
Vector databases and embeddings have significant limitations for creating personalized AI assistants that can provide meaningful insights. They lack the contextual understanding, ability to handle long-term dependencies, and capacity for personalized learning that is required to interpret decades of life experiences.
Large language models offer a fundamentally better solution. They are designed to understand language in a highly contextual, nuanced way and continue learning over time. By fine-tuning these models on personal data, they can adapt to understand a user’s unique way of communicating and develop insights tailored to their life journey.
While continued progress will be made, LLMs align more closely with the capabilities needed for personalized AI assistants. They surpass alternative techniques in modeling the complexities of human language and experiences. For these reasons, large language models and fine-tuning should be the primary focus in developing AI systems with a deep, empathetic understanding of individual users.
By leveraging the strengths of LLMs, we can build AI assistants that transform how we understand ourselves – gaining profound insights into our values, goals, relationships, and the meaning of our life experiences. This exciting vision highlights the promise of artificial intelligence to enrich our self-knowledge and help us live more purposeful, intentional lives. LLMs are the key to achieving that vision and developing AI systems that can nurture our personal growth in a deeply human way.
Addressing Thoughtful Feedback
I would like to thank my colleague Alexandria Redmon for providing constructive criticism and feedback on my arguments in this blog post. Alexandria raised several insightful counterpoints around the role of embeddings in language models, the ability to combine multiple techniques like fine-tuning and prompt augmentation, and some of the challenges of developing truly personalized AI assistants.
Here is my interpretation and analysis of Alexandria’s points:
- She made the argument that embeddings are fundamental to large language models like GPT, and cannot be separated from them. She is correct. My view is that while embeddings are used within LLMs, fine-tuning the models and providing contextual data about the user is what enables truly personalized responses. Embeddings alone do not seem sufficient for some of the use cases I envision, but I am still learning as I develop Ragbot.AI.
- She notes that fine-tuning and prompt context augmentation using embeddings are not mutually exclusive and can be used together. I agree with her on this; indeed, earlier in this post I argue that a combined approach utilizing both techniques may be most effective. However, my key point is that embeddings alone are limited, and fine-tuning is required to adapt models to a user’s unique experiences and language.
- She points out that OpenAI’s models cannot currently be fine-tuned. While this is true, my arguments are focused on the approaches that will be required to build the most capable personalized AI systems, not just what is currently possible with OpenAI’s models. As the field progresses, fine-tuning and customization will become increasingly important.
- She questions how I intend to employ fine-tuning to address the types of insightful questions I outlined. My response is that fine-tuning, especially when combined with other approaches like customized training or transfer learning, would allow models to learn the patterns, themes, and semantics present in a user’s personal data to eventually answer such questions. This would be an iterative process and would likely start with more basic questions, but the ability to understand personal context would grow over time through a feedback loop with the user.
- She notes the challenges of providing a model with all of a user’s data at once and questions how it would develop insights from that. I agree this is an open challenge, but believe that a personalized AI system’s capabilities would need to develop over time through continuous learning and feedback. Starting with a base level of understanding which is then gradually customized and improved upon is a more realistic approach than instantly deriving profound insights from massive datasets.
- She requested specific examples of how I would propose to implement a fine-tuning strategy to address the use cases I outlined. While I do not yet have exact implementation details, my overall vision would be: provide the model with portions of the user’s data over time, receive feedback to further improve the model, and gradually progress to more complex questions and types of data as the model’s capabilities expand (a simplified sketch follows this list). The specifics of training data and methods would depend on the particular use case and the datasets available for each individual user.
- She notes that the prompt decorator terminology may need to be defined, as it is not commonly used outside of my own work. I agree and appreciate the feedback. My use of that phrase was meant colloquially to refer to the additional context provided along with a prompt, but a definition or alternate phrasing would have been helpful for clarity. Thanks to her feedback, I updated the post above to include a definition as well as an alternate phrasing (prompt context).
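To make that iterative vision slightly more concrete, here is a deliberately simplified sketch of the loop I have in mind. Every function in it is a hypothetical stub rather than an existing API:

```python
# Deliberately simplified sketch of the iterative personalization loop
# described above. fine_tune, collect_user_feedback, and answer are
# hypothetical stubs, not real library calls.
def fine_tune(model, texts):
    """Hypothetical: would run a fine-tuning pass like the earlier sketch."""
    return model

def answer(model, question):
    """Hypothetical: would query the fine-tuned model."""
    return f"draft answer to: {question}"

def collect_user_feedback(answers):
    """Hypothetical: the user rates or corrects the assistant's answers."""
    return [f"corrected: {a}" for a in answers]

def personalize(model, data_batches, questions):
    for batch in data_batches:             # portions of the user's data over time
        model = fine_tune(model, batch)    # adapt the model to this slice of data
        answers = [answer(model, q) for q in questions]
        corrections = collect_user_feedback(answers)
        model = fine_tune(model, corrections)  # fold the feedback back in
    return model
```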
In summary, Alexandria raised valid points and counterpoints to my post. My key argument is that embeddings alone are limited and that fine-tuning and customization will be required for truly personalized AI assistants, but I am open to changing my opinion as I learn more while developing Ragbot.AI. My vision is an iterative process where models gradually become attuned to individual users over time, not an instant solution. I appreciate Alexandria’s thoughtful feedback, debate, and discussion on this topic.
This is my opinion
As explained on my site’s README page, like most of the content on my blog, this is my opinion. As the world changes and I learn more, my viewpoints evolve.