Creating a Qwen-Powered Lightweight Personal Assistant



Introduction

The Qwen family provides powerful, open-source large language models for a wide range of natural language processing tasks.

This article shows you how to set up and run a personal assistant application in Python powered by a Qwen model — specifically the Qwen1.5-7B-Chat model, which is an efficient and relatively lightweight 7-billion-parameter chat model optimized for conversational use cases. The code shown is ready to be used in a Python notebook such as Google Colab, but can easily be adapted to run locally if preferred.

Coding Solution

Building a Qwen-powered assistant requires several dependencies and libraries, so we start by installing them and printing the installed versions to catch, as far as possible, compatibility issues with packages you may already have.

We also configure the code to use a GPU, if one is available, for faster inference the first time the model is called during execution.

These initial setup steps are shown in the code below:
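A minimal version of this setup could look like the following (the package list is an assumption for a fresh Colab session):

```python
# Install dependencies first (uncomment when running in a fresh Colab session):
# !pip install -q transformers accelerate bitsandbytes ipywidgets

import torch
import transformers

# Print versions to help diagnose compatibility issues with pre-installed packages.
print(f"transformers version: {transformers.__version__}")
print(f"torch version: {torch.__version__}")

# Prefer the GPU when available; CPU inference works but is much slower.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
```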

Now it’s time to load and configure the model:

  • We use Qwen/Qwen1.5-7B-Chat, which loads and runs first-time inference faster than heavier members of the family such as Qwen2.5-Omni, a real powerhouse but far from lightweight.
  • As usual when loading a pre-trained language model, we need a tokenizer to convert text inputs into a format the model can read. The AutoTokenizer class from Hugging Face's Transformers library streamlines this process.
  • To improve efficiency, we try to configure 4-bit quantization, which substantially reduces memory usage.
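A sketch of this loading step, wrapped in a function so the (large) download only happens when it is called; the quantization settings shown are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "Qwen/Qwen1.5-7B-Chat"

def load_model(model_name=MODEL_NAME):
    """Load the tokenizer and the model, trying 4-bit quantization first."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    try:
        # 4-bit NF4 quantization requires bitsandbytes and a CUDA GPU.
        quant_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.float16,
        )
        model = AutoModelForCausalLM.from_pretrained(
            model_name, quantization_config=quant_config, device_map="auto"
        )
    except Exception:
        # Fall back to half precision if quantization is unavailable.
        model = AutoModelForCausalLM.from_pretrained(
            model_name, torch_dtype=torch.float16, device_map="auto"
        )
    return tokenizer, model

# tokenizer, model = load_model()  # downloads several GB on first run
```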

When building our own conversational assistant, it is typically good practice to craft a default system prompt that accompanies each request, adapting the model’s behavior and generated responses to our needs. Here is an example system prompt:

```python
system_prompt = """You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should be engaging and fun.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."""
```

The following function encapsulates the heaviest part of the execution flow, as this is where the model receives the user's input and is called to perform inference and generate a response. Since we will run a conversation with multiple sequential requests, it is important to manage the chat history accordingly and incorporate it into each new request.
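One way to sketch this function, assuming the `tokenizer`, `model`, and `system_prompt` from the previous steps, and tracking the history as a list of (user, assistant) pairs:

```python
def build_messages(history, user_input, system_prompt):
    """Assemble the chat message list: system prompt, prior turns, new input."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_msg, assistant_msg in history:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": user_input})
    return messages


def generate_response(user_input, history, max_new_tokens=512):
    """Run one inference step and append the exchange to the history."""
    messages = build_messages(history, user_input, system_prompt)
    # apply_chat_template renders the messages in the model's expected chat format.
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the newly generated reply remains.
    reply_ids = output_ids[0][inputs.input_ids.shape[1]:]
    response = tokenizer.decode(reply_ids, skip_special_tokens=True)
    history.append((user_input, response))
    return response
```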

Once the key function to generate responses has been defined, we can build a simple user interface to run and interact with the assistant.

The interface will contain an output display area that shows the conversation, an input text box where the user can ask questions, and two buttons for sending a request and clearing the chat. Notice the use of the widgets library for these elements.
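A sketch of such an interface using ipywidgets; the widget layout and names are illustrative, and it assumes the `generate_response` function defined above:

```python
import ipywidgets as widgets
from IPython.display import display

def launch_ui():
    """Display a minimal chat interface: output area, text box, two buttons."""
    history = []
    output_area = widgets.Output(
        layout=widgets.Layout(border="1px solid gray", height="300px", overflow="auto")
    )
    input_box = widgets.Text(placeholder="Type your question here...")
    send_button = widgets.Button(description="Send", button_style="primary")
    clear_button = widgets.Button(description="Clear")

    def on_send(_):
        question = input_box.value.strip()
        if not question:
            return
        input_box.value = ""
        with output_area:
            print(f"You: {question}")
        answer = generate_response(question, history)
        with output_area:
            print(f"Assistant: {answer}\n")

    def on_clear(_):
        history.clear()
        output_area.clear_output()

    send_button.on_click(on_send)
    clear_button.on_click(on_clear)
    display(output_area, input_box, widgets.HBox([send_button, clear_button]))
```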

Alternatively, we can also set up the option of using a command line interface (CLI) for the chat workflow:
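A possible CLI loop, again assuming the `generate_response` function defined earlier:

```python
def run_cli():
    """Read-eval-print chat loop; type 'quit' or 'exit' to stop."""
    history = []
    print("Qwen assistant ready. Type 'quit' to exit.")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in {"quit", "exit"}:
            print("Goodbye!")
            break
        if not user_input:
            continue
        response = generate_response(user_input, history)
        print(f"Assistant: {response}\n")
```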

Almost done. We will define two final functions:

  1. One that performs a quick test to ensure that both the model and its dependencies are set up correctly.
  2. An overarching function to run the entire conversational assistant application. Here, the user can choose the kind of interface to use (UI vs. CLI).
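These two functions could be sketched as follows; `launch_ui` and `run_cli` stand in for the interface helpers assumed above, and timings will of course vary:

```python
import time

def quick_test():
    """Ask one canned question to verify the model and dependencies work."""
    print("Running quick test...")
    question = "What can you help me with?"
    print(f"Test Question: {question}")
    start = time.time()
    response = generate_response(question, [])
    print(f"Response: {response}")
    print(f"Generation time: {time.time() - start:.2f} seconds")

def run_assistant():
    """Run the quick test, then start the interface chosen by the user."""
    quick_test()
    choice = input("Choose interface (1 for UI, 2 for CLI): ").strip()
    if choice == "1":
        launch_ui()
    else:
        run_cli()
```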

Trying It Out

If everything has gone well, now it’s time to have fun and interact with our newly built assistant. Here is an example excerpt of the conversational workflow.

```
Running quick test...
Test Question: What can you help me with?
Response: 1. General knowledge: I can provide information on a wide range of topics, from history and science to pop culture, current events, and more.
2. Problem-solving: Need help with a math problem, figuring out how to do something, or troubleshooting an issue? I'm here to guide you.
3. Research: If you have a specific topic or question in mind, I can help you find reliable sources and summarize the information for you.
4. Language assistance: Need help with writing, grammar, spelling, or translation? I can assist with that.
5. Fun facts and trivia: Want to impress your friends with interesting facts or just looking for a good laugh? I've got you covered!
6. Time management and organization: Strategies to help you stay on top of your tasks and projects.
7. Personal development: Tips for learning new skills, setting goals, or managing your emotions.

Just let me know what you need, and I'll do my best to assist you! Remember, I can't always give away all the answers, but I'll certainly try to make the process as enjoyable and informative as possible.
Generation time: 18.04 seconds

Choose interface (1 for UI, 2 for CLI):
```

Below is an example of live interaction through the UI.

Qwen-based conversational assistant’s UI
Image by Author

Conclusion

In this article, we demonstrated how to build a simple conversational assistant application powered by a lightweight yet capable Qwen language model. The application is designed to run efficiently in a GPU-backed environment such as a Google Colab notebook.
