Build your own LLM for sentiment analysis: it's free!

Nowadays, we don't have to learn complex programming or pay a huge amount of money to hire someone to categorize user feedback. It's surprisingly easy to set up on our own laptop. It took me only an afternoon to build, and with these instructions, you can build it in 30 minutes.

So, build your own LLM for sentiment analysis: it's free!

I'm genuinely amazed at how powerful and versatile it is. The speed at which I can analyze large volumes of text is incredible – what used to take hours now happens in mere seconds. I love the fact that I have complete control over the system, allowing me to fine-tune it for my specific needs. Whether I'm analyzing customer reviews for my small business or processing social media comments for a research project, the tool adapts seamlessly. The privacy aspect is a huge relief too. Knowing that all the data stays on my machine gives me peace of mind, especially when dealing with sensitive information. Perhaps what excites me most is the potential for further customization. I've already started experimenting with different prompts and models, and the results are promising. This project has not only solved my immediate need for sentiment analysis but has also opened up a world of possibilities in natural language processing that I'm eager to explore further.


Using LLMs for sentiment analysis

Sentiment analysis, the process of determining the emotional tone behind a series of words, has become increasingly important in understanding customer feedback, social media monitoring, and market research. Large Language Models (LLMs) have revolutionized this field by offering more nuanced and context-aware sentiment analysis compared to traditional rule-based or machine learning approaches.

LLMs can understand complex language patterns, sarcasm, and context, leading to more accurate sentiment classification. They can also handle various languages and dialects with ease, making them versatile tools for global sentiment analysis tasks.


Advantages of building LLM agents on your own computer

While cloud-based LLM services are readily available, building your own LLM agent for sentiment analysis on your local machine offers several advantages:

1. Privacy and data security: Your data remains on your computer, reducing the risk of data breaches or unauthorized access.

2. Cost-effective: No need to pay for API calls or cloud computing resources (unless you want to).

3. Customization: You have full control over the model and can fine-tune it for your specific use case. You can pick a large model with better accuracy or a small model with faster speed. Your call.

4. No internet dependency: Your sentiment analysis can run offline, ensuring reliability.


Demonstration

Let's walk through a quick demonstration of how to use a locally hosted LLM for sentiment analysis.

I'll use a small dataset of Google Maps comments I scraped for our demonstration:



Here's a simple Python script to perform sentiment analysis using our local LLM:
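If you'd rather call Ollama directly without going through Dify, a minimal sketch looks like this. It assumes Ollama is serving its default REST API on localhost:11434 and that you've pulled `llama3.1`; the prompt is a stripped-down version of the one shown later in this post:

```python
import json
import urllib.request

# Ollama's default local REST endpoint
OLLAMA_URL = "http://localhost:11434/api/generate"

PROMPT_TEMPLATE = (
    "Classify the overall sentiment of the following text as exactly one word: "
    "Positive, Negative, or Neutral.\n\nText: {text}\nSentiment:"
)

def build_request(text, model="llama3.1"):
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(text=text),
        "stream": False,  # ask for one complete response, not a token stream
    }

def analyze(text, model="llama3.1"):
    """Send the text to the local model and return its one-word sentiment."""
    payload = json.dumps(build_request(text, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()
```

With Ollama running, `analyze("I'm so excited about my new job!")` should come back with something like `Positive`.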



Here's a simple Dify workflow to perform sentiment analysis:


In fact, it's remarkably easy to deploy. As long as you can come up with a prompt, setting up the whole workflow will probably take you five minutes!

Here's a sample prompt I used:

```

Act as a natural language processing software. Your primary function is to analyze the sentiment expressed in a given text. Classify the overall sentiment as one of the following:

1. Positive: The text expresses happiness, joy, optimism, or other positive emotions.

2. Negative: The text expresses sadness, anger, disappointment, or other negative emotions.

3. Neutral: The text is objective, factual, or does not express any clear emotion.

4. NA: No text is provided.


Focus on the overall emotional tone rather than individual words or phrases. Consider context, sarcasm, and other nuances that may impact the sentiment. If you are unsure, choose "Neutral". No explanation is needed. Only output a single word.

<Example Input>

I'm so excited about my new job!

<Example Output>

Positive

<Example Input>

The weather is terrible today.

<Example Output>

Negative

<Example Input>

The meeting is scheduled for 3 pm.

<Example Output>

Neutral

<Input>

{{#1723625445021.input_text#}}

<Output>

````````````````````````

You may notice that I also used the few-shot prompting technique to improve the results. This can be effective with small local models, but the examples take up tokens in the context window, which matters if your model has a tight limit. You can experiment with it if you like.
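If you want a rough sense of how much of the context window the examples consume, a quick character-count heuristic helps. The four-characters-per-token ratio below is a crude rule of thumb for English, not an exact tokenizer:

```python
def rough_token_count(text):
    """Very rough token estimate: roughly 4 characters per token in English."""
    return max(1, len(text) // 4)

few_shot_examples = (
    "I'm so excited about my new job!\nPositive\n"
    "The weather is terrible today.\nNegative\n"
    "The meeting is scheduled for 3 pm.\nNeutral\n"
)
print(rough_token_count(few_shot_examples))  # -> 31 by this heuristic
```

For three short examples that's negligible, but a dozen long ones against a 2k-token window starts to matter.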


One-click run them all!

In addition to analyzing individual comments, our sentiment analysis tool can handle batch processing using CSV files. This feature is particularly useful when dealing with large datasets or when you need to integrate the tool into existing data pipelines. We've enhanced the Python script to read comments from an input CSV file, perform sentiment analysis on each comment, and then write the results to a new CSV file. The output file will contain all the original data plus a new "sentiment" column, making it easy to review and further analyze the results. This batch processing capability significantly improves the tool's efficiency and usability, allowing you to analyze thousands of comments with just a single command. Whether you're processing customer feedback, social media comments, or any other text data, this CSV handling feature makes our sentiment analysis tool even more powerful and versatile.

Here's the enhanced script. Save it as `analysis.py` and run it from the command line with `python analysis.py`:

```python
import pandas as pd
import requests
import json

# Dify workflow API endpoint (adjust host/port if your Dify runs elsewhere)
url = "http://localhost/v1/workflows/run"

headers = {
    'Authorization': 'Bearer app-uME8KergXdFrsxFbSrQLPj8k',  # enter your API key
    'Content-Type': 'application/json',
}

def analyze_sentiment(text):
    data = {
        "inputs": {"input_text": text},
        "response_mode": "blocking",
        "user": "user"  # enter your user name
    }
    response = requests.post(url, headers=headers, data=json.dumps(data))
    json_data = response.json()
    return json_data['data']['outputs']['result']

# Process the CSV in chunks so large files don't need to fit in memory at once
chunksize = 5  # adjust as needed for your data size
for chunk in pd.read_csv('data_file.csv', chunksize=chunksize):
    chunk['result'] = chunk['Review'].apply(analyze_sentiment)
    chunk.to_csv('results_file.csv', mode='a', header=False, index=False)
```

This will analyze each comment and write the sentiment classifications to a single results file.

The results file is straightforward: it's the original data with an extra column showing each comment's category:
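To sanity-check the output, you can tally the labels. Here's a minimal sketch using only the standard library; it assumes the sentiment label is the last column of each row, as produced by the script above:

```python
import csv
import io
from collections import Counter

def sentiment_counts(csv_text, label_index=-1):
    """Tally sentiment labels, taken from the last column of each row."""
    reader = csv.reader(io.StringIO(csv_text))
    return Counter(row[label_index] for row in reader if row)

# Inline sample; in practice, pass the contents of results_file.csv
sample = "Great food!,Positive\nSlow service.,Negative\nLove this place,Positive\n"
print(sentiment_counts(sample))  # -> Counter({'Positive': 2, 'Negative': 1})
```

If the counts look wildly skewed, it's worth spot-checking a few rows against the original comments.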


*Note: when I used small models like gemma2 or phi3.5, the LLM sometimes produced extra content such as explanations.
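One workaround is a small post-processing step that snaps whatever the model returns onto the expected labels. A sketch: the fallback to "Neutral" mirrors the prompt's instruction for unsure cases, while the punctuation stripping is my own assumption about what the extra output tends to look like:

```python
LABELS = ("Positive", "Negative", "Neutral", "NA")

def normalize_label(raw, default="Neutral"):
    """Map a possibly verbose model response onto one of the expected labels."""
    words = raw.strip().split()
    first_word = words[0].strip(".,:;!\"'") if words else ""
    for label in LABELS:
        if first_word.lower() == label.lower():
            return label
    return default

print(normalize_label("Positive. The reviewer clearly enjoyed the meal."))  # -> Positive
print(normalize_label("negative"))  # -> Negative
```

You could apply this inside `analyze_sentiment` so the results file only ever contains the four expected values.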


Build instructions: tools we need

To build our local LLM-based sentiment analysis system, we'll use the following tools:

1. **Ollama**: An open-source tool for running large language models locally. It simplifies the process of downloading, setting up, and running various LLMs on your machine.

2. **Dify**: An open-source LLM workflow platform that helps in creating, deploying, and managing AI applications. We'll use it to create our sentiment analysis API endpoint.

3. **Python**: We'll use Python to write our script for interacting with the LLM and processing the results.

All the tools above are free to use! You can easily grab them from their official websites.


Setup instructions

1. Install Ollama:

Visit the Ollama website (https://ollama.com/) and download the appropriate version for your operating system. Follow the installation instructions provided.

2. Pull a suitable LLM model of your choice:

For example:

   ollama pull llama3.1

3. Install Dify:

(install Python and Git first if you haven't)

   - Clone the Dify repository: `git clone https://github.com/langgenius/dify.git` 

   - Follow the installation instructions in the Dify documentation.

4. Set up a new application in Dify:

   - Create a new app / workflow.

   - Configure the app to use your local Ollama instance as the LLM provider.

   - Set up a prompt template for sentiment analysis. You can customize your own prompt or just simply use mine (simply import the DSL file).

5. Install required Python packages:

You'll need `requests` and `pandas` (`json` ships with Python's standard library, so there's nothing to install for it):

   ```

   pip install requests pandas

   ```

6. Run the Python script as shown in the demonstration section above.


Optional: a little tweak to use powerful proprietary LLMs

While open-source models are great for getting started, you might want to leverage more powerful proprietary LLMs for enhanced performance. Here's how you can tweak your setup:

1. Sign up for API access to proprietary LLMs like OpenAI's GPT-4 or Anthropic's Claude.

2. In your Dify application settings, switch the LLM provider from Ollama to the proprietary LLM of your choice.

3. Update your Python script to include the necessary API authentication (if you create a new workflow):


```python
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY_HERE"
}
```

4. No additional steps! You can simply re-run the script!

By following these steps, you can easily switch between open-source and proprietary LLMs, allowing you to compare performance and choose the best option for your specific use case. Remember to always comply with the terms of service and data usage policies when using proprietary LLMs.
