Text classification is the process of assigning predefined categories or labels to text. It’s a foundational task in natural language processing and one that LLMs can perform with remarkable accuracy. You can use classification for a huge range of applications, including sentiment analysis of customer feedback, topic categorization of news articles, spam detection in emails, and intent recognition in user queries.

The Best Tool for the Job: Few-Shot Prompting

While a zero-shot prompt is sometimes enough for very simple classification tasks (e.g., “Is this review positive or negative?”), few-shot prompting is usually the more robust and reliable method. As we covered in the Core Principles section, few-shot prompting allows you to “teach” the model the exact classification system you want it to use. This matters because classification is often subjective and context-dependent. Clear examples reduce ambiguity and make it much more likely that the model’s output aligns with your specific needs.

From Simple to Complex Classification: A Case Study

Let’s look at how we can use few-shot prompting to build a sophisticated classifier for customer support tickets.

Goal: We want to classify incoming support tickets into three categories: Technical Issue, Billing Inquiry, and General Question.

A Good Few-Shot Prompt:
Please classify the following customer support tickets into one of three categories: Technical Issue, Billing Inquiry, or General Question.

Ticket: "Hi, I can't seem to log in to my account. I've reset my password but it's still not working."
Category: Technical Issue

Ticket: "Hello, I was wondering if you offer any discounts for non-profit organizations?"
Category: General Question

Ticket: "I think I was overcharged on my last invoice. Can you please check?"
Category: Billing Inquiry

Ticket: "My dashboard is showing an error message and I can't access my reports."
Category:
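
The prompt above follows a fixed shape: an instruction, a few labeled tickets, and one unlabeled ticket. When the examples live in a list rather than in a hand-written string, that shape can be assembled in code. What follows is a minimal Python sketch, not a definitive implementation. The names `build_few_shot_prompt`, `CATEGORIES`, and `EXAMPLES` are illustrative assumptions rather than part of any library, and the instruction wording is simplified slightly.

```python
from typing import List, Sequence, Tuple

# Category names taken from the case study above.
CATEGORIES = ["Technical Issue", "Billing Inquiry", "General Question"]

# One labeled example per category, copied from the prompt above.
EXAMPLES: List[Tuple[str, str]] = [
    ("Hi, I can't seem to log in to my account. I've reset my password "
     "but it's still not working.", "Technical Issue"),
    ("Hello, I was wondering if you offer any discounts for non-profit "
     "organizations?", "General Question"),
    ("I think I was overcharged on my last invoice. Can you please check?",
     "Billing Inquiry"),
]


def build_few_shot_prompt(
    examples: Sequence[Tuple[str, str]],
    ticket: str,
    categories: Sequence[str] = CATEGORIES,
) -> str:
    """Return an instruction, the labeled examples, and one unlabeled ticket."""
    # Reject examples whose label is not in the category list, since a
    # mislabeled example would teach the model the wrong scheme.
    for _, label in examples:
        if label not in categories:
            raise ValueError(f"Unknown category in examples: {label!r}")

    lines = [
        "Please classify the following customer support tickets into one of "
        "these categories: " + ", ".join(categories) + ".",
        "",
    ]
    for text, label in examples:
        lines.append(f'Ticket: "{text}"')
        lines.append(f"Category: {label}")
        lines.append("")
    # The final ticket ends with an empty "Category:" for the model to complete.
    lines.append(f'Ticket: "{ticket}"')
    lines.append("Category:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    EXAMPLES,
    "My dashboard is showing an error message and I can't access my reports.",
)
print(prompt)
```

The resulting string can be sent to whichever model or API you use. That call is left out here because it depends on your provider.
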

This prompt is effective because it provides one clear example for each category. Given that pattern, the model should classify the final ticket as Technical Issue.

An Advanced Prompt with Edge Case Handling: Sometimes, a ticket might fit into more than one category. We can teach the model how to handle this.
Please classify the following customer support tickets. You can assign one or more of the following categories: Technical Issue, Billing Inquiry, General Question. Format the output as a JSON array.

Ticket: "Hi, I can't seem to log in to my account. I've reset my password but it's still not working."
Category: ["Technical Issue"]

Ticket: "I think I was overcharged on my last invoice. Can you also tell me what your business hours are?"
Category: ["Billing Inquiry", "General Question"]

Ticket: "My dashboard is showing an error message and I can't access my reports. This is preventing me from upgrading my account, which I'd like to do today."
Category:
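
This is a more sophisticated prompt. It tells the model that a ticket may receive more than one category and that the result must be a machine-readable JSON array. It also includes a tricky example that combines a billing question with a general question, so the model sees what a multi-category answer looks like. For the final ticket, which reports a dashboard error and mentions an account upgrade, the expected output is ["Technical Issue", "Billing Inquiry"]. Note that no example pairs Technical Issue with a second category or shows that an upgrade request counts as billing, so adding one such example would make this result more dependable.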
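
Because the prompt above asks for a JSON array, the reply can be checked in code before anything downstream relies on it. The sketch below uses only Python's standard `json` module. The function name `parse_categories` and the `ALLOWED` set are illustrative assumptions, and the sketch assumes the model returns the bare array with no surrounding text.

```python
import json
from typing import Iterable, List

# Category names taken from the prompt above.
ALLOWED = {"Technical Issue", "Billing Inquiry", "General Question"}


def parse_categories(raw: str, allowed: Iterable[str] = ALLOWED) -> List[str]:
    """Parse a reply that should be a JSON array of known category names."""
    allowed_set = set(allowed)
    try:
        parsed = json.loads(raw.strip())
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply is not valid JSON: {raw!r}") from exc

    # The prompt asks for "one or more" categories, so an empty list
    # or a non-list value is treated as a malformed reply.
    if not isinstance(parsed, list) or not parsed:
        raise ValueError(f"Expected a non-empty JSON array, got: {parsed!r}")

    # Flag anything that is not a string or not one of the known categories.
    unknown = [c for c in parsed if not isinstance(c, str) or c not in allowed_set]
    if unknown:
        raise ValueError(f"Unknown categories in reply: {unknown!r}")

    # Drop duplicates while keeping the order the model used.
    return list(dict.fromkeys(parsed))


labels = parse_categories('["Technical Issue", "Billing Inquiry"]')
print(labels)
```

Raising an error on an unknown or malformed label, instead of silently dropping it, makes it easier to notice when the model drifts from the category list. Whether to retry, fall back to a default, or escalate to a person is a design choice for your application.
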

A Toolkit of Classification Techniques

  • Chain of Thought Classification: For very complex classification tasks, you can ask the model to “think step by step.”
    • For the following user comment, first identify the main topic of the comment, then decide if the sentiment is positive, negative, or neutral. Finally, assign one of the following categories...
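  • Fine-Grained Classification: Don’t be afraid to use a large number of categories. LLMs can often handle dozens or even hundreds of categories if you provide clear examples, although closely related categories become harder to separate as the list grows.
  • Confidence Scoring: For more advanced use cases, you can ask the model to provide a confidence score for its classification. Treat the score as a rough self-assessment rather than a calibrated probability.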
    • Classify the following text and provide a confidence score (from 0.0 to 1.0) for your answer.
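
The step-by-step and confidence-scoring prompts above both produce replies that need a little parsing before they are useful. The sketch below assumes a hypothetical reply format in which the model ends with a `Category:` line and a `Confidence:` line. The source prompts do not fix that format, so you would need to request it explicitly. The names `parse_reply` and `needs_review` and the 0.7 threshold are likewise illustrative assumptions.

```python
import re
from typing import NamedTuple


class Classification(NamedTuple):
    category: str
    confidence: float


# Assumed reply format: one "Category: <name>" line and one
# "Confidence: <number>" line, possibly preceded by reasoning lines.
CATEGORY_RE = re.compile(r"^Category:[ \t]*(.+?)[ \t]*$", re.MULTILINE)
CONFIDENCE_RE = re.compile(r"^Confidence:[ \t]*([0-9]*\.?[0-9]+)[ \t]*$", re.MULTILINE)


def parse_reply(reply: str) -> Classification:
    """Extract the category and confidence score from a model reply."""
    category = CATEGORY_RE.search(reply)
    confidence = CONFIDENCE_RE.search(reply)
    if category is None or confidence is None:
        raise ValueError("Reply is missing a Category or Confidence line")
    score = float(confidence.group(1))
    # The prompt asks for a score from 0.0 to 1.0, so reject anything else.
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"Confidence out of range: {score}")
    return Classification(category.group(1), score)


def needs_review(result: Classification, threshold: float = 0.7) -> bool:
    """Return True when the self-reported score falls below the threshold."""
    return result.confidence < threshold


# A hand-written sample reply used for illustration, not real model output.
sample_reply = """Topic: dashboard error blocking reports
Sentiment: negative
Category: Technical Issue
Confidence: 0.85"""

result = parse_reply(sample_reply)
print(result, needs_review(result))
```

A threshold like this is a simple way to route uncertain items to a person. Since the score is self-reported by the model, it is worth checking a sample of results against human labels before trusting any particular cutoff.
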
By leveraging few-shot prompting and these advanced techniques, you can build powerful and nuanced text classifiers for almost any application.