Recently, I’ve discovered a niche outside the typical scope of my day-to-day responsibilities that I’ve really enjoyed playing around with: text classification. For this post, I’ll walk through a real-world example of something that companies like Gong already seem to be doing: analyzing transcripts of sales calls for actionable insights.
Gong and other vendors clearly have more robust systems built out to do this kind of thing (e.g., their Conversation Intelligence offering), but again…this is a small-scale, toy example of a real-world problem.
Libraries/other imports for this post are in the code block below. Let’s jump in!
Wait, Michael…are you…using Python and not Julia??
Yes, it finally happened…the guy who wouldn’t shut up about Julia finally gave in and started using Python.
I’ve found myself using Python a lot more at work and have found that many pain points I’ve encountered over the years are getting a lot better. Some of the frameworks that have made me feel a lot better about Python (not an exhaustive list) include:
I still love Julia (in fact, I wrote the prototype for this project using PromptingTools.jl), but I’ll probably be using Python a lot more for personal projects/posts/etc. in the future.
A Real World Example: Sales Objections
Let’s say that you’re a small sales team that records all of your sales calls with some third-party service. This third-party service gives you access to both the recording of the sales call and the transcript generated from that call. On a typical day, you may share selected snippets of that call with teammates or potentially re-watch the call to get some insights into the sales process. Or, if you’re feeling really ambitious, you might dump the transcript in ChatGPT and start asking questions.
This can be illuminating, but time-consuming. Broadly speaking, it’s not realistic to ask the team to juggle making their calls and managing prospects while also asking them to watch hours of calls or scour transcripts to answer questions like: “what types of objections do we get?”, “what types of objections are the most frequent?”, etc.
What if we could take all of those sales transcripts and just have someone read all of them and answer these questions? Well, turns out we kind of can…
Our Transcript Dataset
For this post, I’ll use a synthetic dataset that I downloaded from Hugging Face. Let’s take a look at that dataset:
The format of this is a little strange in that each row holds one conversation, and each column holds one turn of the back-and-forth in that conversation.
Conversations can be pretty long - but for the purposes of this example, we’re only going to focus on a small subset of the back and forth.
A Slightly Better Format
Let’s take the dataframe we have and represent it as a basic data structure we’ll call a Transcript. The most basic representation here looks like this:
@dataclass
class Transcript:
    id: int
    text: str
The intention of storing this as a dataclass is so that we can pass information to the LLM from the transcript. In this case, what we want to pass in is pretty simple: we just have the text itself and an ID which we’ll just let correspond to the row of the dataframe that the text comes from.
In a real-world use case, this class would be a bit more extensive depending on what data we could include to link to other systems or provide additional context to the LLM. For example, if calls are attached to a Deal in whatever CRM you’re using…we’d want to include that too so that further down the line we could link the structured output from the LLM to other systems we have.
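As a rough illustration of what that more extensive version might look like, here is a minimal sketch with a few extra fields. The class name `CrmTranscript` and the fields `deal_id`, `call_date`, and `rep_name` are hypothetical; they don't come from the dataset or from any particular CRM.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CrmTranscript:
    """Hypothetical extension of Transcript that carries CRM context."""

    id: int
    text: str
    # Everything below is an assumption made for illustration only.
    deal_id: Optional[str] = None    # key of the Deal in the CRM
    call_date: Optional[str] = None  # ISO date string, e.g. "2024-05-01"
    rep_name: Optional[str] = None   # who ran the call


example = CrmTranscript(id=1, text="Customer: Hi ...", deal_id="D-123")

# asdict() gives a plain dict whose keys could be used as prompt variables
# or as join keys when linking the LLM output back to other systems.
print(asdict(example))
```

Keeping the extra fields optional means the same class still works for calls that aren't linked to anything yet.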
As I mentioned before, this dataset is set up in a way that conversations occur across columns. We can write a function to take the transcripts from the dataframe and turn them into a list of Transcripts.
def convert_df_to_transcripts(df: DataFrame) -> List[Transcript]:
    """
    Take the transcript dataframe and turn it into a list of `Transcript` objects.
    """
    transcript_list = (
        df[["0", "1", "2", "3", "4", "5", "6", "7", "8"]]
        .fillna("")
        .agg(" ".join, axis=1)
        .tolist()
    )
    transcripts = [
        Transcript(index, text) for index, text in enumerate(transcript_list)
    ]
    return transcripts


all_transcripts = convert_df_to_transcripts(transcripts_df)
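To make the row-to-text step concrete without needing the dataset, here is the same idea on plain Python lists. This is a sketch, not the pandas code above: `rows_to_transcripts` and the sample rows are made up, and it assumes missing turns show up as `None` or an empty string. Unlike joining empty strings, it skips missing turns entirely, so short conversations don't end in a run of spaces.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Transcript:
    id: int
    text: str


def rows_to_transcripts(rows: Sequence[Sequence[Optional[str]]]) -> List[Transcript]:
    """Join the turns in each row into one string, skipping missing turns."""
    transcripts = []
    for index, turns in enumerate(rows):
        # Keep only turns that are present (not None and not empty).
        text = " ".join(turn.strip() for turn in turns if turn)
        transcripts.append(Transcript(index, text))
    return transcripts


# Made-up rows: one column per turn, None where the conversation ended early.
rows = [
    ["Customer: Hi.", "Salesman: Hello!", None],
    ["Customer: Is it safe?", None, None],
]
result = rows_to_transcripts(rows)
print(result[1])
```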
Evaluating the Text using Basic ChatPromptTemplate
All the entries in all_transcripts look something like this [2]:
print(all_transcripts[1])
Transcript(id=1, text='Customer: Hi, Im interested in learning more about your health products. Salesman: Great! Im happy to help. Tell me, what specific health concerns do you have? Customer: Ive been experiencing digestive issues lately and Im looking for a solution. Salesman: I understand how frustrating that can be. Many of our customers have found relief with our digestive health supplements. Would you like me to provide more information? Customer: Ive tried different products before, but nothing seems to work. Im skeptical. Salesman: I completely understand your skepticism. Its important to find the right solution that works for you. Our digestive health supplements are backed by scientific research and have helped many people with similar issues. Would you be open to trying them? Customer: Im concerned about the potential side effects of the supplements. Are they safe? Salesman: Safety is our top priority. Our digestive health supplements are made with natural ingredients and undergo rigorous testing to ensure their safety and effectiveness. We can provide you with detailed information on the ingredients and any potential side effects. Would that help alleviate your concerns? Customer: Im still unsure. Can you share some success stories from your customers?')
Now, I’d like you all to imagine a day spent reading the text of these calls and manually classifying stuff and how miserable that would be 😄 but fear not, we’re going to do this at scale!
We’ll use LangChain to set up our transcript-reading system, and for this post we’ll use Anthropic’s Claude model to do the classification. I don’t have a particular reason for choosing Claude for this–but I do appreciate Anthropic’s approach to AI safety for what that’s worth.
I’ve written a prompt (following the Prompt Engineering Guide here) to help classify some of the common sales objections I’ve heard from various BD teams throughout my career, which I’ll provide to Python as a text file. This prompt is templated to allow us to pass in information from our Transcript classes.
We’ll also set a variable containing the specific Claude model that we want to use.
# get the prompt
with open("objections-prompt.txt", 'r') as prompt_file:
    prompt = prompt_file.read()

# get the default Anthropic model (this is in my Quarto environment)
DEFAULT_ANTHROPIC_MODEL = os.environ["ANTHROPIC_DEFAULT_MODEL"]
Important
In this post, I’m outlining the process and general approach one could use to do a project like this if you’ve got some Python skills and a need for this specific use case.
I won’t be sharing the exact LLM prompt I’m using. If you’re interested in leveraging this approach or exploring how you could apply it to your own projects, I’d love to collaborate. Feel free to reach out and we can work together to get the results you’re looking for.
For this basic implementation, we can set up a ChatPromptTemplate that will pass the information needed to the LLM. This is exactly what it sounds like in LangChain (i.e., a prompt template for chat models). The user input of this template will reflect the information that we intend to pass to the LLM via the Transcript class we created above.
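To show what "templated" means here, the sketch below fills a user message from a Transcript using plain `str.format`. It is a stand-in, not LangChain: `USER_TEMPLATE` is a hypothetical template (the real prompt isn't shown in this post), but the `{id}` and `{text}` placeholders work the same way as the curly-brace variables a ChatPromptTemplate expects by default.

```python
from dataclasses import asdict, dataclass


@dataclass
class Transcript:
    id: int
    text: str


# Hypothetical user-message template; the real prompt is not part of this post.
USER_TEMPLATE = "Transcript ID: {id}\n\nTranscript:\n{text}"

transcript = Transcript(id=1, text="Customer: I'm skeptical. Salesman: I understand.")

# The dataclass fields become the template variables.
user_message = USER_TEMPLATE.format(**asdict(transcript))
print(user_message)
```

Because the variables come straight from the dataclass, adding a field to Transcript makes it available to the prompt without touching the calling code.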
Next, we’ll set up a Chain, which LangChain describes as “[a sequence] of calls - whether to an LLM, a tool, or a data preprocessing step”. Basically, we provide a “template” and a model that the chain can use.
chain = prompt_template | ChatAnthropic(model=DEFAULT_ANTHROPIC_MODEL, temperature=0.0)
Now we can pass in one of our transcripts, and check the content that gets returned to verify that it looks how we’d expect.
{
  "transcript_id": 1,
  "objections": [
    {
      "objection_no": 1,
      "objector_name": "Customer",
      "objection_type": "other",
      "objection_summary": "Customer is skeptical about the effectiveness of the product due to previous unsuccessful attempts to resolve digestive issues."
    },
    {
      "objection_no": 2,
      "objector_name": "Customer",
      "objection_type": "other",
      "objection_summary": "Customer is concerned about potential side effects of the health supplements."
    }
  ]
}
Tah-dah–we’ve turned sales objections into a structured JSON response.
We can easily turn this into a pandas dataframe:
pd.DataFrame(json.loads(r.content)["objections"])
   objection_no  objector_name  objection_type  objection_summary
0  1             Customer       other           Customer is skeptical about the effectiveness ...
1  2             Customer       other           Customer is concerned about potential side eff...
or really any other data structure we might need. Once we have the structured output from the LLM, we have a lot of flexibility in how we use it–but more on that later!
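As one example of "any other data structure", here is a minimal sketch that parses the response text into typed objects and fails loudly if the model returns something malformed. The `Objection` class and `parse_objections` function are hypothetical names; the fields simply mirror the JSON keys shown above, and the sample string is a shortened, made-up response.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Objection:
    objection_no: int
    objector_name: str
    objection_type: str
    objection_summary: str


def parse_objections(raw: str) -> List[Objection]:
    """Parse the model's JSON text; raise ValueError if it is malformed."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(f"response was not valid JSON: {err}") from err
    try:
        # Missing or unexpected keys surface as KeyError/TypeError here.
        return [Objection(**item) for item in payload["objections"]]
    except (KeyError, TypeError) as err:
        raise ValueError(f"response had an unexpected shape: {err}") from err


# Made-up response in the same shape as the output above.
raw = """
{"transcript_id": 1,
 "objections": [
   {"objection_no": 1, "objector_name": "Customer",
    "objection_type": "other",
    "objection_summary": "Customer is concerned about potential side effects."}
 ]}
"""
objections = parse_objections(raw)
print(objections[0].objection_type)
```

A check like this is cheap insurance when running over many transcripts, since a single malformed response would otherwise only show up later as a confusing error downstream.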
A More LangChain-y Way to Do This
LangChain offers perhaps a better way to do this using Documents. This appears to me to offer something similar to our Transcript class above, but is more directly supported by LangChain and allows us to bulk process multiple documents in one chain invocation.
For fun, I’ll run through how that looks, starting by reworking our df -> Transcript conversion function. This is my first time giving the Documents method a try, so it’s probably gonna be a bit rough around the edges.
As I mentioned before, let’s rewrite our function to return List[Document]. Really, we just have to update the list comprehension to match how a Document looks.
from langchain_core.documents import Document


def convert_df_to_documents(df: DataFrame) -> List[Document]:
    """
    Take the transcript dataframe and turn it into a list of `Document` objects.
    """
    transcript_list = (
        df[["0", "1", "2", "3", "4", "5", "6", "7", "8"]]
        .fillna("")
        .agg(" ".join, axis=1)
        .tolist()
    )
    transcripts = [
        Document(page_content=text, metadata={"transcript_id": index})
        for index, text in enumerate(transcript_list)
    ]
    return transcripts


all_transcripts = convert_df_to_documents(transcripts_df)
example_transcripts = all_transcripts[0:10]
Now we have a list of documents, so let’s give LangChain’s way a try.
from langchain.chains.combine_documents import create_stuff_documents_chain

with open("objections-prompt-documents.txt", 'r') as prompt_file:
    prompt = prompt_file.read()

prompt_template = ChatPromptTemplate.from_template(prompt)
chain = create_stuff_documents_chain(
    ChatAnthropic(model=DEFAULT_ANTHROPIC_MODEL, temperature=0.0),
    prompt_template,
)
result = chain.invoke({"context": example_transcripts})
json.loads(result)
{'transcript_id': 1,
'objections': [{'objection_no': 1,
'objector_name': 'Customer',
'objection_type': 'competitor',
'objection_summary': 'Customer mentions being approached by multiple financial advisors and is comparing options'}]}
Not the expected result…I suspect (but have not confirmed) that LangChain can’t by default interact with the Document metadata the way it can with the Transcript class in the original method, so I’m unsure whether this approach fits my use case as is or whether it just needs more work to make it fit.
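One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: a stuff-documents chain formats each document and joins the results into a single `context` string, and `create_stuff_documents_chain` takes an optional `document_prompt` argument whose default only includes `page_content`. If that is what is happening here, `transcript_id` never reaches the model, and the ten transcripts arrive as one blob, which would be consistent with getting a single JSON object back. The sketch below imitates that behaviour in plain Python (it does not call LangChain; `stuff` and both templates are stand-ins) to show the difference a metadata-aware template makes.

```python
from typing import Dict, List

# Stand-ins for LangChain Documents: page_content plus a metadata dict.
docs: List[Dict] = [
    {"page_content": "Customer: Is it safe?", "metadata": {"transcript_id": 0}},
    {"page_content": "Customer: It costs too much.", "metadata": {"transcript_id": 1}},
]


def stuff(documents: List[Dict], document_template: str, separator: str = "\n\n") -> str:
    """Format every document with the template and join them into one string."""
    return separator.join(
        document_template.format(page_content=doc["page_content"], **doc["metadata"])
        for doc in documents
    )


# Content-only template: the transcript IDs are dropped.
content_only = stuff(docs, "{page_content}")
print(content_only)

# Metadata-aware template: each transcript keeps its ID.
with_ids = stuff(docs, "Transcript ID: {transcript_id}\n{page_content}")
print(with_ids)
```

If the hypothesis holds, passing a similar template as `document_prompt` and having the prompt ask for one JSON object per transcript might get closer to the earlier results, though that has not been tested here.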
Wait, why is this exciting?
Now that we have structured data from these conversations, there’s a lot we can do here! Some examples that come to mind which aren’t necessarily just objections are:
turn the objections data into a dashboard, enabling the sales team to keep a pulse on objections
analyze customer calls for problems the customer has mentioned, use this structured data as an input into a churn model
enhance product development by identifying feature gaps from customer calls
generate insights for sales/marketing alignment (eg. identify gaps in messaging)
for non-profits, analyze outreach calls with donors and identify common objections, messaging that works, or other common themes
and more!
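For the dashboard idea in particular, most of the raw material is just a tally. Here is a minimal sketch that counts objection types across several parsed responses, which is one way to answer the "what types of objections are the most frequent?" question from earlier. The `responses` list is made up and only keeps the keys the tally needs.

```python
from collections import Counter

# Made-up parsed responses in the same shape as the JSON output above.
responses = [
    {"transcript_id": 1, "objections": [
        {"objection_no": 1, "objection_type": "other"},
        {"objection_no": 2, "objection_type": "other"},
    ]},
    {"transcript_id": 2, "objections": [
        {"objection_no": 1, "objection_type": "competitor"},
    ]},
    {"transcript_id": 3, "objections": []},  # a call with no objections
]

# Count every objection type across all transcripts.
type_counts = Counter(
    objection["objection_type"]
    for response in responses
    for objection in response["objections"]
)

# Most frequent objection types first.
for objection_type, count in type_counts.most_common():
    print(f"{objection_type}: {count}")
```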
It’s also exciting because it can scale up pretty quickly and become a legitimate, recurring operation. Although our objections case is a bit of a toy example, it could theoretically be productionized in a manner that might look something like the diagram below [3].
Basically, the system:
Grabs transcripts from whatever sales tool and stores them for retrieval in a system like S3.
These transcripts can then be read and analyzed by the LLM.
Those objections (or other types of structured data we generate from the transcripts) can then be stored alongside the raw transcripts as JSON or some other format.
We could also create a table from those objections and store it in Redshift, which can be used to power sales intelligence in Tableau, Domo, or other BI tools.
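The storage and table steps above can be sketched locally. In this example a temporary directory stands in for S3 and a CSV string stands in for a warehouse load; `store_result`, `flatten`, and the file layout are all hypothetical, and a real version would swap in whatever storage and warehouse clients you actually use.

```python
import csv
import io
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def store_result(result: Dict, root: Path) -> Path:
    """Write one structured result as JSON (local stand-in for an S3 put)."""
    path = root / "objections" / f"{result['transcript_id']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result))
    return path


def flatten(results: List[Dict]) -> List[Dict]:
    """One row per objection, keyed by transcript_id, ready for a table load."""
    return [
        {"transcript_id": result["transcript_id"], **objection}
        for result in results
        for objection in result["objections"]
    ]


# Made-up structured output for a single transcript.
results = [
    {"transcript_id": 1, "objections": [
        {"objection_no": 1, "objection_type": "other"},
        {"objection_no": 2, "objection_type": "other"},
    ]},
]

# Store the JSON next to where the raw transcripts would live, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    stored = store_result(results[0], Path(tmp))
    reloaded = json.loads(stored.read_text())

# Flatten into rows and render them as CSV, as a stand-in for a table load.
rows = flatten(results)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Keeping `transcript_id` on every row is what lets the BI layer join objections back to the raw transcript or to other systems later.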
Wrapping Up
I hope this post has provided a useful illustration of how to leverage tools like LangChain to extract value from data that may otherwise go untapped. If you’re interested in implementing a similar solution for your team or organization, or if you have questions about the process, feel free to reach out.
[2] Real transcripts are much longer than this, since (depending on the stage the prospect is in) calls could be up to an hour long. But for a toy example these are fine.
[3] The diagram uses S3 and Redshift as examples because I’m most familiar with the AWS ecosystem, but those obviously aren’t the only vendors you could use.