Follow along in the Colab notebook
This Quick Start guide shows how to improve the performance of a Retrieval-Augmented Generation (RAG) Finance Question & Answer chatbot by using Pi's Query Fanout SDK to improve the queries sent to the finance database. Specifically, we will:
Here's the code notebook for this example if you'd like to review and play with the real code!
Before we start any modeling efforts, we will define what a high-quality answer from our chatbot looks like, and encapsulate this definition in our Pi Scoring System.
from withpi.types import ScoringSpec
summary_spec = ScoringSpec(
name = "Default",
description = "Summarizing Financial Documents to Answer Broad Queries\nThe Challenge of Financial Document Synthesis\nSummarizing financial documents to answer broad, complex queries is a demanding analytical task that requires several specialized skills:\n1. Financial Literacy and Contextual Understanding\nThe summarizer must possess strong financial literacy to interpret metrics, ratios, and terminology correctly within their proper business context. This includes understanding how different financial indicators relate to business performance and what they signal about a company's health.\n2. Cross-Document Integration\nFinancial insights often exist across multiple documents (quarterly reports, annual statements, investor presentations, industry analyses). The summarizer must connect related information scattered across these various sources to form a comprehensive picture.\n3. Temporal Analysis\nFinancial performance is inherently time-based. The summarizer must track performance trajectories, identify significant inflection points, distinguish between short-term fluctuations and long-term trends, and place current performance in historical context.\n4. Analytical Prioritization\nNot all financial information is equally relevant to the query at hand. The summarizer must filter through extensive data to extract and emphasize the most pertinent insights while appropriately contextualizing secondary factors that provide necessary background.\n5. Sector-Specific Contextualization\nFinancial metrics mean different things in different industries. The summarizer must adjust their interpretation based on sector-specific benchmarks, accounting for how metrics like margin expectations, growth rates, and valuation multiples vary across industries.\n6. Narrative Construction\nBeyond presenting data points, the summarizer must construct a coherent narrative that explains not just what happened but why it happened, connecting financial outcomes to strategic decisions, market conditions, and operational execution.\n7. Balancing Precision with Accessibility\nThe summarizer must maintain numerical accuracy while translating technical financial information into accessible insights, avoiding oversimplification while ensuring the summary remains intelligible to its intended audience.\n8. Objectivity and Balanced Perspective\nFinancial documents often present information with inherent corporate bias. The summarizer must maintain objectivity, balancing positive signals against risk factors, and providing a measured assessment that neither unduly emphasizes strengths nor disproportionately highlights weaknesses.\n9. Multi-Entity Comparative Analysis\nWhen queries involve multiple companies, the summarizer must provide fair comparisons that account for differences in company size, business model, market position, and reporting practices to deliver meaningful insights about relative performance.\n10. Query Relevance and Directness\nUltimately, the summary must directly address the original query, providing a concise yet comprehensive answer that synthesizes the most relevant financial information into a focused response tailored to the specific information need.\nThis complex analytical process transforms dense, technical financial documentation into targeted insights that directly address the information needs behind broad financial queries.",
dimensions = [
{
"label": "Comprehensiveness",
"description": "Comprehensiveness",
"sub_dimensions": [
{
"label": "Comprehensiveness",
"description": "Does the summary address each datapoint requested in the user prompt?",
"scoring_type": "PI_SCORER"
}
],
"weight": 0.3
},
{
"label": "Financial Understanding",
"description": "Financial Understanding",
"sub_dimensions": [
{
"label": "Financial Literacy",
"description": "Does the summary demonstrate a correct understanding of financial metrics and terminology?",
"scoring_type": "PI_SCORER"
},
{
"label": "Contextual Understanding",
"description": "Is the financial information interpreted within its proper business context?",
"scoring_type": "PI_SCORER"
}
],
"weight": 0.3
},
{
"label": "Integration and Analysis",
"description": "Integration and Analysis",
"sub_dimensions": [
{
"label": "Cross-Document Integration",
"description": "Does the summary integrate information from multiple financial documents?",
"scoring_type": "PI_SCORER"
},
{
"label": "Integration of Insights",
"description": "Are related financial insights connected to form a comprehensive picture?",
"scoring_type": "PI_SCORER"
},
{
"label": "Temporal Analysis",
"description": "Does the summary track financial performance over time?",
"scoring_type": "PI_SCORER"
},
{
"label": "Inflection Points",
"description": "Are significant inflection points in financial performance identified?",
"scoring_type": "PI_SCORER"
},
{
"label": "Trend Distinction",
"description": "Does the summary distinguish between short-term fluctuations and long-term trends?",
"scoring_type": "PI_SCORER"
}
],
"weight": 0.3
},
{
"label": "Narrative and Clarity",
"description": "Narrative and Clarity",
"sub_dimensions": [
{
"label": "Narrative Construction",
"description": "Does the summary construct a coherent narrative explaining financial outcomes?",
"scoring_type": "PI_SCORER"
},
{
"label": "Narrative Clarity",
"description": "Is the narrative clear in explaining why financial outcomes occurred?",
"scoring_type": "PI_SCORER"
},
{
"label": "Precision and Accessibility",
"description": "Is the summary numerically accurate while remaining accessible to the intended audience?",
"scoring_type": "PI_SCORER"
},
{
"label": "Avoiding Oversimplification",
"description": "Does the summary avoid oversimplifying technical financial information?",
"scoring_type": "PI_SCORER"
}
],
"weight": 0.3
},
{
"label": "Objectivity and Balance",
"description": "Objectivity and Balance",
"sub_dimensions": [
{
"label": "Objectivity",
"description": "Does the summary maintain objectivity and balance positive signals against risk factors?",
"scoring_type": "PI_SCORER"
},
{
"label": "Balanced Perspective",
"description": "Is a balanced perspective provided without undue emphasis on strengths or weaknesses?",
"scoring_type": "PI_SCORER"
}
],
"weight": 0.3
},
{
"label": "Comparative Analysis",
"description": "Comparative Analysis",
"sub_dimensions": [
{
"label": "Comparative Analysis",
"description": "Does the summary provide fair comparisons between multiple companies?",
"scoring_type": "PI_SCORER"
},
{
"label": "Multi-Entity Comparison",
"description": "Are differences in company size, business model, and market position accounted for in comparisons?",
"scoring_type": "PI_SCORER"
}
],
"weight": 0.3
},
{
"label": "Relevance and Focus",
"description": "Relevance and Focus",
"sub_dimensions": [
{
"label": "Query Relevance",
"description": "Does the summary directly address the original query?",
"scoring_type": "PI_SCORER"
},
{
"label": "Directness",
"description": "Is the summary concise yet comprehensive in answering the query?",
"scoring_type": "PI_SCORER"
},
{
"label": "Focused Response",
"description": "Is the response focused on synthesizing the most relevant financial information?",
"scoring_type": "PI_SCORER"
}
],
"weight": 0.3
},
{
"label": "Historical and Strategic Context",
"description": "Historical and Strategic Context",
"sub_dimensions": [
{
"label": "Historical Context",
"description": "Is current financial performance placed in historical context?",
"scoring_type": "PI_SCORER"
},
{
"label": "Strategic Decisions",
"description": "Are financial outcomes connected to strategic decisions, market conditions, and operational execution?",
"scoring_type": "PI_SCORER"
}
],
"weight": 0.3
}
]
)
Initializing our Scoring System
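The spec nests question-level sub-dimensions under weighted dimensions, but this guide does not say how Pi combines them into a single total. As a rough mental model only, the sketch below assumes each dimension's score is the mean of its sub-dimension scores and the total is the weight-normalized average across dimensions. The function `aggregate_score` and the sample sub-scores are hypothetical and are not part of Pi's SDK.

```python
from typing import Dict, List


def aggregate_score(dimensions: List[dict], sub_scores: Dict[str, float]) -> float:
    """Weight-normalized average of per-dimension mean scores (an assumed scheme)."""
    weighted_sum = 0.0
    total_weight = 0.0
    for dim in dimensions:
        labels = [sub["label"] for sub in dim["sub_dimensions"]]
        # Assumption: a dimension's score is the plain mean of its sub-dimensions.
        dim_score = sum(sub_scores[label] for label in labels) / len(labels)
        weighted_sum += dim["weight"] * dim_score
        total_weight += dim["weight"]
    return weighted_sum / total_weight


# Two dimensions in the same shape as the spec, with made-up sub-scores.
example_dimensions = [
    {"label": "Comprehensiveness", "weight": 0.3,
     "sub_dimensions": [{"label": "Comprehensiveness"}]},
    {"label": "Financial Understanding", "weight": 0.3,
     "sub_dimensions": [{"label": "Financial Literacy"},
                        {"label": "Contextual Understanding"}]},
]
example_sub_scores = {
    "Comprehensiveness": 0.2,
    "Financial Literacy": 0.8,
    "Contextual Understanding": 0.6,
}

total = aggregate_score(example_dimensions, example_sub_scores)
print(round(total, 2))
```

Because the weights are normalized by their sum, giving every dimension the same weight (as the spec does with 0.3) is equivalent to a simple average over dimensions under this assumed scheme.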
Imagine you're building an AI application that can answer complex questions about companies’ financial data. It's a lot of data, too much to fit in a model’s context window, so you're currently storing it in a simple vector database. Your initial approach to question answering over this data is straightforward.
You can see the vector database setup step in the accompanying code notebook for this walkthrough.
# Define the user's financial query about Quantum Dynamics revenue
user_query = "Get revenue for the last 2 years for Quantum Dynamics"
# Query the financial database to retrieve the relevant revenue data
retrieved_data = finance_database.query(user_query)
# Use the LLM to generate a natural language summary of the retrieved financial data
print(LLM.summarize(user_query, retrieved_data))
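The snippet relies on `finance_database` and `LLM` objects that are set up in the notebook. For readers without the notebook at hand, here is a minimal self-contained stand-in, assuming a toy keyword-overlap retriever in place of the vector database and leaving out the LLM call entirely. `ToyFinanceDatabase`, its placeholder documents, and the `top_k` parameter are hypothetical illustrations, not part of Pi's SDK or the notebook.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ToyFinanceDatabase:
    """Hypothetical stand-in for a vector database: ranks documents by token overlap."""

    def __init__(self, documents: List[str]):
        self.documents = documents

    def query(self, text: str, top_k: int = 1) -> List[str]:
        query_tokens = tokenize(text)
        # Sort documents by how many tokens they share with the query, best first.
        ranked = sorted(
            self.documents,
            key=lambda doc: len(query_tokens & tokenize(doc)),
            reverse=True,
        )
        return ranked[:top_k]


# Placeholder documents written for this sketch (not real filings).
finance_database = ToyFinanceDatabase([
    "Quantum Dynamics annual revenue figures for the last two fiscal years",
    "OceanicFusion Energy water usage efficiency report",
    "TerraSustain Agriculture environmental impact metrics",
])

user_query = "Get revenue for the last 2 years for Quantum Dynamics"
retrieved_data = finance_database.query(user_query)
print(retrieved_data)
```

A single narrow question like this one maps cleanly onto one document, which is why the one-query approach looks fine at first.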
You try a complex query:
“How does OceanicFusion Energy's performance in water usage efficiency and carbon offset metrics compare to TerraSustain Agriculture's environmental impact metrics, and how do analysts view both companies' sustainability efforts?”
Your system answers:
"The provided documents do not contain specific information on OceanicFusion Energy's performance in water usage efficiency and carbon offset metrics. However, they do highlight some challenges the company faces, such as damage to installations from extreme…" (see full response)
You know for a fact this information exists in your database, but you try out a few more queries and keep encountering the same issues in the answers (see examples in code notebook):
This response also scores very low with our Pi Scoring System: 0.43 on a 1-point scale (see the code notebook).
A standard database simply can’t handle a query like the one above. You need to break the query down into bite-sized pieces that your database can handle. We’ve built core SDK functionality just for this, and we call these bite-sized pieces “Query Fanouts”.
Pi’s Query Fanout endpoint transforms a single user question into multiple targeted natural language queries:
For example, the question...
“How does OceanicFusion Energy's performance in water usage efficiency and carbon offset metrics compare to TerraSustain Agriculture's environmental impact metrics, and how do analysts view both companies' sustainability efforts?”
...generates these natural language queries:
You update your approach to use Pi's Fanout SDK, adding a step that generates fanout queries for the original query and then uses each of those fanout queries to retrieve results from your database.
# Define the user's financial query about Quantum Dynamics revenue
user_query = "Get revenue for the last 2 years for Quantum Dynamics"
# Generate a list of granular fanouts to address the original user query
list_of_fanout_queries = withpi.generateFanouts(user_query)
# Query the financial database once per fanout and collect all of the retrieved data
retrieved_data = [finance_database.query(fanout_query) for fanout_query in list_of_fanout_queries]
# Use the LLM to generate a natural language summary of the retrieved financial data
print(LLM.summarize(user_query, retrieved_data))
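One detail the pseudocode glosses over is how the per-fanout results are combined before summarization. A plausible sketch, assuming each fanout is sent to the database separately and the hits are merged with duplicates removed in first-seen order, is shown here. `retrieve_with_fanouts` and `fake_database` are hypothetical, and the fanout strings are hand-written for illustration rather than produced by Pi's SDK.

```python
from typing import Callable, Dict, List


def retrieve_with_fanouts(
    fanout_queries: List[str],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Run every fanout query and merge the hits, dropping duplicates in first-seen order."""
    merged: List[str] = []
    seen = set()
    for fanout_query in fanout_queries:
        for document in retrieve(fanout_query):
            if document not in seen:
                seen.add(document)
                merged.append(document)
    return merged


# Hypothetical lookup table standing in for finance_database.query.
fake_database: Dict[str, List[str]] = {
    "OceanicFusion Energy water usage efficiency": ["doc_oceanic_water"],
    "OceanicFusion Energy carbon offset metrics": ["doc_oceanic_carbon", "doc_oceanic_water"],
    "TerraSustain Agriculture environmental impact metrics": ["doc_terra_impact"],
}

fanouts = list(fake_database)
retrieved_data = retrieve_with_fanouts(fanouts, lambda q: fake_database.get(q, []))
print(retrieved_data)
```

Deduplicating matters because overlapping fanouts often return the same document, and repeating it in the summarization prompt wastes context without adding information.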
Query: How does OceanicFusion Energy's performance in water usage efficiency and carbon offset metrics compare to TerraSustain Agriculture's environmental impact metrics, and how do analysts view both companies' sustainability efforts?

Before Fanouts (bolding indicates incorrect claims about missing information) | After Fanouts (bolding indicates correctly retrieved information) |
---|---|
"The provided documents do not contain specific information on OceanicFusion Energy's performance in water usage efficiency and carbon offset metrics. However, they do highlight some challenges the company faces, such as damage to installations from extreme weather events, which raises concerns about the durability and sustainability of tidal power. Additionally, an analyst from UBS has given OceanicFusion a ""Neutral"" rating, noting that while tidal power is promising, it faces economic challenges compared to solar and wind energy, and the company needs additional capital to scale. In contrast, TerraSustain Agriculture's environmental impact metrics are more detailed in the documents. The company is noted for its precision farming technologies, which have led to significant water savings (40%), yield improvements (23%), and reductions in fertilizer use (27%). These metrics indicate a positive environmental impact, particularly in terms of water usage efficiency. TerraSustain's efforts have been positively received on social media, with high engagement and positive sentiment, suggesting strong support for their sustainability initiatives. Overall, TerraSustain Agriculture appears to have a more quantifiable and positively perceived impact on environmental sustainability compared to OceanicFusion Energy, which faces challenges in scaling and economic viability despite the potential of tidal power. Analysts view TerraSustain's sustainability efforts favorably, while OceanicFusion's sustainability efforts are seen as promising but economically challenging." | "OceanicFusion Energy and TerraSustain Agriculture both focus on sustainability, but they operate in different sectors and thus have distinct environmental impact metrics. OceanicFusion Energy:
TerraSustain Agriculture:
Analysts' Views on Sustainability Efforts:
In summary, while OceanicFusion Energy contributes significantly to carbon offsetting through renewable energy, TerraSustain Agriculture excels in water usage efficiency and carbon sequestration through its precision farming technologies. Analysts recognize both companies' sustainability efforts, with TerraSustain receiving slightly more favorable views due to its direct impact on agricultural efficiency and innovative programs." |
Total Pi Score (see details): 0.43 | Total Pi Score (see details): 0.67 |
You can see more side-by-side diffs and scores in the code notebook.
Query: What patterns emerge when comparing capacity utilization metrics across different industries - from VerticalHarvest's grow area utilization to FusionGrid's renewable capacity deployment to QuasarPharm's clinical trial enrollment rates?

Before Fanouts (bolding indicates incorrect claims about missing information) | After Fanouts (bolding indicates correctly retrieved information) |
---|---|
"The provided documents do not include information about VerticalHarvest or QuasarPharm, so I can only analyze the data available for FusionGrid Energy. Here's a summary of the capacity utilization metrics for FusionGrid Energy:
Overall, the pattern for FusionGrid Energy shows a strategic expansion in both renewable capacity and battery storage, coupled with maintaining high grid reliability. This suggests a focus on scaling up their operations while ensuring stability and reliability in energy distribution. Without data on VerticalHarvest and QuasarPharm, a direct comparison across different industries cannot be made." | "When comparing capacity utilization metrics across the different industries represented by VerticalHarvest Agritech, FusionGrid Energy, and QuasarPharm, several patterns emerge:
Overall Patterns:
These patterns highlight the importance of strategic capacity expansion and resource optimization across different industries to drive growth and maintain competitive advantages." |
Total Pi Score (see details): 0.65 | Total Pi Score (see details): 0.60 |
Query: How do technology companies like CyberShield Solutions and MindSphere Technologies differ from healthcare companies like QuasarPharm and NanoHarvest Therapeutics in balancing R&D expenditures against short-term profitability, based on their financial metrics and analyst recommendations?

Before Fanouts (bolding indicates incorrect claims about missing information) | After Fanouts (bolding indicates correctly retrieved information) |
---|---|
"Based on the provided data, we can analyze MindSphere Technologies, a technology company, to understand its approach to balancing R&D expenditures against short-term profitability. However, there is no specific data on CyberShield Solutions, QuasarPharm, or NanoHarvest Therapeutics in the documents provided. Therefore, the analysis will focus solely on MindSphere Technologies. MindSphere Technologies:
General Insights:
In summary, technology companies like MindSphere Technologies balance R&D expenditures with short-term profitability by leveraging innovation and strategic partnerships, while healthcare companies may focus more on the long-term potential of their R&D investments due to the nature of the industry." | "Based on the provided data, we can observe some key differences in how technology companies like CyberShield Solutions and MindSphere Technologies balance R&D expenditures against short-term profitability compared to healthcare companies like QuasarPharm. CyberShield Solutions (Technology - Cybersecurity)
MindSphere Technologies (Technology - Artificial Intelligence)
QuasarPharm (Healthcare - Biopharmaceuticals)
NanoHarvest Therapeutics (Healthcare - Biopharmaceuticals)
Summary
In conclusion, technology companies tend to achieve a more balanced approach between R&D and profitability, while healthcare companies often accept short-term financial volatility in pursuit of long-term breakthroughs." |
Total Pi Score (see details): 0.67 | Total Pi Score (see details): 0.78 |
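To read the three comparisons together, it can help to compute the score change per query. The short sketch below uses only the Total Pi Scores reported in the tables; the short query labels are shorthand introduced here, not names from the notebook.

```python
# (before, after) Total Pi Scores copied from the three comparison tables.
scores = {
    "sustainability comparison": (0.43, 0.67),
    "capacity utilization": (0.65, 0.60),
    "R&D vs. profitability": (0.67, 0.78),
}

# Per-query change and the mean change, rounded to two decimals.
deltas = {name: round(after - before, 2) for name, (before, after) in scores.items()}
mean_delta = round(sum(deltas.values()) / len(deltas), 2)

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
print(f"mean change: {mean_delta:+.2f}")
```

Two of the three queries improve, while the capacity utilization query scores slightly lower after fanouts even though its answer now covers all three companies, so it is worth inspecting the per-dimension details rather than relying on the total alone.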
This Quick Start guide demonstrates how to dramatically improve the quality of a Finance Q&A bot by implementing Query Fanout to generate multiple targeted natural language queries for complex financial questions.
The result? A financial assistant that provides more accurate, comprehensive answers while still using your existing database of financial information.
Ready to go beyond the basics? More walkthroughs coming soon: