Context
I would like to create a GPT agent. The agent should be able to fetch data from three predefined CSV files, so I would like to create three function tools, one for each CSV file.
Expectation
The AI agent should understand exactly what each CSV file is for, and exactly which columns and data it can get from each file.
I think that providing the CSV structure and column definitions when creating each tool will make the LLM's results more efficient and accurate.
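To make the idea of "providing the CSV structure and column definitions when creating the tool" concrete, here is a minimal sketch that renders a column list into a description string. `ColumnSpec`, `build_tool_description`, and the `orders` columns are all hypothetical names and data invented for illustration; they are not part of any library or of the real CSV files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSpec:
    """Hypothetical description of one CSV column."""
    name: str
    dtype: str
    meaning: str


def build_tool_description(purpose: str, columns: list[ColumnSpec]) -> str:
    """Render a purpose line plus one line per column as a single string."""
    lines = [purpose, "Columns:"]
    lines += [f"- {c.name} ({c.dtype}): {c.meaning}" for c in columns]
    return "\n".join(lines)


# Hypothetical schema for one of the three files; replace with the real columns.
ORDERS_COLUMNS = [
    ColumnSpec("order_id", "int", "Unique identifier of the order"),
    ColumnSpec("customer", "str", "Name of the customer who placed the order"),
    ColumnSpec("total", "float", "Order total in USD"),
]

orders_description = build_tool_description(
    "Look up rows in the orders CSV file.", ORDERS_COLUMNS
)
print(orders_description)
```

A string built this way could be passed as the `description=` argument when the tool is created, so the model sees the column definitions before it decides which tool to call.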
Problem Statement
How do I properly define the CSV structure and column definitions?
The code I tried is listed below.
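# Assumes LangChain's StructuredTool; get_first_n_rows is defined in the helper code below
from langchain_core.tools import StructuredTool

csv1_inspection_tool = StructuredTool.from_function(
    func=get_first_n_rows,
    name="InspectCSVFile",
    description="Explore the contents and structure of a table document, displaying its column names and the first n rows, with n defaulting to 3.",
)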
import pandas as pd


def get_csv_filename(filename: str) -> str:
    """Get the CSV file name."""
    # Read the CSV file; this fails early if it is missing or unreadable
    pd.read_csv(filename)
    # A CSV has no sheet name, so we just return the file name
    return f"The file name of the CSV is '{filename}'"


def get_column_names(filename: str) -> str:
    """Get all column names from a CSV file."""
    # Read the CSV file
    df = pd.read_csv(filename)
    # Join the column names, one per line (str() guards against non-string labels)
    column_names = '\n'.join(map(str, df.columns.to_list()))
    result = f"The file '{filename}' has columns:\n\n{column_names}"
    return result


def get_first_n_rows(filename: str, n: int = 3) -> str:
    """Get the first n rows of a CSV file."""
    result = get_csv_filename(filename) + "\n\n"
    result += get_column_names(filename) + "\n\n"
    df = pd.read_csv(filename)
    # to_string() already returns the rows separated by newlines
    n_lines = df.head(n).to_string(index=False, header=True)
    result += f"This file '{filename}' has first {n} lines of sample:\n\n{n_lines}"
    return result
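Since the three CSV files are predefined, one option is to bind each file name to its own function so the model only has to supply `n`. The following is a rough sketch of that idea using only the standard library; `make_preview_tool` and the temporary demo file are hypothetical and stand in for the real files and helpers above.

```python
import csv
import os
import tempfile
from itertools import islice


def make_preview_tool(filename: str):
    """Return a function that previews one fixed CSV file (the file name is bound here)."""

    def preview(n: int = 3) -> str:
        with open(filename, newline="") as f:
            reader = csv.reader(f)
            header = next(reader, [])       # first row holds the column names
            rows = list(islice(reader, n))  # read at most n data rows
        lines = [", ".join(header)] + [", ".join(row) for row in rows]
        return (
            f"File '{filename}' columns and first {len(rows)} rows:\n"
            + "\n".join(lines)
        )

    return preview


# Demo with a throwaway file standing in for one of the predefined CSVs.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    csv.writer(tmp).writerows(
        [["id", "name"], ["1", "a"], ["2", "b"], ["3", "c"], ["4", "d"]]
    )
    demo_path = tmp.name

preview_demo = make_preview_tool(demo_path)
demo_output = preview_demo(n=2)
print(demo_output)
os.remove(demo_path)
```

Each function returned by `make_preview_tool` could then be registered as a separate tool with its own name and description, one per CSV file.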