I have been using Pydantic to define schemas for GenAI data outputs with LangChain. Creating nested objects is simple enough. Here is a mock version of my schema:
from pydantic import BaseModel, Field

# People is defined first so that Models can reference it
class People(BaseModel):
    name: str = Field(description="name of the person being discussed")
    title: str = Field(description="job title of the person being discussed")

class Models(BaseModel):
    people: list[People] = Field(description="names of the people related fields mentioned in the text.")
And we get a nice response consistently to the effect of:
{
    "people": [
        {
            "name": "John",
            "title": "black-smith"
        },
        {
            "name": "Jane",
            "title": "deer hunter"
        }
    ]
}
What I would like to do is define a nested field type here that further defines the information on each field given. I find this hard to describe but the output I am looking for is:
{
    "people": [
        {
            "name": {"value": "John", "context": "his name was John,"},
            "title": {"value": "black-smith", "context": "he was a black-smith"}
        },
        {
            "name": {"value": "Jane", "context": "Jane like to hunt deer in her spare time"},
            "title": {"value": "deer hunter", "context": "Jane like to hunt deer in her spare time"}
        }
    ]
}
Getting the models to output this with manually written prompts is simple, and they consistently do a good job of providing the context. But the best I can come up with for structuring this in Pydantic is tedious and repetitive:
# People is defined first so that Models can reference it
class People(BaseModel):
    name: dict = Field(description="an object with 2 keys. The first being 'value' which includes the name of the person being discussed, and the second being 'context' which includes the snippet from the text where the value was taken from")
    title: dict = Field(description="an object with 2 keys. The first being 'value' which includes the job title of the person being discussed, and the second being 'context' which includes the snippet from the text where the value was taken from")

class Models(BaseModel):
    people: list[People] = Field(description="names of the people related fields mentioned in the text.")
I am looking for a way to define a Field type in Pydantic that abstracts this, something like:
class Models(BaseModel):
    people: list[People] = Field(description="names of the people related fields mentioned in the text.")

class People(BaseModel):
    name: dict = NestedField(description="the name of the person being discussed")
    title: dict = NestedField(description="the job title of the person being discussed")

class NestedField:
    value: str = Field(description="the value of the field (using the parent description)")
    context: str = Field(description="snippet of text from the text where the value was found")
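For reference, a minimal sketch of one possible direction, assuming Pydantic v2 and Python 3.9+: `NestedField` becomes an ordinary `BaseModel` used as the field type, while the per-field description stays on the parent's `Field`. The `sample` dictionary and the `parsed` variable are illustrative names only, and whether LangChain forwards both levels of description to the model is not verified here.

```python
from pydantic import BaseModel, Field


class NestedField(BaseModel):
    """A value plus the snippet of source text it was taken from."""
    value: str = Field(description="the value of the field, as described by the parent field")
    context: str = Field(description="snippet from the text where the value was found")


class People(BaseModel):
    # The shape comes from NestedField; only the per-field meaning is written here.
    name: NestedField = Field(description="the name of the person being discussed")
    title: NestedField = Field(description="the job title of the person being discussed")


class Models(BaseModel):
    people: list[People] = Field(description="names of the people related fields mentioned in the text.")


# Illustrative data in the desired output shape.
sample = {
    "people": [
        {
            "name": {"value": "John", "context": "his name was John,"},
            "title": {"value": "black-smith", "context": "he was a black-smith"},
        },
        {
            "name": {"value": "Jane", "context": "Jane like to hunt deer in her spare time"},
            "title": {"value": "deer hunter", "context": "Jane like to hunt deer in her spare time"},
        },
    ]
}

# Validate the nested structure and read one value back.
parsed = Models.model_validate(sample)
print(parsed.people[0].name.value)
```

Calling `Models.model_json_schema()` shows what a schema consumer would see: the `value` and `context` descriptions are written once on `NestedField`, and each parent field carries its own description alongside the reference to it.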
Any advice would be appreciated, thank you.