I’m trying to use Gemini 1.5 Flash to generate JSON data from text documents. Originally, I was running into the following recitation (i.e., copyright-related) errors:
Unexpected error during attempt 5: Cannot get the response text.
Cannot get the Candidate text.
Response candidate content has no parts (and thus no text). The candidate is likely blocked by the safety filters.
Content:
{}
Candidate:
{
"finish_reason": "RECITATION",
"safety_ratings": [
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"probability_score": 0.18271701,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severity_score": 0.16000336
},
{
....
To adjust, I tweaked some of the model’s settings (temperature, top_p) and the prompt itself, and that eliminated the issue locally. However, after moving to production and running the app in my Docker container, the model started raising this recitation error on nearly every request. I’m unsure why: I tried tweaking the settings and my prompt again, but I’ve been scratching my head over why it suddenly fails once containerized.
Below is my code:
MODEL PROMPT FUNCTION
import json
import os
import re
import time

import vertexai
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    HarmBlockThreshold,
    HarmCategory,
    Part,
    SafetySetting,
)

def get_model_output(prompt, filepaths):
    vertexai.init(project=os.getenv("PROJECT_ID"), location="us-central1")
    model = GenerativeModel(model_name="gemini-1.5-flash-001")
    parts = [Part.from_uri(filepath["gcs_link"], mime_type=filepath["mimetype"]) for filepath in filepaths]
    parts.append(prompt)

    # Create a GenerationConfig object with the specified parameters
    generation_config = GenerationConfig(
        temperature=0.5,
        top_p=0.5,
        max_output_tokens=5000,
        # You can add other parameters here as needed
    )

    safety_settings = [
        SafetySetting(
            category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
            threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        ),
        SafetySetting(
            category=HarmCategory.HARM_CATEGORY_HARASSMENT,
            threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        ),
        # Add other categories as needed
    ]

    for attempt in range(5):
        print(f'Attempt {attempt + 1} to get model output')
        try:
            token_count = model.count_tokens(parts)
            print(f"Token count: {token_count}")
            response = model.generate_content(
                parts,
                generation_config=generation_config,
                safety_settings=safety_settings,
            )
            # Grab everything between the first '{' and the last '}'
            json_match = re.search(r'\{[\s\S]*\}', response.text)
            if json_match:
                json_string = json_match.group(0)
                try:
                    study_guide_data = json.loads(json_string)  # parsed only to validate
                    print("DEBUG CONTENT SUCCESS: Successfully generated content and parsed JSON")
                    return json_string
                except json.JSONDecodeError as e:
                    print(f"JSON decode error: {e}")
                    print("Extracted JSON string:", json_string)
                    # Continue to next attempt
            else:
                print("No JSON object found in the response")
        except Exception as e:
            error_message = str(e)
            truncated_message = error_message[:197] + "..." if len(error_message) > 200 else error_message
            print(f"Unexpected error during attempt {attempt + 1}: {truncated_message}")

        # If we reach here, the current attempt failed. We'll continue to the next one.
        print(f"Attempt {attempt + 1} failed. Moving to next attempt.")
        if attempt < 4:  # Don't sleep after the last attempt
            print("Waiting 5 seconds before next attempt...")
            time.sleep(5)

    # If we've exhausted all attempts without success
    raise Exception("Failed to get valid JSON after 5 attempts")
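A side note on the extraction step: the regex grabs everything from the first `{` to the last `}`, which breaks if the model appends prose after the JSON or emits more than one object. A pure-Python alternative I’ve considered (my own helper, not part of any SDK) decodes the first balanced JSON object instead:

```python
import json

def extract_first_json(text: str):
    """Return the first parseable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    for start in range(len(text)):
        if text[start] != "{":
            continue
        try:
            # raw_decode stops at the end of the first complete object,
            # so trailing prose after the JSON is ignored.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            continue
    return None
```

This returns the parsed object directly, so the separate `json.loads` validation step isn’t needed.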
PROMPT
base_prompt = f"""
<PRELUDE>
The user is a student in {grade_level_prompt}. They may upload one or more documents; please use all of them as needed when generating the study materials.
</PRELUDE>
<OBJECTIVE_AND_PERSONA>
You are a student in {grade_level_prompt} studying for an exam.
</OBJECTIVE_AND_PERSONA>
<CONSTRAINTS>
Further constraints:
1. Read all the documents carefully and understand them.
2. Read every Lesson in every document.
3. Prioritize all Lessons equally.
4. DO NOT PLAGIARIZE ANY OF THE MATERIALS FROM THE DOCUMENTS OR FROM ONLINE RESOURCES; write original content while still producing engaging study materials.
5. DO NOT USE SOURCES FROM THE INTERNET; ONLY use information from the documents to generate study materials.
Again, a repeat of the last two points:
DO NOT USE THE INTERNET
DO NOT PLAGIARIZE
</CONSTRAINTS>
"""
# CALL 1: FLASHCARDS
if flashcards:
    flashcard_prompt = f"""
<INSTRUCTIONS>
To complete the task, you need to do the following:
Group together related information across the documents in a cohesive summary. Do not include this summary, but refer to it when creating the flashcards.
DO NOT PLAGIARIZE; DO NOT DIRECTLY TAKE QUESTIONS OR PHRASES FROM THE DOCUMENTS. Rather, use the documents as a reference when creating the summary and flashcards.
You need to come up with {numCards} flash cards. The flash cards should explain key concepts from the documents.
Output the flashcards in the following JSON format:
{{
  "flashcards": [
    {{
      "front": "Question or concept goes here",
      "back": "Answer or explanation goes here"
    }},
    {{
      "front": "Another question",
      "back": "Another answer"
    }}
  ]
}}
Important:
1. Use exactly this JSON structure.
2. The "flashcards" array should contain {numCards} objects, each representing a flashcard.
3. Each flashcard object must have "front" and "back" keys.
4. Ensure the JSON is valid - use double quotes for strings and keys.
5. Do not include any explanatory text outside the JSON structure.
6. The entire output should be valid JSON that can be parsed by a JSON parser.
</INSTRUCTIONS>
"""
    prompt = base_prompt + flashcard_prompt
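One thing I’m also considering, which would sidestep the regex extraction entirely: asking Vertex AI for JSON output directly. This is an assumption on my part that `response_mime_type` is supported by the `gemini-1.5-flash-001` model; the fragment below reuses the same `GenerationConfig` class as in `get_model_output`:

```python
# Assumes the same GenerationConfig class used in get_model_output.
# response_mime_type="application/json" asks Gemini 1.5 to emit only
# valid JSON, so no "{...}" extraction regex is needed on the response.
generation_config = GenerationConfig(
    temperature=0.5,
    top_p=0.5,
    max_output_tokens=5000,
    response_mime_type="application/json",
)
```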
DOCKER
FROM python:3.9-slim
# Python version should be bumped to 3.12 or the latest release
WORKDIR /app
COPY requirements.txt .
COPY .env .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
# Command to run the application
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
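Since the behavior only changes inside the container, the first thing worth ruling out is environment drift: `PROJECT_ID`, the region, or credentials resolving differently under Docker than locally. A small stdlib-only check I can run in both environments and diff — the variable names are the ones my app happens to use (`PROJECT_ID` from `get_model_output`, plus the standard ADC variable), adjust as needed:

```python
import os

# Print the env vars the app depends on so local vs. container runs can
# be compared. GOOGLE_APPLICATION_CREDENTIALS is the standard variable
# for Application Default Credentials; PROJECT_ID is what get_model_output reads.
def report_env(names=("PROJECT_ID", "GOOGLE_APPLICATION_CREDENTIALS")):
    return {name: os.getenv(name, "<unset>") for name in names}

if __name__ == "__main__":
    for name, value in report_env().items():
        print(f"{name}={value}")
```

If the container prints `<unset>` for a variable that is set locally, the model calls may be hitting a different project or falling back to different defaults, which could explain divergent recitation behavior.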