I'm new to this platform and a newbie to programming in general, so I'm not sure if anyone can help me with the following. I managed to build a simple AI voice chatbot that I can place on a website. You can open and close the chat window via the chat icon or the descriptive tooltip next to it. It's powered by OpenAI Whisper and GPT-3.5 Turbo, which I found to be a good combination cost-wise. You can swap GPT-3.5 Turbo out for Groq, but I found the responses sub-optimal compared with 3.5 Turbo.

The bot works fine in both voice and text, but I have run into a problem I cannot seem to solve. When you talk or text the bot and it starts speech recognition or playback of the output.mp3 audio file, you can still open and close the chat window via the chat icon or tooltip, which re-enables the buttons. The user can then tap the buttons and interact with the chatbot while it is busy with speech recognition or audio playback. When that happens, the bot responds to its own speech, or doubles up on the user's text input or speech, layering everything on top of each other and creating an incoherent mess. I managed to sort out the listening case by introducing an isListening check at the top of toggleChatContainer, but the playback case is still broken (I've added a short note after the code on where I think the re-enabling happens).

Can someone look at the code and help me fix it so it works as it should? Below is index.html, the frontend, followed by app.py, the engine running the backend. I should also highlight that the problem is in the frontend, not the backend; the backend is working as it should.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/logo.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Talking AI Assistant</title>
<style>
/* Styles for the floating chat icon and chat container */
/* Style for the floating chat icon */
#chatIcon {
position: fixed;
bottom: 20px;
right: 20px;
z-index: 9999;
background-color: #007bff;
color: #fff;
width: 50px;
height: 50px;
border-radius: 50%;
text-align: center;
line-height: 50px;
cursor: pointer;
font-size: 25px; /* Default size for desktop */
}
/* Tooltip style */
#chatTooltip {
font-size: 14px;
position: fixed;
bottom: 25px;
right: 85px; /* Adjust as needed */
background-color: #007bff;
color: #fff;
padding: 10px;
border-radius: 5px;
z-index: 9999;
width: 220px; /* Fixed width for the tooltip */
white-space: normal; /* Allow text to wrap */
display: flex; /* Use flexbox layout */
justify-content: center; /* Center horizontally */
align-items: center; /* Center vertically */
text-align: center; /* Align text within the tooltip */
cursor: pointer; /* Change cursor to pointer on hover */
}
#chatContainer {
display: none;
position: fixed;
bottom: 100px;
right: 50px;
z-index: 9999;
width: 25%;
max-width: 320px;
height: 550px;
overflow-y: auto;
overflow-x: hidden;
background-color: white;
color: black;
border-radius: 5px;
}
@media only screen and (max-width: 768px) {
#chatIcon {
width: 40px;
height: 40px;
line-height: 38px;
}
#chatContainer {
width: 300px;
height: 500px;
}
}
/* Style for the text input */
#textInput {
margin-top: 20px;
width: calc(100% - 20px);
padding: 8px;
box-sizing: border-box;
margin-bottom: 10px;
border: 1px solid #ccc;
border-radius: 4px;
resize: vertical;
overflow-y: auto;
height: 90px;
font-size: 14px;
}
body {
overflow-x: hidden;
margin: 0;
padding: 0;
}
/* Style for the processing indicator */
#processingIndicator {
display: none;
position: fixed;
top: 20px;
right: 160px;
z-index: 9999;
text-align: center;
margin-bottom: 10px;
}
/* Style for chat entries */
.chat-entry {
margin-bottom: 8px;
}
.user-text {
color: black; /* Example color for user's text */
word-wrap: break-word; /* Allow long words to wrap */
padding-left: 20px;
padding-right: 20px;
font-size: 14px;
}
.ai-text {
color: green; /* Example color for AI's text */
word-wrap: break-word; /* Allow long words to wrap */
padding-left: 20px;
padding-right: 20px;
font-size: 14px;
}
.chat-title {
padding-left: 20px;
padding-right: 20px;
padding-top: 20px;
padding-bottom: 10px;
font-weight: bold;
font-size: 18px;
}
.chat-paragraph {
padding-left: 20px;
padding-right: 20px;
font-size: 14px;
}
#startButtonContainer {
padding-left: 20px;
padding-top: 20px;
}
#startButton {
font-size: 14px;
background-color: #007bff;
padding: 10px;
border-radius: 6px;
color: white;
transition: background-color 0.3s ease; /* Adding transition for smooth effect */
}
#startButton:hover {
background-color: #0056b3; /* New background color on hover */
}
#textInput {
background-color: white;
}
#sendTextButton {
font-size: 14px;
background-color: #007bff;
padding: 10px;
border-radius: 6px;
color: white;
transition: background-color 0.3s ease; /* Adding transition for smooth effect */
margin-bottom: 20px;
}
#sendTextButton:hover {
background-color: #0056b3; /* New background color on hover */
}
</style>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.jsx"></script>
<!-- Floating chat icon -->
<div id="chatIcon">????</div>
<!-- Tooltip label -->
<div id="chatTooltip">Click here to speak to the AI assistant</div>
<!-- Chat container -->
<div id="chatContainer" class="shadow p-3 rounded">
<h5 class="chat-title">Talking AI Assistant</h5>
<p class="chat-paragraph">Speak to the assistant by clicking on "Start Listening" or type your message below and hit "Send".</p>
<div id="startButtonContainer">
<button id="startButton" class="btn btn-primary talk-btn">????️ Start Listening</button>
<textarea id="textInput" placeholder="Type your message..." rows="3"></textarea>
<button id="sendTextButton" class="btn btn-primary">Send</button>
</div>
<div id="echoText" class="mb-3"></div>
</div>
<!-- Processing indicator -->
<div id="processingIndicator">
<div class="spinner-border text-primary" role="status">
<span class="sr-only">Loading...</span>
</div>
<p>Processing...</p>
</div>
<script>
const chatIcon = document.getElementById('chatIcon');
const chatTooltip = document.getElementById('chatTooltip');
const chatContainer = document.getElementById('chatContainer');
const startButton = document.getElementById('startButton');
const textInput = document.getElementById('textInput');
const echoTextEl = document.getElementById('echoText');
const sendTextButton = document.getElementById('sendTextButton');
const processingIndicator = document.getElementById('processingIndicator');
let recognition = new (window.SpeechRecognition || window.webkitSpeechRecognition)();
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;
recognition.continuous = true;
let isListening = false;
let chatOpened = false;
let listeningTimeout = null;
const TIMEOUT_DELAY_MS = 15000; // 15 seconds in milliseconds
document.addEventListener('DOMContentLoaded', startSession);
chatIcon.addEventListener('click', toggleChatContainer);
chatTooltip.addEventListener('click', toggleChatContainer); // Add click listener to chatTooltip
startButton.addEventListener('click', toggleListening);
sendTextButton.addEventListener('click', sendMessage);
textInput.addEventListener('keypress', function(event) {
    if (event.key === 'Enter') {
        sendMessage();
    }
});
function toggleChatContainer() {
    if (isListening) {
        // If speech recognition is active, do nothing
        return;
    }
    chatOpened = !chatOpened;
    if (chatOpened) {
        chatContainer.style.display = 'block';
        // Re-enable buttons when chat container is opened
        startButton.disabled = false;
        sendTextButton.disabled = false;
        if (resetButton) {
            resetButton.disabled = false;
        }
    } else {
        chatContainer.style.display = 'none';
        recognition.stop();
    }
}
function toggleListening() {
    if (isListening) {
        recognition.stop();
    } else {
        recognition.start();
    }
}
function sendMessage() {
    const text = textInput.value.trim();
    if (text) {
        processSpeech(text);
        textInput.value = '';
        // Disable buttons after sending message
        startButton.disabled = true;
        sendTextButton.disabled = true;
        resetButton.disabled = true;
    }
}
recognition.onstart = function () {
    isListening = true;
    startButton.disabled = true;
    startButton.textContent = "Listening...";
    sendTextButton.disabled = true;
    resetButton.disabled = true;
    // Set a timeout to stop recognition after TIMEOUT_DELAY_MS milliseconds
    listeningTimeout = setTimeout(() => {
        if (isListening) {
            recognition.stop();
            isListening = false;
            startButton.textContent = "🎙️ Start Listening";
            reEnableButtons();
            resetButton.disabled = false;
            recognition.stop();
            isListening = false;
        }
    }, TIMEOUT_DELAY_MS);
};
recognition.onend = function () {
    isListening = false;
    startButton.textContent = "🎙️ Start Listening";
    clearTimeout(listeningTimeout);
};
recognition.onresult = function (event) {
    clearTimeout(listeningTimeout);
    const currentResultIndex = event.resultIndex;
    for (let i = currentResultIndex; i < event.results.length; ++i) {
        if (event.results[i].isFinal) {
            const transcript = event.results[i][0].transcript.trim();
            startButton.disabled = true;
            sendTextButton.disabled = true;
            resetButton.disabled = true;
            processSpeech(transcript);
            recognition.stop();
            break;
        }
    }
};
recognition.onerror = function (event) {
    console.error('Speech recognition error', event.error);
    clearTimeout(listeningTimeout);
    resetButton.disabled = false;
    reEnableButtons();
};
function startSession() {
    // You can add any initialization logic here if needed
    // Create and append the reset button
    const resetButton = document.createElement('button');
    resetButton.id = 'resetButton';
    resetButton.className = 'btn btn-secondary';
    resetButton.textContent = 'Enable Disabled Buttons';
    startButtonContainer.appendChild(resetButton);
    // Add event listener to the reset button
    resetButton.addEventListener('click', function() {
        startButton.disabled = false;
        sendTextButton.disabled = false;
        resetButton.disabled = false;
    });
    resetButton.style.display = 'none'; // Hide the reset button
}
function processSpeech(text) {
    const userEntry = document.createElement('div');
    userEntry.className = 'chat-entry user-text'; // Apply user-text class
    userEntry.innerHTML = `<strong>You:</strong> ${text}`;
    // Prepend the user's chat entry to the top of the chat container
    echoTextEl.insertBefore(userEntry, echoTextEl.firstChild);
    fetch('http://localhost:5000/process-speech', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
        },
        body: JSON.stringify({ text: text }),
    })
    .then(response => {
        if (!response.ok) {
            throw new Error('Network response was not ok');
        }
        return response.json();
    })
    .then(data => {
        const aiText = data.response;
        const aiEntry = document.createElement('div');
        aiEntry.className = 'chat-entry ai-text';
        aiEntry.innerHTML = `<strong>AI:</strong> ${aiText}`;
        // Prepend the AI's chat entry to the top of the chat container
        echoTextEl.insertBefore(aiEntry, echoTextEl.firstChild);
        speak(aiText);
    })
    .catch(error => {
        console.error('Error during AI processing:', error);
        reEnableButtons();
    });
}
function speak(text) {
    processingIndicator.style.display = 'block';
    startButton.disabled = true;
    sendTextButton.disabled = true;
    resetButton.disabled = true;
    fetch('http://localhost:5000/synthesize-speech', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
        },
        body: JSON.stringify({ text: text }),
    })
    .then(response => {
        if (!response.ok) {
            throw new Error('Network response was not ok');
        }
        return response.blob();
    })
    .then(blob => {
        const audioUrl = URL.createObjectURL(blob);
        const audio = new Audio(audioUrl);
        audio.onended = function () {
            processingIndicator.style.display = 'none';
            reEnableButtons();
            recognition.start();
        };
        audio.play();
    })
    .catch(error => {
        console.error('Error during TTS synthesis:', error);
        reEnableButtons();
    });
}
function reEnableButtons() {
    startButton.disabled = false;
    sendTextButton.disabled = false;
}
</script>
</body>
</html>
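And here is app.py, the backend: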
from flask import Flask, request, send_file, jsonify, render_template
from flask_cors import CORS
from openai import OpenAI
from groq import Groq
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import schedule
import threading
import time
import os
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()
app = Flask(__name__)
CORS(app)
# Initialize Groq and OpenAI clients with environment variables
# client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# or use below. Enter your actual keys directly in between the double quotes.
# Initialize Groq and OpenAI clients with environment variables
client = OpenAI(api_key="")
openai_client = OpenAI(api_key="")
history_messages = []
# Function to save responses to a file
def save_to_file(user_text, ai_response):
    try:
        with open("chat_logs.txt", "a") as file:
            file.write(f"User: {user_text}\n")
            file.write(f"AI: {ai_response}\n\n")
    except Exception as e:
        print("Error writing to file:", e)
# Initial message from the chatbot
INIT_MESSAGE = {
    "role": "assistant",
    "content": """You are a helpful assistant ready to answer any of the user's questions.""",
}
history_messages.append(INIT_MESSAGE)
@app.route('/synthesize-speech', methods=['POST'])
def synthesize_speech():
    data = request.json
    text = data['text']
    sound = "output.mp3"
    voice_response = openai_client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=text,
    )
    voice_response.stream_to_file(sound)
    return send_file(sound, mimetype="audio/mpeg")

@app.route('/process-speech', methods=['POST'])
def process_speech():
    data = request.json
    user_text = data['text']
    history_messages.append({"role": "user", "content": user_text})
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=history_messages,
    )
    ai_response = completion.choices[0].message.content
    history_messages.append({"role": "assistant", "content": ai_response})
    # Save user input and AI response to file
    save_to_file(user_text, ai_response)
    return jsonify({'response': ai_response})

@app.route('/start-speech', methods=['POST'])
def start_speech():
    global history_messages
    history_messages = []  # Reset message history
    history_messages.append(INIT_MESSAGE)  # Re-append the initial message
    return jsonify({'response': 'OK'})
def send_email():
    sender_email = os.getenv("EMAIL_SENDER")
    receiver_email = os.getenv("EMAIL_RECEIVER")
    api_key = os.getenv("EMAIL_API_KEY")
    message = MIMEMultipart()
    message["From"] = sender_email
    message["To"] = receiver_email
    message["Subject"] = "Daily Chat Logs"
    # Read the content of chat_logs.txt
    with open("chat_logs.txt", "r") as file:
        body = file.readlines()  # Read lines to preserve formatting
    # Use HTML formatting for the email body
    html_body = "<html><body>"
    for line in body:
        html_body += f"<p>{line}</p>"  # Wrap each line in a paragraph tag
    html_body += "</body></html>"
    message.attach(MIMEText(html_body, "html"))
    # Send email using SMTP
    try:
        server = smtplib.SMTP('smtp.elasticemail.com', 587)
        server.starttls()
        server.login(sender_email, api_key)
        server.sendmail(sender_email, receiver_email, message.as_string())
        print("Email sent successfully!")
        # Clear the chat_logs.txt file after sending email
        with open("chat_logs.txt", "w") as file:
            file.truncate(0)
        print("Chat log file cleared successfully!")
    except Exception as e:
        print("Error sending email:", e)
    finally:
        server.quit()
# Schedule email sending once a day
schedule.every().day.at("17:51").do(send_email) # Adjust the time as needed
# Route for serving the index.html file
@app.route('/')
def index():
    return render_template('index.html')

if __name__ == '__main__':
    # Start the Flask app in a separate thread without the reloader
    threading.Thread(target=app.run, kwargs={'host': '0.0.0.0', 'port': 5000, 'debug': True, 'use_reloader': False}).start()
    # Run the schedule in the main thread
    while True:
        schedule.run_pending()
        time.sleep(1)
I tried using ChatGPT to help me solve the problem, but it would fix one thing, break another, and eventually break the app completely. I also tried Llama 3 and Mixtral, but neither of those models could help. I tried numerous combinations and concoctions of code that I thought might solve it. Nothing worked.
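In case it helps to pinpoint it: as far as I can tell, the re-enabling happens in toggleChatContainer, which unconditionally re-enables the buttons whenever the window is opened, even while output.mp3 is still playing. My guess is that it needs some kind of "busy during playback" guard, roughly along these lines (just an illustrative sketch; the isSpeaking flag is a made-up name and not something that exists in my code):

let isSpeaking = false; // hypothetical flag: would be set to true before audio.play() in speak() and back to false in audio.onended

function toggleChatContainer() {
    if (isListening || isSpeaking) {
        // Ignore the icon/tooltip while the bot is listening or speaking
        return;
    }
    chatOpened = !chatOpened;
    // ...rest unchanged from my code above
}

Every variation along these lines that I tried ended up fixing one thing and breaking another, so I would really appreciate a second pair of eyes on it.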