Problem:
There are problems with the code chat_with_llm.py I think, that send the json in the wrong format or something, history and max_token which I’m not sure how to solve.
original code from here
qwen_server.py
This is my code for starting the flask server
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
import json
from flask import Flask, request, jsonify
# Set CUDA device
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Model path configuration
model_path = "E:/.cache/Pretrain_models/qwen/Qwen-7B-Chat"
# Initialize tokenizer and model without manually assigning it to a device
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, load_in_8bit=True)
model.generation_config = GenerationConfig.from_pretrained(model_path)
def predict_model(data):
"""
Generates a response based on the input data.
"""
text = data["message"][0]["content"]
inputs = tokenizer(text, return_tensors='pt').to(device)
outputs = model.generate(**inputs, max_new_tokens=data["max_tokens"], top_k=data["top_k"], top_p=data["top_p"],
temperature=data["temperature"], repetition_penalty=data["repetition_penalty"],
num_beams=data["num_beams"])
response = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
return response
app = Flask(__name__)
@app.route("/generate", methods=["POST", "GET"])
def generate():
"""
Flask endpoint to generate model responses.
"""
try:
data = json.loads(request.data)
print(data)
res = predict_model(data)
label = "success"
except Exception as e:
res = ""
label = "error"
print(e)
return jsonify({"output": [res], "status": label})
if __name__ == '__main__':
app.run(port=3001, debug=False, host='0.0.0.0') # Allows external network access.
Terminal:
when I asked “感冒是什么” in the chat_with_llm.py terminal, it respond this
Expecting value: line 1 column 1 (char 0)
127.0.0.1 - - [19/May/2024 16:15:30] "GET /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': '你叫啥'}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:15:45] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': "请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。"}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:22:18] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': "请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。"}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:22:18] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': "请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。"}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:22:18] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': "请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。"}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:22:18] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': "请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。"}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:22:18] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': "请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。"}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:22:18] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': "请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。"}], 'history': []}
'max_tokens'
127.0.0.1 - - [19/May/2024 16:22:18] "POST /generate HTTP/1.1" 200 -
{'message': [{'role': 'user', 'content': "请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。"}], 'history': []}
chat_with_llm.py
Stackoverflow won’t let me send the code, it said it might be spam, idk why.
I think the problem occur at here
Terminal:
It’s the terminal where I enter my question
model init finished ......
USER INPUT:感冒是什么
step1: linking entity.....
step2:recall kg facts....
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
["请判定问题:感冒是什么所提及的是感冒的哪几个信息,请从['预防措施', '治疗方式', '名称', '治疗周期', '治愈概率', '疾病病因', '治疗科室', '疾病简介', '易感人群', '推荐食谱', '忌吃', '宜吃', '常用药品', '生产药品', '好评药品', '诊断检查', '症状', '并发症', '所属科室']中进行选择,并以列表形式返回。", '', set()]
MATCH p=(m:Disease)-[r*..1]-(n) where m.name = '感冒' return p
0 []
step3:generate answer...
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
<Response [200]>
request error 'history'
KGRAG_BOT OUTPUT:
1