Apologies if any of my terminology is off; I'm relatively new to development. I am trying to create an Azure Function, triggered by an HTTP request, that receives the video_id of a YouTube video, pulls that video's transcript into a string using the youtube_transcript_api library, and then uses the transcript string for other purposes. When I debug locally and send a request to localhost with the video_id, the library retrieves the transcript without any issue. However, when I deploy the function to Azure, I receive this error:
Result: Failure Exception: TranscriptsDisabled: Could not retrieve a transcript for the video https://www.youtube.com/watch?v=X_uLehv5lOI! This is most likely caused by: Subtitles are disabled for this video If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!
Using the link in the error, you can see that the video_id was sent correctly; I confirmed this by logging the video_id to rule out any issue with the HTTP request. My guess is that YouTube is blocking traffic from cloud providers. I am not sure whether there is a way around this, or whether another service would avoid the problem. Let me know if anyone else has run into this issue!
Here is what the relevant portion of my code looks like…
import azure.functions as func
import logging

app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)

@app.route(route="azure_function/{video_id}")
def azure_function(req: func.HttpRequest) -> func.HttpResponse:
    from openai import OpenAI
    import json
    from youtube_transcript_api import YouTubeTranscriptApi

    video_id = req.route_params.get('video_id', None)

    # Input video_id and output a string transcription of audio
    def transcribe_video(video_id):
        transcript_dict = YouTubeTranscriptApi.get_transcript(video_id)
        transcript = ""
        for line in transcript_dict:
            transcript += line['text'] + " "
        return transcript

    logging.info("... starting transcription...")
    video_transcription = transcribe_video(video_id)
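For reference, the joining step inside transcribe_video can be pulled out into a small pure helper, which makes it testable without any network access. This is only a sketch: `join_transcript` is a hypothetical name, and it assumes each segment is a dict with a 'text' key, as in the loop above. Unlike the loop, it does not leave a trailing space.

```python
from typing import Dict, Iterable


def join_transcript(segments: Iterable[Dict[str, object]]) -> str:
    """Join the 'text' field of each transcript segment with single spaces.

    Assumes segments look like {'text': ..., 'start': ..., 'duration': ...};
    only the 'text' key is read here.
    """
    return " ".join(str(segment["text"]) for segment in segments)


if __name__ == "__main__":
    # Hand-written sample segments, not real transcript data.
    sample = [
        {"text": "hello", "start": 0.0, "duration": 1.0},
        {"text": "world", "start": 1.0, "duration": 1.0},
    ]
    print(join_transcript(sample))
```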
I have tried logging the video_id to make sure that it was being passed through the HTTP request correctly, and I have also tried running the function locally, which works. When I deploy to Azure and send an HTTP request, though, I get the 500 error above. I appreciate any help!
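To make the deployed failure easier to diagnose, one option I am considering is wrapping the transcript call so that the exception is logged with the video_id and turned into an explicit status instead of an unhandled 500. The sketch below is an assumption on my part, not something from the library: `fetch_transcript_safely` is a hypothetical helper, the 502 status is my own choice, and the fetcher is passed in as a callable so the wrapper can be exercised without the library or a network connection.

```python
import logging
from typing import Callable, Dict, List, Tuple

Segment = Dict[str, object]


def fetch_transcript_safely(
    video_id: str,
    fetch: Callable[[str], List[Segment]],
) -> Tuple[int, str]:
    """Return (status_code, body) instead of letting the exception escape.

    `fetch` is any callable that takes a video_id and returns a list of
    segments with a 'text' key (hypothetical interface for this sketch).
    """
    try:
        segments = fetch(video_id)
    except Exception as exc:  # broad on purpose: the goal is to log the cause
        logging.exception("Transcript fetch failed for video_id=%s", video_id)
        return 502, f"{type(exc).__name__}: could not fetch transcript for {video_id}"
    return 200, " ".join(str(segment["text"]) for segment in segments)


if __name__ == "__main__":
    # Stand-in fetcher that simulates a failure; no real request is made.
    def failing_fetch(video_id: str) -> List[Segment]:
        raise RuntimeError("simulated failure")

    print(fetch_transcript_safely("some_video_id", failing_fetch))
```

In the function above, this would be called as fetch_transcript_safely(video_id, YouTubeTranscriptApi.get_transcript), with the returned status and body passed to func.HttpResponse. It does not fix the underlying problem, but the exception type should then show up in the Azure logs and in the response.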