I’m working on a script that fetches comments from multiple Disqus forums using the Disqus API. My goal is to retrieve the comments posted within the last N hours, up to the current time. However, the since parameter appears to work the opposite way: it returns comments older than the given timestamp rather than newer ones.
Here’s a simplified version of my script:
import json
import time
import requests
import pandas as pd
from datetime import datetime, timedelta
from ratelimit import limits, sleep_and_retry
# Constants for rate limiting
CALLS = 900
RATE_LIMIT = 3600
# API secrets for different forums
API_SECRETS = {
    "example forum": "key"
}
# Rate limit decorator
@sleep_and_retry
@limits(calls=CALLS, period=RATE_LIMIT)
def check_limit(url):
    return requests.get(url)
def fetch_comments(forum, hours=1):
    api_secret = API_SECRETS[forum]
    skeleton_url = f'https://disqus.com/api/3.0/forums/listPosts.json?forum={forum}&limit=100&related=thread&include=approved&order=desc&api_secret={api_secret}'
    # Calculate the start time as a UNIX timestamp
    time_threshold = datetime.utcnow() - timedelta(hours=hours)
    time_threshold = int(time_threshold.timestamp())
    url = f"{skeleton_url}&since={time_threshold}"
    response = check_limit(url)
    try:
        comments = response.json()
    except ValueError:
        # The body was not valid JSON
        print('Error:', response.status_code)
        return []
    total_comments = comments["response"]
    cursor = comments.get("cursor", {})
    i = 1
    # Follow the pagination cursor until there are no more pages
    while cursor.get("hasNext"):
        print(f"Fetching comments for {forum}, page {i}, cursor: {cursor['next']}")
        next_cursor = cursor['next']
        next_url = f"{skeleton_url}&cursor={next_cursor}"
        try:
            response = check_limit(next_url)
            comments = response.json()
            if comments["code"] == 15:
                # Wait, then retry the same page once
                time.sleep(3000)
                response = check_limit(next_url)
                comments = response.json()
            if comments["code"] == 13:
                print(comments)
                print("Sleeping for 10 mins")
                time.sleep(600)
            else:
                total_comments.extend(comments["response"])
                i += 1
                cursor = comments.get("cursor", {})
        except Exception as e:
            print(e)
            time.sleep(60)
    return total_comments
def main():
    forums = ["example forum"]
    for forum in forums:
        print(f"Processing forum: {forum}")
        comments = fetch_comments(forum, hours=1)
        # Process and save comments

if __name__ == "__main__":
    main()
The Issue:
When I set the hours parameter to 1, 2, 7, or 25, the fetched comments are all older than the specified timeframe, meaning they fall outside the range of “last X hours”. It appears that the since parameter fetches comments from the specified time and earlier, instead of from the specified time up to the current time.
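One detail in the timestamp calculation may be worth ruling out first. datetime.utcnow() returns a naive datetime, and calling .timestamp() on a naive datetime interprets it as local time, so on a machine that is not set to UTC the resulting value is shifted by the local UTC offset. The sketch below computes the threshold from a timezone-aware datetime instead. The helper name since_timestamp is hypothetical, and this is only a possible contributing factor, not a confirmed cause of the behavior described above.

```python
from datetime import datetime, timedelta, timezone


def since_timestamp(hours, now=None):
    """Return the UNIX timestamp for `hours` before `now`.

    `now` is expected to be timezone-aware; it defaults to the
    current time in UTC.
    """
    if now is None:
        # Timezone-aware "now", unlike the naive datetime.utcnow()
        now = datetime.now(timezone.utc)
    return int((now - timedelta(hours=hours)).timestamp())


if __name__ == "__main__":
    # Threshold for "the last hour", independent of the local time zone
    print(since_timestamp(1))
```

Passing a fixed, aware datetime as now makes the calculation easy to check by hand against a known timestamp.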
Any help or guidance on how to correctly use the since parameter or another method to achieve the desired results would be greatly appreciated.
Thank you!
What I’ve Tried:
- I ensured the timestamps are correctly converted to UNIX format.
- I tested different values for the hours parameter to confirm the behavior.
- I reviewed the Disqus API documentation, but the behavior is still unexpected.
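One further check that could be added to this list is whether the results depend on the order parameter, since the script always sends order=desc together with since. The sketch below only builds the two request URLs so they can be compared side by side. The helper name build_list_posts_url is hypothetical, and the idea that order changes how since is interpreted is an untested assumption, not something the documentation was confirmed to say.

```python
from urllib.parse import urlencode

BASE_URL = "https://disqus.com/api/3.0/forums/listPosts.json"


def build_list_posts_url(forum, api_secret, since, order):
    """Build a forums/listPosts URL with the same parameters as the script."""
    params = {
        "forum": forum,
        "limit": 100,
        "related": "thread",
        "include": "approved",
        "order": order,  # "asc" or "desc"
        "since": since,  # UNIX timestamp
        "api_secret": api_secret,
    }
    # urlencode escapes values such as spaces in the forum name
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Same threshold, both orderings, to compare the returned date ranges
    for order in ("asc", "desc"):
        print(build_list_posts_url("example forum", "key", 1704106800, order))
```

Fetching one page with each URL and printing the first and last creation dates would show whether the ordering affects which side of the timestamp is returned.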
My Goal:
I want to fetch comments from the last specified number of hours up to the current time. How can I modify my script or use the Disqus API parameters correctly to achieve this?
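As a fallback that does not depend on how since behaves, the fetched comments can also be filtered on the client side by their creation time. The sketch below assumes each post dictionary carries a createdAt field holding an ISO 8601 string in UTC without an offset (for example "2024-01-01T11:30:00"); that format is an assumption and should be checked against a real response. The helper names parse_created_at and filter_recent are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_created_at(value):
    """Parse an ISO 8601 string, treating values without an offset as UTC."""
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


def filter_recent(posts, hours, now=None):
    """Keep only posts created within the last `hours` hours before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=hours)
    return [
        post for post in posts
        if cutoff <= parse_created_at(post["createdAt"]) <= now
    ]


if __name__ == "__main__":
    sample = [
        {"id": "1", "createdAt": "2024-01-01T11:30:00"},
        {"id": "2", "createdAt": "2024-01-01T09:00:00"},
    ]
    now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    # Only the post from 11:30 falls inside the last hour before 12:00
    print(filter_recent(sample, hours=1, now=now))
```

With order=desc, the pagination loop could in principle stop as soon as a page contains a post older than the cutoff, which would avoid walking the forum's full history.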