I have a Cassandra table containing over 450,000 rows. When I try to fetch all of them, I consistently get slightly fewer rows back, for example 448,546. For certain use cases I need to retrieve the entire dataset in one go. What strategies or configurations would ensure that I reliably fetch every row from a dataset of this size? Any advice on handling large-scale data retrieval in Cassandra would be appreciated.
I am using Python's cassandra-driver with cqlengine, and I tried something like this:
from cassandra.cqlengine import columns, connection
from cassandra.cqlengine.models import Model

class User(Model):
    username = columns.Text(primary_key=True)
    fullname = columns.Text(required=True)
    email = columns.Text(required=True)
    phone_number = columns.Text(required=True)
    password = columns.Text(required=True)
    address = columns.Text()
    birth_day = columns.Date()
    branch = columns.Text()

if __name__ == '__main__':
    # contact points and keyspace are placeholders for my real settings
    connection.setup(['127.0.0.1'], 'my_keyspace')
    # queryset is built here; rows are only fetched when it is iterated
    all_data = User.objects.limit(500000)
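Since the queryset is lazy, the rows are only pulled from Cassandra when I iterate over it. As a simplified sketch of how I consume it (my real code does more per row), this is roughly where I end up with 448,546 instead of the full count:

    count = 0
    for user in all_data:  # the driver pages through the result set during iteration
        count += 1
    print(count)           # consistently prints fewer rows than the table actually holds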