I need to create a web service that executes every hour. It will be used to review data in a database and add alerts to a table in the same database if certain conditions are met/not met. What we currently have is:
We have end devices that use Python to report to an Amazon Web Services (AWS) virtual server. The AWS server takes that information and stores it in a MySQL database. The AWS server is Linux running Django and Apache. I need to be able to have some Python code run every hour that verifies the data that has been stored by the end devices. If certain conditions are not met, then a record will be added to the alerts table in the database.
We originally contracted to have the above setup created. I am new to Python, Django, and Apache. However, I have already made several changes to the Python code that sends and receives data from the end devices. I am a coder who is breaking into web programming.
Does anyone have any recommendations on how I can do this?
How about making a cronjob, assuming you have shell access?
The cron daemon exists on virtually any UNIX-like system and schedules commands to run based on a description in a file called the crontab.
Each line of the file contains a set of fields that indicate the times at which a command should be executed.
Your task could be either a standalone program that does what you wish to accomplish or, as another answer suggests, an invocation of an HTTP client like wget, curl, or fetch to access a web resource that will perform the action.
If you’ve got limits for how long a request may take to serve, you might have to move the task into an offline script or program that doesn’t run inside your web framework/server.
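For example, assuming a hypothetical standalone script at /path/to/check_alerts.py that performs the database check, a crontab entry that runs it at the top of every hour would look something like this (the fields are minute, hour, day of month, month, day of week, then the command):
# minute hour day-of-month month day-of-week command
0 * * * * /usr/bin/python /path/to/check_alerts.py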
With Django on AWS, I’d look into Celery.
Celery adds asynchronous tasks and includes a scheduler, and on AWS you can configure Celery to use the Amazon Simple Queue Service as the broker (see Celery with Amazon SQS on Stack Overflow and this blog post on the subject).
You set up a Celery periodic task schedule and it’ll run a configured task according to that schedule.
Advantage is that you can use the whole setup to run any asynchronous task, offloading heavy tasks from your web server to the Celery workers.
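As a rough illustration (not your project's actual code), a Celery 3.x-style configuration for an hourly task might look like the following; the module and task names are hypothetical:
# settings.py -- hypothetical Celery 3.x-style configuration sketch
from datetime import timedelta

BROKER_URL = "sqs://"   # use Amazon SQS as the broker (credentials from the environment/IAM)

CELERYBEAT_SCHEDULE = {
    "check-device-data-every-hour": {
        "task": "myapp.tasks.check_device_data",   # hypothetical task path
        "schedule": timedelta(hours=1),
    },
}

# myapp/tasks.py -- the task itself
from celery import shared_task

@shared_task
def check_device_data():
    # Query the data reported by the end devices and insert a row into
    # the alerts table when a condition is not met (replace with real
    # model/query code).
    pass
The celery beat process (run alongside the workers) then dispatches the task every hour.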
The light-weight alternative is to just set up a crontab job; you could even configure a route in your Django application to be called using curl or wget:
0 * * * * curl http://username:password@hostname/route_to_job
Since your AWS instance runs Linux, you can probably accomplish this as a cron job.
You could take what I would term Drupal’s cron approach which, in the case of Django, involves creating a controller to respond to a URL and then performing the action you want.
You then configure a cron task to curl the controller’s URL, triggering your script.
This has the added advantage of being easily callable at any time from a URL: maybe an impatient manager wants a report generated from the last 20 minutes of data.
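A minimal sketch of that idea, assuming a hypothetical view wired up at a URL such as /run_alert_check/ (your project’s actual URL configuration and models will differ):
# views.py -- hypothetical example
from django.http import HttpResponse

def run_alert_check(request):
    # Replace this placeholder with queries against your device-data
    # models, inserting a row into the alerts table whenever a
    # condition is not met.
    return HttpResponse("alert check complete")
A crontab entry along the lines of 0 * * * * curl -s http://hostname/run_alert_check/ would then trigger it hourly; you will probably want to protect the URL (basic auth, a secret token, or an IP restriction) so that only the cron job and trusted users can call it.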
You may want to look into APScheduler. This is a Quartz-like scheduler (although not as extensive) for Python.
This can be a far better alternative to externally run cron scripts for long-running applications (e.g. web applications), as it is platform neutral and can directly access your application’s variables and functions; a short usage sketch follows the feature list below.
You can find a description for the most recent release here:
http://pypi.python.org/pypi/APScheduler/2.0.3
There is some documentation for installation/implementation here:
https://apscheduler.readthedocs.org/en/latest/
Here are some of the features:
- No (hard) external dependencies
- Thread-safe API
- Excellent test coverage (tested on CPython 2.5 – 2.7, 3.3, Jython 2.5.3, PyPy 1.9)
- Configurable scheduling mechanisms (triggers):
  - Cron-like scheduling
  - Delayed scheduling of single run jobs (like the UNIX “at” command)
  - Interval-based (run a job at specified time intervals)
- Multiple, simultaneously active job stores:
  - RAM
  - File-based simple database (shelve)
  - SQLAlchemy (any supported RDBMS works)
  - MongoDB
  - Redis
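A minimal usage sketch against the APScheduler 2.x API linked above (check_data is a hypothetical stand-in for your alert logic):
# Hypothetical example using the APScheduler 2.x API
from apscheduler.scheduler import Scheduler

sched = Scheduler()
sched.start()   # starts the scheduler threads inside your long-running process

def check_data():
    # Query the data reported by the end devices and insert alert rows
    # when a condition is not met (replace with real database code).
    pass

# Run the check once every hour for as long as the process stays up.
sched.add_interval_job(check_data, hours=1)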
I think you can use something like django-extensions.
Django-Extensions Website
It includes a jobs module, which for me is a very good tool for controlling your cron jobs.
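A rough sketch of what an hourly job looks like with django-extensions (the app name myapp and the file path are hypothetical; the jobs/ and jobs/hourly/ directories need __init__.py files so they are importable):
# myapp/jobs/hourly/check_alerts.py -- hypothetical path and names
from django_extensions.management.jobs import HourlyJob

class Job(HourlyJob):
    help = "Check reported device data and add alert rows when needed"

    def execute(self):
        # Replace with queries against your device-data models,
        # inserting into the alerts table when a condition fails.
        pass
A cron entry that runs python manage.py runjobs hourly once an hour then picks the job up.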
A second option is to use Fabric and create a function for this.
And I see a third way: use your imagination and knowledge to create your own function with subprocess and sh.
Most likely, the answers here assume you are running crontab with its default settings, which means working with the crontab spool rather than the files.
Meanwhile, this can also be done on AWS Linux, which comes with cron pre-installed and configured; it allows you to set up a task that should run hourly, daily, weekly, or monthly (as well as any other time period) by putting files in a /etc/cron.xxxxxx directory, as explained here.
Setting up a job to run hourly, daily, weekly, or monthly is very quick. Since the question asks how to execute code every hour, on AWS Linux you can create a file in /etc/cron.hourly.
Here are the steps once you have logged in to your instance via an SSH client.
$ echo "/usr/bin/python /path/to/your/file" > application
$ sudo mv application /etc/cron.hourly/
$ sudo chown -R root /etc/cron.hourly
$ sudo chmod 2755 /etc/cron.hourly/application
$ sudo /etc/init.d/crond restart
In the example above the file is saved and named ‘application’. The name doesn’t really matter as long as it is unique. Below is the log report, viewed by running: $ sudo vim /var/log/cron
Dec 28 19:01:01 ip-xxx-xx-xx-xx CROND[20243]: (root) CMD (run-parts /etc/cron.hourly)
Dec 28 19:01:01 ip-xxx-xx-xx-xx run-parts(/etc/cron.hourly)[20243]: starting 0anacron
Dec 28 19:01:01 ip-xxx-xx-xx-xx run-parts(/etc/cron.hourly)[20261]: finished 0anacron
Dec 28 19:01:01 ip-xxx-xx-xx-xx run-parts(/etc/cron.hourly)[20243]: starting application
Dec 28 19:01:02 ip-xxx-xx-xx-xx run-parts(/etc/cron.hourly)[20323]: finished application
As shown in the log, every hour it first starts 0anacron (which performs the periodic command scheduling traditionally done by cron), and then calls every other file in the directory and runs the commands in each file.