I have a dataset of Forex news events from 2014 to 2023 in the GMT-04:00 Eastern Time (US and Canada) time zone.
Now, I want to analyze the impact of each news event on the gold price (XAUUSD). For this, I need to download historical gold price data from ForexSB. However, I’ve noticed that the time zone for the data available on ForexSB is listed as GMT-05:00 Eastern Time (US and Canada).
I need these two datasets aligned to the same time zone to avoid problems in my analysis and when training my AI model. How can I achieve this?
These time zones are confusing me.
How can I solve it?
Q: “How can I achieve this?”
Welcome to the realm of real-world computing. The good news is that you noticed this ( classical ) problem before feeding the AI-troll … ( many others have not … ).
2007-01-03 05:00:00+00:00 WED, 'Total Vehicle Sales', '3256', 'All Day', 'USD',
2007-01-03 05:00:00+00:00 WED, 'FOMC Meeting Minutes', '3240', '2:00pm', 'USD',
2007-01-03 05:00:00+00:00 WED, 'Trade Balance', '3260', '4:45pm', 'NZD',
2007-01-03 05:00:00+00:00 WED, 'AIG Services Index', '10194', '5:30pm', 'AUD',
2007-01-04 05:00:00+00:00 THU, 'CPI m/m', '3261', '1:45am', 'CHF',
2007-01-04 05:00:00+00:00 THU, 'Spanish Services PMI', '43885', '3:15am', 'EUR',
2007-01-04 05:00:00+00:00 THU, 'Italian Services PMI', '43435', '3:45am', 'EUR',
2007-01-04 05:00:00+00:00 THU, 'Final Services PMI', '3263', '4:00am', 'EUR',
2007-01-04 05:00:00+00:00 THU, 'Services PMI', '3264', '4:30am', 'GBP',
2007-01-04 05:00:00+00:00 THU, 'Mortgage Approvals', '10286', '', 'GBP',
2007-01-04 05:00:00+00:00 THU, 'Net Lending to Individuals m/m', '3265', '', 'GBP',
2007-01-04 05:00:00+00:00 THU, 'French 10-y Bond Auction', '43326', '4:55am', 'EUR',
2007-01-04 05:00:00+00:00 THU, 'CPI Flash Estimate y/y', '3266', '5:00am', 'EUR',
2007-01-04 05:00:00+00:00 THU, 'Italian Prelim CPI m/m', '9537', '', 'EUR',
2007-01-04 05:00:00+00:00 THU, 'GfK Consumer Confidence', '18310', '5:30am', 'GBP',
2007-01-04 05:00:00+00:00 THU, 'MPC Member Blanchflower Speaks', '12650', '7:30am', 'GBP',
2007-01-04 05:00:00+00:00 THU, 'Challenger Job Cuts y/y', '8622', '', 'USD',
2007-01-04 05:00:00+00:00 THU, 'RMPI m/m', '3242', '8:30am', 'CAD',
2007-01-04 05:00:00+00:00 THU, 'IPPI m/m', '3241', '', 'CAD',
2007-01-04 05:00:00+00:00 THU, 'Unemployment Claims', '3243', '', 'USD',
2007-01-04 05:00:00+00:00 THU, 'ISM Non-Manufacturing PMI', '3244', '10:00am', 'USD',
2007-01-04 05:00:00+00:00 THU, 'Pending Home Sales m/m', '9879', '', 'USD',
2007-01-04 05:00:00+00:00 THU, 'Factory Orders m/m', '3246', '', 'USD',
2007-01-04 05:00:00+00:00 THU, 'Crude Oil Inventories', '3267', '10:30am', 'USD',
2007-01-04 05:00:00+00:00 THU, 'Monetary Base y/y', '3268', '6:50pm', 'JPY',
2007-01-05 05:00:00+00:00 FRI, 'Foreign Currency Reserves', '42187', '3:00am', 'CHF',
2007-01-05 05:00:00+00:00 FRI, 'Halifax HPI m/m', '9357', '', 'GBP',
2007-01-05 05:00:00+00:00 FRI, 'Retail PMI', '43575', '4:00am', 'EUR',
2007-01-05 05:00:00+00:00 FRI, 'Consumer Confidence', '11250', '5:00am', 'EUR',
2007-01-05 05:00:00+00:00 FRI, 'PPI m/m', '3269', '', 'EUR',
2007-01-05 05:00:00+00:00 FRI, 'Retail Sales m/m', '3270', '', 'EUR',
2007-01-05 05:00:00+00:00 FRI, 'Unemployment Rate', '3271', '', 'EUR',
2007-01-05 05:00:00+00:00 FRI, 'Employment Change', '3247', '7:00am', 'CAD',
2007-01-05 05:00:00+00:00 FRI, 'Unemployment Rate', '3248', '', 'CAD',
2007-01-05 05:00:00+00:00 FRI, 'Non-Farm Employment Change', '3249', '8:30am', 'USD',
2007-01-05 05:00:00+00:00 FRI, 'Unemployment Rate', '3250', '', 'USD',
2007-01-05 05:00:00+00:00 FRI, 'Average Hourly Earnings m/m', '3251', '', 'USD',
2007-01-05 05:00:00+00:00 FRI, 'Ivey PMI', '3253', '10:00am', 'CAD',
2007-01-05 05:00:00+00:00 FRI, 'Natural Gas Storage', '8326', '10:30am', 'USD',
2007-01-05 05:00:00+00:00 FRI, 'Fed Chairman Bernanke Speaks', '3252', '1:45pm', 'USD',
2007-01-06 05:00:00+00:00 SAT, '',
2007-01-07 05:00:00+00:00 SUN, 'AIG Construction Index', '10199', '5:30pm', 'AUD',
2007-01-07 05:00:00+00:00 SUN, 'Bank Holiday', '3297', 'All Day', 'JPY',
2007-01-07 05:00:00+00:00 SUN, 'Building Approvals m/m', '9704', '7:30pm', 'AUD',
2007-01-07 05:00:00+00:00 SUN, 'ANZ Job Advertisements m/m', '19238', '', 'AUD',
So, the world is even harsher than this. Besides GMT-04:00 versus GMT-05:00 ( which, for US Eastern Time, is exactly the summer EDT versus winter EST difference ), your alignment also has to check when the Daylight Saving Time shifts ( +1 | -1 hour ) occur “on each side of the Atlantic”: they do not fall on the same day, and need not even fall in the same week … And God save us from any more administrative shifts that cut whole months or even years out of past alignments. The last thing we have to live with are the few days that actually accrue an additional “stuffing” ( leap ) second, which helps re-align the known skew between the global atomic-clock networks and the Earth’s rotation ( yes, there are minutes, rare, but they exist, that last 61 seconds … so our software shall survive reading a “clock” with the time 23:59:60 … Huh! )
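To make the moving offsets concrete, here is a minimal sketch using the standard-library `zoneinfo` module ( Python 3.9+; it assumes an IANA time-zone database is available on the system ). The zone names `America/New_York` and `Europe/London`, the helper `utc_offset_hours` and the sample dates are illustrative picks, not something taken from either dataset:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo   # needs the IANA tz database (system files or the tzdata package)

NEW_YORK = ZoneInfo("America/New_York")   # illustrative pick for "Eastern Time"
LONDON = ZoneInfo("Europe/London")        # illustrative pick for the other side of the Atlantic

def utc_offset_hours(year, month, day, zone):
    """UTC offset, in hours, valid at local noon of the given date in the given zone."""
    local_noon = datetime(year, month, day, 12, 0, tzinfo=zone)
    return local_noon.utcoffset() / timedelta(hours=1)

# Eastern Time runs at UTC-05:00 in winter ( EST ) and at UTC-04:00 in summer ( EDT ).
winter_offset = utc_offset_hours(2023, 1, 16, NEW_YORK)
summer_offset = utc_offset_hours(2023, 7, 17, NEW_YORK)

# In mid-March 2023 the US had already switched to summer time, the UK had not yet,
# so the New York - London gap differs from the January one.
gap_january = utc_offset_hours(2023, 1, 16, LONDON) - winter_offset
gap_mid_march = (utc_offset_hours(2023, 3, 20, LONDON)
                 - utc_offset_hours(2023, 3, 20, NEW_YORK))

print(winter_offset, summer_offset, gap_january, gap_mid_march)
```

The point of the sketch is only that a single constant correction ( “add one hour” ) cannot align two feeds across a whole year; the offset has to be looked up per timestamp.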
Solution?
God bless the people who maintain, for us mortals, all these “administratively” introduced time-jumps and cross-Time-Zone references ( there are even some awkward TZ-s that are shifted by half an hour; some existed during the past two centuries but no longer do, so it is indeed a jungle of rules to follow ).
A Python datetime can carry an “absolute” reference, stating for which Time-Zone the said time-value was actually meant.
# ( ... )
import datetime
from dateutil import tz   # python-dateutil ( see REF.s ), provides tz.gettz()
#-----------------------------------------------------------
class UTC_minus5( datetime.tzinfo ):
    """                                             __doc__
    USAGE:      UTC_minus5( datetime.tzinfo )
    PARAMETERS: ...
    EXAMPLE:    UTC_minus5( 'UTC' )
    THROWS:     exceptions appeared on w7, where locale-specific
                encodings of UTC_minus5( 'UTC' ) responses caused
                internally unhandled exceptions.
                Once the w7-system-locale had been changed so as to use
                US-locale details, this problem ceased to exist.
    REF.s:      https://labix.org/python-dateutil#head-8bf499d888b70bc300c6c8820dc123326197c00f
                https://www.epochconverter.com/timezones
                http://pytz.sourceforge.net/
              * https://julien.danjou.info/blog/2015/python-and-timezones
                "Why you really, really, should never ever deal with timezones"
    """
    def tzname( self, *farg ):                  # human-readable zone label
        return 'UTC-5'
    def utcoffset( self, *farg ):               # fixed offset from UTC
        return datetime.timedelta( hours = -5 )
    def dst( self, *farg ):                     # no DST adjustment in this zone
        return datetime.timedelta( 0 )
#-----------------------------------------------------------
TZ_UTC_minus5 = UTC_minus5()
TZ_UTC        = tz.gettz( 'UTC' )
#-----------------------------------------------------------
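For what it is worth, the standard library also ships a ready-made fixed-offset tzinfo, `datetime.timezone`, which behaves like the hand-written class above. A minimal sketch, assuming the price-bar timestamps arrive as naive strings at a constant UTC-05:00 ( the sample string, the format and the name `bar_to_utc` are hypothetical; whether the ForexSB export really keeps a constant offset without DST is something to verify against the data itself ):

```python
import datetime

# Fixed offset with no DST rules: the same behaviour as a hand-written UTC-5 tzinfo.
FIXED_UTC_MINUS5 = datetime.timezone(datetime.timedelta(hours=-5), name="UTC-5")

def bar_to_utc(naive_text, fmt="%Y-%m-%d %H:%M"):
    """Parse a naive bar timestamp, label it as UTC-05:00, then convert it to UTC."""
    naive = datetime.datetime.strptime(naive_text, fmt)
    labelled = naive.replace(tzinfo=FIXED_UTC_MINUS5)    # attach the zone, clock digits unchanged
    return labelled.astimezone(datetime.timezone.utc)    # shift the clock digits to UTC

bar_utc = bar_to_utc("2023-07-07 07:30")   # hypothetical bar stamp
print(bar_utc.isoformat())
```

Note the two distinct steps: `replace( tzinfo = ... )` only states what the digits mean, while `astimezone()` actually moves them.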
So Python “absolute”-ref’d, TZ-aware times can solve your problem, as they can be converted into Coordinated Universal Time ( UTC+00:00:00 ), which gives your model solid ground for “aligned” inputs into your “learner-algorithm”.
# (...)
    return datetime.datetime( int( aFF_DICT_stringEVENT_DATE[ :4]  ), # YYYY-
                              int( aFF_DICT_stringEVENT_DATE[5:7]  ), #     -MM-
                              int( aFF_DICT_stringEVENT_DATE[8:10] ), #        -DD
                              tzinfo = TZ_UTC                         # <_All_Day_>
                              )
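For timed ( not All-Day ) events the same idea can be sketched end to end: combine the calendar date with the clock string ( the '8:30am' style seen in the sample rows above ) in a DST-aware zone, then convert to UTC. This is only a sketch under stated assumptions: that the clock strings are Eastern wall-clock times, and that the first ten characters of the date field hold the local calendar date. Both need to be checked against the actual dataset. The names `parse_clock` and `event_to_utc` are hypothetical, and `zoneinfo` needs Python 3.9+ with a tz database installed:

```python
import datetime
import re
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")   # assumption: the calendar uses US Eastern wall-clock time
_CLOCK = re.compile(r"(\d{1,2}):(\d{2})(am|pm)")

def parse_clock(clock_text):
    """Turn '8:30am' / '2:00pm' into a datetime.time; return None for '' or 'All Day'.

    Parsed by hand instead of strptime('%p'), which depends on the system locale.
    """
    match = _CLOCK.fullmatch(clock_text.strip().lower())
    if match is None:
        return None
    hour = int(match.group(1)) % 12        # 12am -> 0, 12pm -> 0 ( +12 below )
    if match.group(3) == "pm":
        hour += 12
    return datetime.time(hour, int(match.group(2)))

def event_to_utc(date_text, clock_text, zone=EASTERN):
    """Combine a 'YYYY-MM-DD...' date and a clock string in `zone`, converted to UTC."""
    clock = parse_clock(clock_text)
    if clock is None:
        return None                        # untimed row: leave the decision to the caller
    day = datetime.date.fromisoformat(date_text[:10])
    local = datetime.datetime.combine(day, clock, tzinfo=zone)
    return local.astimezone(datetime.timezone.utc)

winter_event = event_to_utc("2007-01-05 05:00:00+00:00", "8:30am")   # EST applies on this date
summer_event = event_to_utc("2023-07-07", "8:30am")                  # EDT applies on this date
print(winter_event, summer_event)
```

Once both datasets are expressed in UTC, matching an event to the price bar that contains it becomes a plain comparison of timestamps.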
Nota Bene:
In my past Quant models, I trained 1k5-feature-wide models for XAUUSD and later extended them with fundamentals ( yes, UTC-coordinated features, derived from fundamentals ), and this was way harder work than expected.
Quant models ( at least during those years ) did not generalise well enough after fundamentals were added – be it because of the already good predictive power of the TA-features, or because of having too few samples ( as few as 1E+5 fundamentals within an epoch of about 1E+9 PriceDomain samples – for details see Hoeffding, Hoeffding, Hoeffding … ),
so, as Adams’ “The Hitchhiker’s Guide to the Galaxy” reveals to us, Earth-bound mortals – do not panic :o)