Working with Timezones in Stock Timeseries Data Using Pandas
Analysts working with timeseries data frequently need to change the timezone of a series so that it lines up with specific events (such as a market close or an economic data release). Fortunately, the Pandas Python library provides a robust set of functions for handling timezone conversions.
For basic Pandas setup and installation, you can check out the Pandas/Miniconda setup guide.
First, we start off by importing the pandas library and reading the OHLCV datafile into a Pandas dataframe:
import pandas as pd

column_names = ["timestamp", "open", "high", "low", "close", "volume"]
data = pd.read_csv('your_datafile_path.csv', names=column_names)
data["timestamp"] = pd.to_datetime(data["timestamp"])
data.set_index("timestamp", inplace=True)
# Attach the source timezone to the timezone-naive timestamps
data = data.tz_localize("US/Eastern")
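Note that the snippet above assumes the CSV datafile has no header row. If it does have one, drop the column_names declaration and the names argument, and Pandas will take the column names from the header row; the column references in the snippet must then match those header names.
The timezone of the data must be set explicitly once it is read into the Pandas dataframe, for example with tz_localize. Pandas does not assume the clock timezone of your local machine: timestamps without timezone information stay timezone-naive, and tz_convert raises a TypeError on a naive index.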
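As a rough sketch of the header-row case, the example below reads a small in-memory CSV whose first line names the columns. The sample row and the header name "timestamp" are assumptions made for illustration; a real file may use different names.

```python
import io
import pandas as pd

# Hypothetical sample data with a header row (values are made up)
csv_text = (
    "timestamp,open,high,low,close,volume\n"
    "2024-01-02 09:30:00,100.0,101.0,99.5,100.5,12000\n"
)

# No names= argument: pandas takes the column names from the header row
data = pd.read_csv(io.StringIO(csv_text), index_col="timestamp", parse_dates=True)

# Attach the source timezone (assumed here to be US/Eastern)
data = data.tz_localize("US/Eastern")
print(data.columns.tolist())
```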
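To make the difference between timezone-naive and timezone-aware data concrete, here is a minimal sketch using two made-up rows. It assumes the timestamps were recorded in US/Eastern; the prices are placeholders.

```python
import pandas as pd

# Timestamps parsed without timezone information are timezone-naive
naive = pd.DataFrame(
    {"close": [100.0, 101.5]},
    index=pd.to_datetime(["2024-01-02 15:59", "2024-01-02 16:00"]),
)
print(naive.index.tz)  # None

# Converting a naive index fails, because pandas has no source timezone
try:
    naive.tz_convert("UTC")
    convert_failed = False
except TypeError:
    convert_failed = True

# tz_localize attaches a timezone without shifting the wall-clock times
aware = naive.tz_localize("US/Eastern")
print(aware.index.tz)
```

tz_localize labels the existing wall-clock times with a timezone, while tz_convert moves already-labelled times to another timezone.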
Next, we can convert the times to different timezones:
# Convert to UTC
data_utc = data.tz_convert('Etc/UTC')
# Convert to Japan Standard Time (JST)
data_jst = data.tz_convert('Asia/Tokyo')
# Convert to Central European Time (CET)
data_cet = data.tz_convert('Europe/Paris')
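A conversion changes only how the time is displayed, not the instant it refers to. The sketch below illustrates this with a single timestamp, an assumed 16:00 US/Eastern close on 2 January 2024, chosen purely as an example.

```python
import pandas as pd

# An example market close, 16:00 in New York (EST, UTC-5 in January)
close_et = pd.Timestamp("2024-01-02 16:00", tz="US/Eastern")

close_utc = close_et.tz_convert("Etc/UTC")       # 21:00 on 2 January
close_jst = close_et.tz_convert("Asia/Tokyo")    # 06:00 on 3 January
close_cet = close_et.tz_convert("Europe/Paris")  # 22:00 on 2 January

# All four values represent the same instant, so they compare equal
print(close_et == close_utc == close_jst == close_cet)  # True
```

Note that the Tokyo value falls on the next calendar day, which matters when grouping or resampling converted data by date.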
For a full list of the available timezones, you can refer to this complete Python timezone reference. Note that the value in the ‘TZ identifier’ column is the name to use in the script.
Finally, the data can be saved to CSV files:
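The identifiers can also be listed from Python itself. This sketch uses the standard library's zoneinfo module (Python 3.9+) and assumes timezone data is available on the system or through the tzdata package; the exact set returned depends on that data.

```python
from zoneinfo import available_timezones

# Set of IANA timezone identifiers known to this Python installation
zones = available_timezones()

# Filter for one region, e.g. every identifier under Asia/
asia_zones = sorted(z for z in zones if z.startswith("Asia/"))
print(len(zones), "timezones available")
print("Asia/Tokyo" in zones)
```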
data_utc.to_csv('ohlcv_data_utc.csv')
data_jst.to_csv('ohlcv_data_jst.csv')
data_cet.to_csv('ohlcv_data_cet.csv')
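When a timezone-aware index is written to CSV, each timestamp is saved with its UTC offset, so the timezone can be recovered on reload. The sketch below shows one possible round trip, using an in-memory buffer in place of a file and two made-up rows. Parsing with utc=True is one way to handle offsets that differ across daylight saving changes.

```python
import io
import pandas as pd

# Two example rows, one in winter (UTC-5) and one in summer (UTC-4)
index = pd.DatetimeIndex(
    ["2024-01-02 16:00", "2024-07-02 16:00"], name="timestamp"
).tz_localize("US/Eastern")
data = pd.DataFrame({"close": [100.0, 101.0]}, index=index)

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
data.to_csv(buffer)
buffer.seek(0)

# Read back, parse the offsets as UTC, then restore the original timezone
restored = pd.read_csv(buffer)
restored["timestamp"] = pd.to_datetime(restored["timestamp"], utc=True)
restored = restored.set_index("timestamp").tz_convert("US/Eastern")
print(restored)
```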