etlutils package¶
Module contents¶
etlutils root contains utility functions that do not currently have a home in any submodules.
-
etlutils.
dict_to_ordereddict
(unordered_dict)¶ Converts a dict into an OrderedDict object sorrted by key
gets the keys from the dict, sorts them and then inserts them in order into a new OrderedDict object
Parameters: unordered_dict – the dictionary to sort Returns: An OrderedDict object with the keys/values from the input in key order
-
etlutils.
get_interactive_history
()¶ Prints all the history from the Python interpreter
Submodules¶
etlutils.datafiles module¶
etlutils.datafiles contains methods useful for producing and consuming data for structured data
-
etlutils.datafiles.
dump_to_daily_json_file
(directory, year, month, day, data, datatype='')¶ saves data to a daily file stored in JSON
this function will create the directories if needed and will set up the filename in a consistent way (using the get_daily_file_path method). the format is of the structure <directory>/<year>/<year>-<month>-<day>/<datatype>.<file_extension>
Parameters: - directory – the directory under which the path to the day is created
- year – the integer year to append, will expand to four digits if fewer, does not validate
- month – the integer month to append, will expand to two digits if fewer, does not validate
- day – the integer day to append, will expand to two digits if fewer, does not validate
- data – a JSON serializable data structure
- datatype – the root of the filename
- Returns
- The path to the filename that was created
-
etlutils.datafiles.
dump_to_monthly_json_file
(directory, year, month, data, datatype='')¶ saves data to a monthly file stored in JSON
will create the directories if needed and set up the filename in a consistent way. doesn’t do any error checking, so empty strings are not illegal, but will produce crummy filenames
the format of the path is <directory>/<year>/<datatype>_<year>-<month>.json
Parameters: - directory – the directory under which the file will be created
- year – the integer year to append, will expand to four digits if fewer, does not validate
- month – the integer month to append, will expand to two digits if fewer, does not validate
- data – a JSON serializable data structure
- datatype – the root of the filename
Returns: The path to the file that was created
-
etlutils.datafiles.
dump_to_yearly_json_file
(directory, year, data, datatype='')¶ saves data to a yearly file stored in JSON
will create the directories if needed and set up the filename in a consistent way, doesn’t do any error checking, so empty strings are not illegal, but will product crummy filenames
the format of the path is <directory>/<datatype>_<year>.json
Parameters: - directory – the directory under which the file will be created
- year – the integer year to append, will expand to four digits if fewer, does not validate
- data – a JSON serializable data structure
- datatype – the root of the filename
Returns: The path to the file that was created
-
etlutils.datafiles.
find_newest_saved_month
(directory, end_year, datatype='')¶ finds the last saved file for data stored by month
walks backwards in time starting from the current date looking for the last saved data by month
Parameters: - directory – the directory under which to look
- end_year – the year to end at
- datatype – the root of the filename
Returns: The year and month of the last saved file, if end_year is reached, None, None is returned
-
etlutils.datafiles.
get_daily_file_path
(directory, datatype, year, month, day, file_extension='json', make_directories=True)¶ puts together a path to a file for data that is saved by day
will create the directories if needed and set up the filename in a consistent way. doesn’t do any error checking, so empty strings are not illegal, but will produce crummy filenames
the format is of the structure <directory>/<year>/<year>-<month>-<day>/<datatype>.<file_extension>
Parameters: - directory – the directory under which the file will be created
- datatype – the root of the filename
- year – the integer year to append, will expand to four digits if fewer, does not validate
- month – the integer month to append, will expand to two digits if fewer, does not validate
- day – the integer day to append, will expand to two digits if fewer, does not validate
- file_extension – the extention (without the .)
- make_directories – will create directory if it does not exist when True
Returns: A string path composed of the arguments
-
etlutils.datafiles.
get_monthly_file_path
(directory, datatype, year, month, file_extension='json', make_directories=True)¶ puts together a path to a file for data that is saved by month
will create the directory if needed and set up the filename in a consistent way. doesn’t do any error checking, so empty strings are not illegal, but will produce crummy filenames
the format is of the path is <directory>/<year>/<datatype>_<year>-<month>.<file_extension>
Parameters: - directory – the directory under which the file will be created
- datatype – the root of the filename
- year – the integer year to append, will expanded to four digits if fewer, does not validate
- month – the integer month to append, will expanded to two digits if fewer, does not validate
- file_extension – the extention (without the .)
- make_directories – will create directory if it does not exist when True
Returns: A string path composed of the arguments
-
etlutils.datafiles.
get_yearly_file_path
(directory, datatype, year, file_extension='json', make_directories=True)¶ puts together a path to a file for data that is saved by year
will create the directory if needed and set up the filename in a consistent way. doesn’t do any error checking, so empty strings are not illegal, but will produce crummy filenames
the format is of the structure <directory>/<datatype>_<year>.<file_extension>
Parameters: - directory – the directory under which the file will be created
- datatype – the root of the filename
- year – the integer year to append, will expanded to four digits if fewer, does not validate
- file_extension – the extention (without the .)
- make_directories – will create directory if it does not exist when True
Returns: A string path composed of the arguments
etlutils.date module¶
etlutils.date consists of functions useful for manipulating or converting dates between different formats
-
etlutils.date.
datetime_from_zulutime_string
(utc_time_string)¶ Given a utc_time_string, create a datetime.datetime object
creates a datetime object from a utc-formatted time string
Parameters: utc_time_string – utc formatting string Returns: a datetime.datetime object
-
etlutils.date.
get_date_from_timestamp
(timestamp, offset=0)¶ creates a datetime.datetime object from a timestamp with offset
Using a twitter-style timestamp and offset return a datetime object
Parameters: - timestamp – a unix-type timestamp
- offset – a timezone offset in minutes from GMT
Returns: a datetime.datetime object
-
etlutils.date.
mkdate
(datestr)¶ Creates a date object for a string in the format YYYY-MM-DD
useful for parsing a data from the command line or from a filename or in a file
Parameters: datestr – a string in the format “YYYY-MM-DD” Returns: a datetime.date object Raises: ValueError
– if the string is in the wrong format