agweatherqaqc package

Package Contents

A package for the correction of sensor drift and removal of bad/suspect data for weather stations. The process is CLI-driven and provides recommendations to the user for best practices.

It also generates reference evapotranspiration according to the ASCE Standardized Reference Evapotranspiration Equation (2005).

This package is meant to be run by creating an instance of the agweatherqaqc.WeatherQC class and then calling process_station(), which handles running the workflow. From this point the user directs everything else through interactive prompts in the command line / terminal window.

Example:
>>> from agweatherqaqc.agweatherqaqc import WeatherQC
>>> config_path = 'test_files/test_config.ini'
>>> metadata_path = 'test_files/test_metadata.xlsx'
>>> station_qaqc = WeatherQC(config_path, metadata_path, gridplot_columns=1)
>>> station_qaqc.process_station()

The documentation on the rest of the correction and calculation functions is available to explain the methodology and thought process that went into them.

agweatherqaqc.agweatherqaqc

class agweatherqaqc.agweatherqaqc.WeatherQC(config_file_path='config.ini', metadata_file_path=None, gridplot_columns=1)[source]

Bases: object

The WeatherQC class is a holistic package for the QC of agricultural weather data.

It requires the filepath to a config file, and from that file does everything necessary to enable the end user to QC the data from a station. The class handles reading in the data, sanitizing inputs, plotting and logging the corrections made to that data, and saving the final data for use in other analyses.

Usually, weather stations from the same source or network have the same file structure, so WeatherQC features an optional input of the filepath to a metadata file to process multiple stations using the same input file.

Example:
>>> from agweatherqaqc.agweatherqaqc import WeatherQC
>>> config_path = 'test_files/test_config.ini'
>>> metadata_path = 'test_files/test_metadata.xlsx'
>>> station_qaqc = WeatherQC(config_path, metadata_path, gridplot_columns=1)
>>> station_qaqc.process_station()

Both the config and metadata files have requirements on file structure and contents. The files contained within the ./test_data/ directory can serve as templates to copy from or modify.

Please see the documentation of WeatherQC.process_station(), as well as the documentation for the functions within agweatherqaqc.qaqc_functions and agweatherqaqc.calc_functions for more information on the overall process and individual steps.

process_station()[source]

This function serves as the structure for the overall workflow in applying the QC process to an input data source. The standard process is as follows:

  1. Read in the data.

  2. Calculate any secondary variables (mean monthly values, clear-sky solar radiation, etc.).

  3. Plot the data before any corrections are performed.

  4. Allow the user to adjust/remove/QC data. Recompute any dependent secondary variables.

  5. Plot the data after corrections are performed and save the output data.

Returns:

None

agweatherqaqc.calc_functions

agweatherqaqc.calc_functions.calc_compiled_ea(tmax, tmin, tavg, ea, tdew, tdew_col, rhmax, rhmax_col, rhmin, rhmin_col, rhavg, rhavg_col, tdew_ko)[source]

This function creates a ‘compiled’ ea from all provided humidity variables, always using the best one available within the dataset for each given day of the record. It works whether or not ea is provided by the dataset. See qaqc_functions.compiled_humidity_adjustment for more information.

Args:
tmax:

(ndarray) 1D array of maximum temperature values

tmin:

(ndarray) 1D array of minimum temperature values

tavg:

(ndarray) 1D array of average temperature values

ea:

(ndarray) 1D array of vapor pressure values, which may be empty

tdew:

(ndarray) 1D array of dewpoint temperature values, which may be empty

tdew_col:

(int) column of Tdew variable in data file, if it is provided

rhmax:

(ndarray) 1D array of maximum relative humidity values, which may be empty

rhmax_col:

(int) column of rhmax variable in data file, if it was provided

rhmin:

(ndarray) 1D array of minimum relative humidity values, which may be empty

rhmin_col:

(int) column of rhmin variable in data file, if it was provided

rhavg:

(ndarray) 1D array of average relative humidity values, which may be empty

rhavg_col:

(int) column of rhavg variable in data file, if it was provided

tdew_ko:

(ndarray) 1D array of tdew data filled in by tmin-ko curve

Returns:
compiled_ea:

(ndarray) 1D array of vapor pressure that has been compiled from the “best” data sources

agweatherqaqc.calc_functions.calc_humidity_variables(tmax, tmin, tavg, ea, ea_col, tdew, tdew_col, rhmax, rhmax_col, rhmin, rhmin_col, rhavg, rhavg_col)[source]

Takes in all possible humidity variables and figures out which one to use for the calculation of TDew and Ea.

Unless otherwise cited, all equations are from the ASCE reference ET manual.

Which variables are used is determined by the variable column values in the input file; only those variables provided by the original data source are used. The decision tree follows this path:

  1. If Ea exists but Tdew doesn’t exist, use Ea to calculate Tdew.

  2. If Ea doesn’t exist but Tdew does, use Tdew to calculate Ea.

  3. If neither exists but RHmax and RHmin exist, use those to calculate both Ea and Tdew.

  4. If nothing else exists, use RHAvg to calculate both Ea and Tdew.

If both Ea and TDew exist, then the function just returns those values.
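The decision tree above can be sketched as follows. This is a minimal illustration, not the package's implementation: pick_humidity_source is a hypothetical helper that uses None to mark a missing variable (the real function uses column values instead), and the RHavg branch assumes saturation vapor pressure is evaluated at tavg. The saturation vapor pressure equation and its inverse are the standard forms used with the ASCE equation.

```python
import numpy as np


def sat_vapor_pressure(temp_c):
    """Saturation vapor pressure (kPa) at temperature temp_c (deg C)."""
    return 0.6108 * np.exp(17.27 * temp_c / (temp_c + 237.3))


def tdew_from_ea(ea):
    """Dewpoint (deg C) from vapor pressure (kPa); inverse of the above."""
    log_ratio = np.log(ea / 0.6108)
    return 237.3 * log_ratio / (17.27 - log_ratio)


def pick_humidity_source(tmax, tmin, tavg, ea=None, tdew=None,
                         rhmax=None, rhmin=None, rhavg=None):
    """Hypothetical helper: None marks a variable the data source lacks."""
    if ea is not None and tdew is not None:
        return ea, tdew                          # both provided, pass through
    if ea is not None:
        return ea, tdew_from_ea(ea)              # step 1
    if tdew is not None:
        return sat_vapor_pressure(tdew), tdew    # step 2
    if rhmax is not None and rhmin is not None:  # step 3
        ea = 0.5 * (sat_vapor_pressure(tmin) * rhmax / 100.0
                    + sat_vapor_pressure(tmax) * rhmin / 100.0)
        return ea, tdew_from_ea(ea)
    if rhavg is not None:                        # step 4 (assumed form)
        ea = sat_vapor_pressure(tavg) * rhavg / 100.0
        return ea, tdew_from_ea(ea)
    raise ValueError("no humidity variable available")
```

Each branch returns both Ea and Tdew, so downstream code would not need to know which source was used.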

Args:
tmax:

(ndarray) 1D array of maximum temperature values

tmin:

(ndarray) 1D array of minimum temperature values

tavg:

(ndarray) 1D array of average temperature values

ea:

(ndarray) 1D array of vapor pressure values, which may be empty

ea_col:

(int) column of ea variable in data file, if it was provided

tdew:

(ndarray) 1D array of dewpoint temperature values, which may be empty

tdew_col:

(int) column of tdew variable in data file, if it was provided

rhmax:

(ndarray) 1D array of maximum relative humidity values, which may be empty

rhmax_col:

(int) column of rhmax variable in data file, if it was provided

rhmin:

(ndarray) 1D array of minimum relative humidity values, which may be empty

rhmin_col:

(int) column of rhmin variable in data file, if it was provided

rhavg:

(ndarray) 1D array of average relative humidity values, which may be empty

rhavg_col:

(int) column of rhavg variable in data file, if it was provided

Returns:
calc_ea:

(ndarray) 1D array of vapor pressure values

calc_tdew:

(ndarray) 1D array of dewpoint temperature values

agweatherqaqc.calc_functions.calc_org_and_opt_rs_tr(mc_iterations, log_path, month, delta_t, mm_delta_t, rs, rso)[source]

This function performs a Monte Carlo simulation on the B coefficients that go into generating Thornton-Running solar radiation, in an attempt to optimize a model that best fits the observed solar radiation data. The best-fit model is then used to fill any missing observations in actual solar radiation for the calculation of reference evapotranspiration. See the function calc_rs_tr() for more information.

The bracket size with which to generate random values is 0.5. This factor was chosen after trying different values on several stations, and it was a good balance between minimizing RMSE and processing speed.

When the script is run in the first mode, only 100 iterations are done to save time. The optimized parameters may be worse than the original ones in this case, so the original parameters are returned as the optimized ones.
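A minimal sketch of this kind of random search is shown below. It rests on several assumptions not confirmed by this page: the Thornton-Running form in rs_tr_model, the reading of the 0.5 bracket as a uniform draw within +/- 50% of each original coefficient, and the hypothetical names optimize_b and mm_delta_t_daily (the monthly mean delta_t expanded to one value per day).

```python
import numpy as np


def rs_tr_model(rso, delta_t, mm_delta_t_daily, b0, b1, b2):
    """Assumed form: Rs = Rso * (1 - 0.9 * exp(-B * delta_t**1.5))."""
    big_b = b0 + b1 * np.exp(b2 * mm_delta_t_daily)
    return rso * (1.0 - 0.9 * np.exp(-big_b * delta_t ** 1.5))


def rmse(observed, modeled):
    """Root mean square error, ignoring days where either value is NaN."""
    return float(np.sqrt(np.nanmean((observed - modeled) ** 2)))


def optimize_b(rs, rso, delta_t, mm_delta_t_daily, iterations=1000,
               bracket=0.5, seed=0):
    """Random search around the original coefficients; keeps the best set."""
    original = np.array([0.031, 0.201, -0.185])
    rng = np.random.default_rng(seed)
    best_b = original
    best_rmse = rmse(rs, rs_tr_model(rso, delta_t, mm_delta_t_daily, *original))
    for _ in range(iterations):
        # Assumption: scale each coefficient by a uniform random factor
        # in [1 - bracket, 1 + bracket].
        candidate = original * rng.uniform(1.0 - bracket, 1.0 + bracket, size=3)
        modeled = rs_tr_model(rso, delta_t, mm_delta_t_daily, *candidate)
        candidate_rmse = rmse(rs, modeled)
        if candidate_rmse < best_rmse:
            best_b, best_rmse = candidate, candidate_rmse
    return best_b, best_rmse
```

Because the search starts from the original coefficients, a run in which no candidate improves the RMSE returns the original values, which matches the fallback described above.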

Args:
mc_iterations:

(int) number of iterations in monte carlo simulation

log_path:

(str) path to log file that we will write the b coefficients and other relevant info to

month:

(ndarray) 1D numpy array of months within dataset

delta_t:

(ndarray) 1D numpy array of difference between maximum and minimum temperature values

mm_delta_t:

(ndarray) monthly averaged delta_t (12 values total) values

rs:

(ndarray) 1D numpy array of observed solar radiation values in W/m2

rso:

(ndarray) 1D numpy array of clear-sky solar radiation values in W/m2

Returns:
org_rs_tr:

(ndarray) 1D numpy array of thornton-running solar radiation with original B coefficient values

mm_org_rs_tr:

(ndarray) 1D numpy array of monthly averaged org_rs_tr (12 values total) values

opt_rs_tr:

(ndarray) 1D numpy array of thornton-running solar radiation with optimized B coefficient values

mm_opt_rs_tr:

(ndarray) 1D numpy array of monthly averaged opt_rs_tr (12 values total) values

agweatherqaqc.calc_functions.calc_rs_tr(month, rso, delta_t, mm_delta_t, b_zero, b_one, b_two)[source]

Calculates theoretical daily solar radiation according to the Thornton and Running 1999 model. Paper can be found here: http://www.engr.scu.edu/~emaurer/chile/vic_taller/papers/thornton_running_1997.pdf
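As a rough illustration of how the arguments below fit together, the following sketch assumes the commonly cited form Rs = Rso * (1 - 0.9 * exp(-B * delta_t**1.5)) with B = b_zero + b_one * exp(b_two * mm_delta_t). The function name calc_rs_tr_sketch and the handling of months with no data are assumptions, not the package's code.

```python
import numpy as np


def calc_rs_tr_sketch(month, rso, delta_t, mm_delta_t,
                      b_zero=0.031, b_one=0.201, b_two=-0.185):
    # Look up each day's monthly mean delta_t (month values run 1-12).
    daily_mm_delta_t = mm_delta_t[month - 1]
    big_b = b_zero + b_one * np.exp(b_two * daily_mm_delta_t)
    rs_tr = rso * (1.0 - 0.9 * np.exp(-big_b * delta_t ** 1.5))
    # Mean rs_tr for each calendar month; NaN where a month has no data.
    mm_rs_tr = np.array([
        np.nanmean(rs_tr[month == m]) if np.any(month == m) else np.nan
        for m in range(1, 13)
    ])
    return rs_tr, mm_rs_tr
```

Under this form the modeled radiation never exceeds rso, and a day with no temperature range (delta_t of 0) yields 10% of rso.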

Args:
month:

(ndarray) 1D numpy array of months within dataset

rso:

(ndarray) 1D numpy array of clear-sky solar radiation values in W/m2

delta_t:

(ndarray) 1D numpy array of difference between maximum and minimum temperature values

mm_delta_t:

(ndarray) monthly averaged delta_t (12 values total) values

b_zero:

(float) first B coefficient used in calculation of rs_tr, original value is 0.031

b_one:

(float) second B coefficient used in calculation of rs_tr, original value is 0.201

b_two:

(float) third B coefficient used in the calculation of rs_tr, original value is -0.185

Returns:
rs_tr:

(ndarray) 1D numpy array of thornton-running solar radiation

mm_rs_tr:

(ndarray) mean monthly averaged rs_tr (12 values total) values

agweatherqaqc.calc_functions.calc_rso_and_refet(lat, elev, wind_anemom, doy, month, tmax, tmin, ea, uz, rs)[source]

Calculates clear-sky solar radiation and reference evapotranspiration variables using the refet package (https://github.com/DRI-WSWUP/RefET)

Args:
lat:

(float) station latitude in decimal degrees

elev:

(float) station elevation in meters

wind_anemom:

(float) height of windspeed anemometer in meters

doy:

(ndarray) 1D numpy array of day of year in record

month:

(ndarray) 1D numpy array of current month in record

tmax:

(ndarray) 1D numpy array of maximum temperature values

tmin:

(ndarray) 1D numpy array of minimum temperature values

ea:

(ndarray) 1D numpy array of vapor pressure in kPa

uz:

(ndarray) 1D numpy array of average windspeed values

rs:

(ndarray) 1D numpy array of solar radiation values

Returns:
rso:

(ndarray) 1-D array of clear sky solar radiation

monthly_rs:

(ndarray) 1-D array of monthly averaged solar radiation (12 values total) values

eto:

(ndarray) 1-D array of grass reference evapotranspiration in units mm/day

etr:

(ndarray) 1-D array of alfalfa reference evapotranspiration in units mm/day

monthly_eto:

(ndarray) 1-D array of monthly averaged grass reference ET (12 values total) values

monthly_etr:

(ndarray) 1-D array of monthly averaged alfalfa reference ET (12 values total) values

agweatherqaqc.calc_functions.calc_temperature_variables(month, tmax, tmin, tdew)[source]

Calculates the secondary temperature variables like mean monthly values
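The calculation can be sketched with NumPy as shown below, following the definitions of delta_t and k_not given under Returns. The names monthly_mean and temperature_variables_sketch are hypothetical, and leaving months without data as NaN is an assumption.

```python
import numpy as np


def monthly_mean(values, month):
    """Mean of values for each calendar month 1-12, ignoring NaN."""
    out = np.full(12, np.nan)
    for m in range(1, 13):
        in_month = values[month == m]
        if np.any(~np.isnan(in_month)):
            out[m - 1] = np.nanmean(in_month)
    return out


def temperature_variables_sketch(month, tmax, tmin, tdew):
    delta_t = tmax - tmin  # daily temperature range
    k_not = tmin - tdew    # daily difference between tmin and dewpoint
    return (delta_t, monthly_mean(delta_t, month),
            k_not, monthly_mean(k_not, month),
            monthly_mean(tmin, month), monthly_mean(tdew, month))
```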

Args:
month:

(ndarray) 1D numpy array of month values for use in mean monthly calculations

tmax:

(ndarray) 1D numpy array of maximum temperature values

tmin:

(ndarray) 1D numpy array of minimum temperature values

tdew:

(ndarray) 1D numpy array of dewpoint temperature values

Returns:
delta_t:

(ndarray) the daily difference between maximum temperature and minimum temperature

monthly_delta_t:

(ndarray) monthly averaged delta_t (12 values total) values

k_not:

(ndarray) the daily difference between minimum temperature and dewpoint temperature

monthly_k_not:

(ndarray) monthly averaged k_not (12 values total) values

monthly_tmin:

(ndarray) monthly averaged minimum temperature (12 values total) values

monthly_tdew:

(ndarray) monthly averaged dewpoint temperature (12 values total) values

agweatherqaqc.qaqc_functions

agweatherqaqc.qaqc_functions.additive_corr(log_writer, start, end, var_one, var_two)[source]

Corrects provided interval with a flat, user-provided additive modifier obtained via the CLI

Args:
log_writer:

Wrapper for writing to log file.

start:

(int) starting index of correction interval.

end:

(int) ending index of correction interval.

var_one:

(ndarray) 1-D array of first variable.

var_two:

(ndarray) 1-D array of second variable, may be entirely NaN.

Returns:
corr_var_one:

(ndarray) 1-D array of first variable after correction.

corr_var_two:

(ndarray) 1-D array of second variable after correction, may be entirely NaN.

agweatherqaqc.qaqc_functions.compiled_humidity_adjustment(station, log_path, folder_path, dt_array, tmax, tmin, tavg, compiled_ea, ea, ea_col, tdew, tdew_col, tdew_ko, rhmax, rhmax_col, rhmin, rhmin_col, rhavg, rhavg_col)[source]

This function displays the ‘compiled’ ea generated from all available humidity data, and the user will have the option to overwrite sections of the ‘compiled’ ea with ea generated from a variable of their choice, should a higher priority humidity variable have worse data than a lower priority one.

Example 1:

A station has both vapor pressure (daily average calculated from 15 minute intervals) and RH Maximum and Minimum (daily values). The humidity compilation function will only use RHMax and RHMin to calculate vapor pressure if there is a gap in the provided vapor pressure data. However, for some reason the vapor pressure data is bad, either from a faulty sensor or problem with the sampling, while the contemporaneous RH data is good. This function will allow you to graphically select the ‘bad’ section of vapor pressure data and overwrite it with the vapor pressure calculated from the present RH Maximum and minimum data.

Example 2:

A station has both vapor pressure (daily average calculated from 15 minute intervals) and RH Maximum and Minimum (daily values). Data from all variables is bad for the periods of 01/2016-12/2016. This function would allow you to fill in the ‘compiled’ ea values from 2016 with values from Tmin - Ko

Args:
station:

(str) station name for saving files

log_path:

(str) path to log file

folder_path:

(str) path to correction files directory

dt_array:

(ndarray) 1-D datetime array used for bokeh plots

tmax:

(ndarray) 1-D array of maximum temperature values

tmin:

(ndarray) 1-D array of minimum temperature values

tavg:

(ndarray) 1-D array of average temperature values

compiled_ea:

(ndarray) the array of ea values that has been generated from all provided humidity variables

ea:

(ndarray) 1-D array of vapor pressure values, which may be empty

ea_col:

(int) used to determine if ea was provided by the data source

tdew:

(ndarray) 1-D array of dewpoint temperature values, which may be empty

tdew_col:

(int) column of Tdew variable in data file, if it is provided

tdew_ko:

(ndarray) 1-D array of dewpoint temperature values, where missing values are filled in by Tmin-Ko curve

rhmax:

(ndarray) 1-D array of maximum relative humidity values, which may be empty

rhmax_col:

(int) column of rhmax variable in data file, if it was provided

rhmin:

(ndarray) 1-D array of minimum relative humidity values, which may be empty

rhmin_col:

(int) column of rhmin variable in data file, if it was provided

rhavg:

(ndarray) 1-D array of average relative humidity values, which may be empty

rhavg_col:

(int) column of rhavg variable in data file, if it was provided

Returns:
edited_compiled_ea:

(ndarray) ea array that has had selected sections replaced by the selected sources

agweatherqaqc.qaqc_functions.correction(station, log_path, folder_path, var_one, var_two, dt_array, month, year, code, auto_corr=0)[source]

This main qaqc function takes in two variables and, depending on the code provided, enables different correction methods for the user to apply to the data. This function serves as the wrapper/handler for all other correction method functions. Once a correction has been applied, the user has the option to do multiple iterations before finishing. All actions taken are recorded in the log file.

After each iteration a bokeh graph is generated that shows the changes that have occurred. After the user decides to completely finish with corrections, one final bokeh plot is generated that shows the final corrected product against the uncorrected data that was initially passed in.

Args:
station:

(str) station name for saving files

log_path:

(str) path to log file

folder_path:

(str) path to correction files directory

var_one:

(ndarray) 1-D numpy array of first variable passed

var_two:

(ndarray) 1-D numpy array of second variable, may be all NaN

dt_array:

(ndarray) 1-D datetime array used for bokeh plotting

month:

(ndarray) 1-D numpy array of month values

year:

(ndarray) 1-D numpy array of year values

code:

(int) used to determine what variables are actually passed as var_one and var_two

auto_corr:

(int) flag for the “automatic first pass” mode, which auto-applies default correction first

Returns:
corr_var_one:

(ndarray) 1-D numpy array of corrected var_one values

corr_var_two:

(ndarray) 1-D numpy array of corrected var_two values

agweatherqaqc.qaqc_functions.generate_interval(var_size)[source]

Generates menu and obtains user selection on what intervals the user wants to correct via the CLI

Args:
var_size:

(int) size of the input data, used to prevent creation of an out-of-bounds index

Returns:
int_start:

(int) index at which the user wants to start the correction

int_end:

(int) index at which the user wants to end the correction

agweatherqaqc.qaqc_functions.modified_z_score_outlier_detection(data)[source]

Calculates the modified z scores of the provided dataset and sets to NaN any values that are above the threshold. The modified z approach and the threshold of 3.5 are recommended in:

Boris Iglewicz and David Hoaglin (1993), “Volume 16: How to Detect and Handle Outliers”, The ASQC Basic References in Quality Control: Statistical Techniques

Modified z scores are more robust than traditional z scores because they are determined by the median, which is less susceptible to outliers.
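For reference, the modified z score of a value x is 0.6745 * (x - median) / MAD, where MAD is the median absolute deviation from the median. A minimal sketch follows; the name modified_z_score_sketch and the choice to leave the data untouched when MAD is zero are assumptions rather than the package's behavior.

```python
import numpy as np


def modified_z_score_sketch(data, threshold=3.5):
    cleaned = np.array(data, dtype=float)
    median = np.nanmedian(cleaned)
    mad = np.nanmedian(np.abs(cleaned - median))  # median absolute deviation
    if mad == 0 or np.isnan(mad):
        return cleaned, 0  # assumption: nothing is flagged when MAD is zero
    scores = 0.6745 * (cleaned - median) / mad
    outliers = np.abs(scores) > threshold  # NaN compares False, so gaps stay
    cleaned[outliers] = np.nan
    return cleaned, int(outliers.sum())
```

For the values 1, 2, 3, 4 and 100, the median is 3 and the MAD is 1, so only 100 (score of about 65) exceeds the threshold.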

Args:
data:

(ndarray) 1-D array of values

Returns:
cleaned_data:

(ndarray) 1-D array of values that have had outliers removed

outlier_count:

(int) number of outliers removed

agweatherqaqc.qaqc_functions.multiplicative_corr(log_writer, start, end, var_one, var_two)[source]

Corrects provided interval with a user-provided multiplicative modifier obtained from the CLI

Args:
log_writer:

Wrapper for writing to log file

start:

(int) starting index of correction interval

end:

(int) ending index of correction interval

var_one:

(ndarray) 1-D numpy array of first variable

var_two:

(ndarray) 1-D numpy array of second variable, may be entirely nan’s

Returns:
corr_var_one:

(ndarray) 1-D array of first variable after correction

corr_var_two:

(ndarray) 1-D array of second variable after correction, may be entirely nan’s

agweatherqaqc.qaqc_functions.rh_yearly_percentile_corr(log_writer, start, end, rhmax, rhmin, year, percentage)[source]

Performs a year-based percentile correction on relative humidity. It works on the assumption that, in areas with significant agriculture, every year should have at least a few observations where RHMax hits 100% (such as when it rains). This is a concise way to solve sensor drift issues that may arise. The correction strength is determined only by RHMax values, but the correction is also applied to RHMin values, as they are obtained by the same sensor and likely suffer from the same drift.
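One way to read this description is sketched below. It is an interpretation and not the package's code: the factor for each year is taken as 100 divided by the mean of the top `percentage` percent of RHMax values, the result is capped at 100%, and the start/end interval and logging are left out. The name rh_percentile_corr_sketch is hypothetical.

```python
import numpy as np


def rh_percentile_corr_sketch(rhmax, rhmin, year, percentage):
    corr_rhmax = np.array(rhmax, dtype=float)
    corr_rhmin = np.array(rhmin, dtype=float)
    for yr in np.unique(year):
        in_year = year == yr
        valid = corr_rhmax[in_year]
        valid = valid[~np.isnan(valid)]
        if valid.size == 0:
            continue
        # Number of observations in the top `percentage` percent, at least one.
        n_top = max(1, int(round(valid.size * percentage / 100.0)))
        top_mean = np.sort(valid)[-n_top:].mean()
        factor = 100.0 / top_mean
        # The same factor is applied to both variables for this year.
        corr_rhmax[in_year] *= factor
        corr_rhmin[in_year] *= factor
    # Assumption: corrected humidity is capped at the physical limit of 100%.
    return np.minimum(corr_rhmax, 100.0), np.minimum(corr_rhmin, 100.0)
```

A year whose highest RHMax readings average 80% would have both variables scaled by 1.25, while a year that already reaches 100% is left unchanged.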

Args:
log_writer:

Wrapper for writing to log file

start:

(int) starting index of correction interval

end:

(int) ending index of correction interval

rhmax:

(ndarray) 1-D array of rhmax values

rhmin:

(ndarray) 1-D array of rhmin

year:

(ndarray) 1-D array of year values

percentage:

(int) what top yearly percentage of observations user wants to base correction on

Returns:
corr_rhmax:

(ndarray) 1-D array of rhmax values after correction is applied

corr_rhmin:

(ndarray) 1-D array of rhmin values after correction is applied

agweatherqaqc.qaqc_functions.rs_period_ratio_corr(log_writer, start, end, rs, rso, sample_size_per_period, period)[source]

This function corrects rs by applying a correction factor (a ratio of clear-sky solar radiation (rso) over observed solar radiation (rs)) to each user defined period to counteract sensor drift and other errors.

The start and end of the correction interval are used to cut a section from both rs and rso, and these sections are divided into user-defined periods. Each period then has a correction factor calculated from the user-specified number of points with the largest rs/rso ratios. The rs and rso values of those points are each averaged, and the average rso is divided by the average rs to get a final ratio, which is multiplied by all points within its corresponding period.

Within each period, the code checks for the existence of potential isolated erroneous readings (electrical shorts, datalogger errors, etc.), which it does by looking at how the correction factor changes by shifting which values are included.

The logic here being that erroneous values generally appear as a “spike” of Rs that is significantly higher than Rso, which would heavily influence the correction factor due to it being the ratio of averages. “Spikes” are identified through testing the rs/rso ratios of a period with a couple intuitive rules:

RULE 1:

The removal of the data from any one rso/rs ratio from the correction factor should not massively shift (>2%) the value of that correction factor if sensor drift/miscalibration has occurred. While there may be variation between individual solar radiation observations due to cloudiness, over the period each cloud-free day should have approximately the same ratio with its corresponding clear-sky solar radiation value. This logic also holds true when parsing a period that needs no drift adjustment but may have bad observations.

RULE 2:

Does the average Rs value (calculated from the underlying Rs values of the Rs/Rso ratios included in the correction factor) exceed the average Rso value by at least 75 W/m**2? The main purpose of this rule is to warn the user when RULE 1 has been satisfied but this rule has not. This would occur when the observational data has a high concentration of bad values. The recommended course of action when this rule is violated is to remove the data manually.

If RULE 1 is violated, this whole process is repeated with the next largest ratio until the correction factor no longer changes significantly. Values determined to be bad are set to a marker value and later set equal to Rso * 1.05.

Example:

sorted_ratio_list = [3, 1.5, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.7, 0.7…]

We have the original correction factor of the 1st-6th largest rs/rso ratios, and we compute it again by dropping the largest ratio and including the next largest. Ex: 1st-6th largest would become 2nd-7th largest. We check whether this shift in included values causes a larger than 2% change in the correction factor.

Here the correction factor changes significantly between values 1st-6th and 2nd-7th, and again between 2nd-7th and 3rd-8th. It does not change significantly between values 3rd-8th and 4th-9th.

The correction factor applied to the data is therefore based on values 3rd-8th, and the 1st and 2nd values are set equal to the corresponding Rso values * 1.05 after the data is corrected.

If a period does not contain enough valid data points to fill the user-specified number, the entire period is thrown out. To prevent the code from correcting data beyond the point of believability, if the correction factor is below 0.5 or above 1.5, the data for that period is removed instead.

In addition, if the correction factor X satisfies 0.97 < X < 1.03, the data is left unchanged under the assumption that the sensor was behaving as expected.

Finally, the function returns the corrected solar radiation that has had erroneous readings removed (if applicable) and the period-based correction factor applied. Post-correction Rs data that exceeds Rso by 3% is clipped to Rso.
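The sliding-window check from the worked example can be sketched for a single period as follows. This covers RULE 1 only. The function name period_correction_factor, the assumption that the inputs contain no NaN values, and measuring the 2% shift relative to the current factor are all assumptions made for illustration.

```python
import numpy as np


def period_correction_factor(rs, rso, sample_size, max_shift=0.02):
    """Return (correction_factor, spike_indices) for one period."""
    order = np.argsort(rs / rso)[::-1]  # indices, largest rs/rso ratio first

    def factor(first):
        # Ratio of average rso to average rs over the current window.
        window = order[first:first + sample_size]
        return rso[window].mean() / rs[window].mean()

    first = 0
    # Slide the window down the ranked ratios until the factor stabilizes.
    while first + 1 + sample_size <= order.size:
        current, shifted = factor(first), factor(first + 1)
        if abs(shifted - current) / current <= max_shift:
            break
        first += 1
    # Points skipped before the stable window are treated as spikes.
    return factor(first), order[:first]
```

With the ratio list from the example, a constant rso of 100 and a sample size of 6, the window settles on the 3rd-8th largest ratios, giving a factor of 1.25 and flagging the first two points as spikes.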

Args:
log_writer:

Wrapper for writing to the log file

start:

(int) starting index of correction interval

end:

(int) ending index of correction interval

rs:

(ndarray) 1-D numpy array of rs

rso:

(ndarray) 1-D numpy array of rso

sample_size_per_period:

(int) number of points in each period correction factors are calculated with

period:

(int) length of each correction period within the user-specified interval

Returns:
corr_rs:

(ndarray) 1-D array of corrected rs values

rso:

(ndarray) 1-D array, not actually changed, is returned for consistent behavior in main qaqc function.

agweatherqaqc.qaqc_functions.set_to_nan(log_writer, start, end, var_one, var_two)[source]

Sets entire provided interval to nans, likely because the observations are bad and need to be thrown out.

Args:
log_writer:

Wrapper for writing to log file

start:

(int) starting index of correction interval

end:

(int) ending index of correction interval

var_one:

(ndarray) 1-D array of first variable

var_two:

(ndarray) 1-D array of second variable, may be entirely nan’s

Returns:
corr_var_one:

(ndarray) 1-D array of first variable after data was removed

corr_var_two:

(ndarray) 1-D array of second variable after data was removed, may be entirely nan’s

agweatherqaqc.qaqc_functions.temp_find_outliers(log_writer, var_one, var_one_name, var_two, var_two_name, month)[source]

Wrapper function for modified_z_score_outlier_detection() that will process provided temperature variables. Due to seasonal variation in temperature the overall temperature record is subset into months (ex. all January observations are grouped together) and modified_z_score_outlier_detection() is run 12 times.
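The month-by-month grouping can be sketched as below for a single variable. The name remove_monthly_outliers is hypothetical, the modified z score is computed inline to keep the example self-contained, and logging is omitted.

```python
import numpy as np


def remove_monthly_outliers(values, month, threshold=3.5):
    cleaned = np.array(values, dtype=float)
    removed = 0
    for m in range(1, 13):
        idx = np.flatnonzero(month == m)
        subset = cleaned[idx]
        if np.all(np.isnan(subset)):  # also true for a month with no data
            continue
        median = np.nanmedian(subset)
        mad = np.nanmedian(np.abs(subset - median))
        if mad == 0:
            continue  # assumption: skip months with no spread
        scores = 0.6745 * (subset - median) / mad
        bad = idx[np.abs(scores) > threshold]
        cleaned[bad] = np.nan
        removed += bad.size
    return cleaned, removed
```

Grouping by month means a 30 degree reading is flagged among January values near 0 to 2 degrees but kept among July values near 28 to 31 degrees.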

Args:
log_writer:

Wrapper for writing to log file

var_one:

(ndarray) 1-D array of first variable, either tmax or tmin

var_one_name:

(str) name for var one

var_two:

(ndarray) 1-D array of second variable, either tmin or tdew

var_two_name:

(str) name for var two

month:

(ndarray) 1-D array of month values

Returns:
corrected_var_one:

(ndarray) 1-D array of first variable after data was removed

corrected_var_two:

(ndarray) 1-D array of second variable after data was removed

agweatherqaqc.utils

agweatherqaqc.utils.determine_delimiter(file_path)[source]

Uses the csv.Sniffer class to determine the delimiter of an input file. It parses the first 5 lines and raises an error if the delimiter is not consistent.
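A minimal sketch of this approach using the standard library is shown below. The name sniff_delimiter and the list of candidate delimiters are assumptions; csv.Sniffer.sniff() raises csv.Error when it cannot settle on a delimiter.

```python
import csv
from itertools import islice


def sniff_delimiter(file_path, line_count=5, candidates=",;\t|"):
    # Read only the first few lines so large files are not loaded fully.
    with open(file_path, newline="") as handle:
        sample = "".join(islice(handle, line_count))
    # Restricting the candidates keeps characters such as "-" in dates
    # from being mistaken for a delimiter.
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
```

The returned string could then be passed as the sep argument of pandas.read_csv().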

Args:
file_path:

(str) path to file to parse

Returns:
delim:

(str) delimiter for the input file, to be used in pandas.read_csv()

agweatherqaqc.utils.get_float_input(prompt='Enter your choice: ')[source]

Prompts the user for a float input and handles invalid input.

Args:
prompt:

(str) prompt to display to the user

Returns:
float_input:

(float) sanitized value entered by the user

agweatherqaqc.utils.get_int_input(start_val, end_val, prompt='Enter your choice: ')[source]

Prompts the user for an integer input within a specified range and handles input that is invalid or falls outside the expected range.
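A validation loop of this kind might look like the sketch below. The read parameter is a hypothetical hook, not part of the documented signature, added so the loop can be exercised without a terminal. Treating both ends of the range as inclusive is also an assumption.

```python
def get_int_input_sketch(start_val, end_val, prompt="Enter your choice: ",
                         read=input):
    """Keep prompting until an integer within [start_val, end_val] is given."""
    while True:
        raw = read(prompt)
        try:
            value = int(raw)
        except ValueError:
            print("Please enter a whole number.")
            continue
        if start_val <= value <= end_val:
            return value
        print(f"Please enter a value between {start_val} and {end_val}.")
```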

Args:
start_val:

(int) start of acceptable integer values

end_val:

(int) end of acceptable integer values

prompt:

(str) prompt to display to the user

Returns:
int_input:

(int) sanitized integer value entered by the user

agweatherqaqc.utils.validate_file(file_path, expected_extensions)[source]

Checks to see if provided path is valid, while also checking to see if file is of expected type. Raises exceptions if either of those fail. Returns nothing.

Args:
file_path:

(str) path to file

expected_extensions:

(list) possible expected file types