o ng=-@sddlZddlZddlZddlZddlZddlZddlm Z ej dZ ej dej e ddlmZddlZddZddZd d Zd d Zd dZddZddZddZddZddZddZddZddZdd Z d!d"Z!d#d$Z"dS)%N) timedeltaUSH_DIR) prune_datac Csd|vrd|dd}||t||dkrA|\}}|jt|dtt|d}|jt|dtt|d} || f} | S|dkrI|} | Sdt|d d }||t|) NzFATAL ERROR: One or more FCST_z_HOURS is Nonetype. This may bez# because the input string is empty.INIT)hour)hoursVALIDz FATAL ERROR: Invalid DATE_TYPE: z. Valid values arez VALID or INIT)error ValueErrorreplacemintdmaxstrupper) logger date_type date_range date_hoursfleadseinit_beginit_end valid_beg valid_end valid_rangerC/lfs/h1/ops/prod/packages/evs.v1.0.19/ush/rtofs/df_preprocessing.pyget_valid_ranges4     rcCsndd| D} dttj}tj|t|dt| dt| dt||}tj|rt t |r{| d|dd|t |||||t|t| t| t|t| dd| Dt| | | |Sd |d }d |d }||||t|d |d |d}d|d }||||t|d |)NcSg|]}t|qSrr.0modelrrr 9z"run_prune_data..tmp_zLooking for stat files in z using thez template: cSrrr )r"Z fcst_var_namerrrr$Mr%z FATAL ERROR: z exists but is empty.z Populate z and retry. z does not exist.zCreate and populate )ruuiduuid4hexospathjoinrisdirlenlistdirinforlowerr OSError)r stats_dir prune_diroutput_base_template verif_case verif_type line_typer eval_periodvar_namefcst_var_names model_listobtypedomaintmp_dirpruned_data_dire1e2rrrrun_prune_data6sJ           rEcCs8|jr|d|d|dd|ddSdS)Nz Called from :z4Empty Dataframe encountered while filtering a subsetz of input statistics...z(========================================TF)emptywarningr2)dfrZ called_fromrrr check_empty^s rJc s&dd|D}|dd}|dd}|D]tj|td}tj|s`tfddd DsG|td d |d |d |dd tdd|dq|sj|d|zqt |}t ||t| }t ||f}tj|ddd|td}dt|}||dD] }||t||<qt|||| | | | | || }z t||g}Wnty|}Ynty}z|}WYd}~nd}~wwWqtjjy}z|||d|||dWYd}~qd}~wty,}z|||d|||dWYd}~qd}~ww|r`zt|Wn(ty_}z|||d|||dWYd}~nd}~wwzt||drkWdS|jddd|WSty}z|||dWYd}~dSd}~ww)NcSrrr r!rrrr$lr%zcreate_df..rz %HZ %d %B %Yz.statc3s|] }|tvVqdSNr )r"Z group_namer#rr ss  zcreate_df..)groupsetz is not a model in .z'You might check whether the stats_dir (z ) includesz, data according to the output_base template,z given domain, variable, etc...zContinuing ...z*Creating dataframe using pruned data from T)delim_whitespaceheaderskiprowsnamesdtypezThe file in question:zThe directory in question: create_df)dropinplacez:Nonexistent dataframe. Check the logfile for more details.)strftimer,r-r.risfileanyrHdebug plot_utilget_stat_file_base_columnsget_stat_file_line_type_columnsrnp concatenatepdread_csvr0astypefloat run_filtersconcat NameErrorUnboundLocalErrorerrorsEmptyDataErrorr4shutilrmtreerJ reset_index)rr5rBr:rr> met_versionclear_prune_dirr9r= obs_var_namesinterpr@rrZ start_stringZ end_stringfpathZdf_og_colnamesZdf_line_type_colnamesZ df_colnamesZdf_tmpicol_namerIrrrMrrYhs                      rYcCs|dur|St|dvr+||djd|djd@}||dd}nt|dvrF||djd|djdB}t||d|S) N)pres upper_airFCST_LEVPOBS_LEVOBTYPEZONLYSF)sfc conus_sfc polar_sfcfilter_by_level_type)rr3 startswitheqrJ)rIrr9rrrrs" rcCs<|dur|S||d||d|@}t||d|S)NFCST_VAROBS_VARfilter_by_var_name)isinrJ)rIrr=rtrrrrs   rcCs6|dur|S||dt|}t||d|S)N INTERP_MTHDfilter_by_interp)rrrrJ)rIrrurrrrs  rcC2|dur|S||dt|}t||d|S)Nr~filter_by_obtyperrrJ)rIrr?rrrr  rcCr)NVX_MASKfilter_by_domainr)rIrr@rrrrrrcCs,tdd|dD|d<t||d|S)NcSsg|] }t|ddqS)N)int)r"leadrrrr$sz%create_lead_hours.. FCST_LEAD LEAD_HOURScreate_lead_hours)rcarrayrJrIrrrrrs rcCs&tj|ddd|d<t||d|S)NFCST_VALID_ENDz %Y%m%d_%H%M%S)formatrcreate_valid_datetime)re to_datetimerJrrrrrs rcsSrL)rq enumeraterJrr)rIrcreate_init_datetimesrcCsR|dur|S|j|t||dk|t||dk@}t||d|S)NrrKfilter_by_date_range)locrrrJ)rIrrrrrrrs rcsT|dur|St||dr|S|jfdd|t|jjD}t||d|S)Nfilter_by_hourcsg|]}|vqSrr)r"xrrrr$r%z"filter_by_hour..)rJrrrdtr)rIrrrrrrrs ( rcCsrt|||| | }t||||||||| | | |||}t|||||||||| ||||| }|dur7t||dr7d}|S)Nget_preprocessed_data)rrErYrJ)rr5r6r7r8r9r:rrr;rrr<r=rtr>r?r@rurrrsrrBrIrrrrs      rc Cspt|||}t||||}t|||}t|||}t||}t||}t||}t||||}t|||| }|SrL) rrrrrrrrr) rIrr9r=rtrur@rrrrrrri(s      ri)#r,sysror)numpyrcpandasredatetimerr environ SETTINGS_DIRr-insertabspathprune_stat_filesrr`rrErJrYrrrrrrrrrrrrirrrrs6    ( R