####  UNIX Script Documentation Block
#
# Script name:   ingest_query
#
# JIF contact:   Keyser      org: NP22      date: 2010-09-09
#
# Abstract: Determines the availability of a group of files on a remote unix
#   machine.  The file containing the list of available data sets is returned
#   in script parameter 2.
#
# Script history log:
# 2006-05-12  D. Keyser   Original version for implementation.  Combines and
#      generalizes previous scripts ingest_cemscsquery (for CEMSCS machine
#      only) and ingest_unixquery (for unix machines only).  Changed to account
#      for all remote machines now being unix (since MVS CEMSCS machine was
#      replaced with unix DDS machine).  Improved documentation and comments,
#      more appropriate messages posted to joblog.
# 2006-09-29  D. Keyser   If query fails on first attempt (for whatever reason)
#      now sleeps 30 sec and tries a second time; if this also fails, script
#      gives up (allows query to bypass possible momentary ftp glitches).
# 2007-05-14  D. Keyser   Now uses imported script variable ITRIES_MAX_QUERY
#      to determine maximum number of failed attempts to query files from
#      remote machine via ftp before giving up (had been hardwired to "2").
# 2008-01-31  D. Keyser   Now treats embedded asterisk ("*") characters in
#      file group name as wildcards matching any string rather than as a
#      wildcard matching exactly 1 character per asterisk.  Now treats question
#      mark ("?") characters in file group name as wildcards matching exactly 1
#      character.  If file group name contains 1 or more "*" or "?" characters,
#      then an asterisk is never placed at end of file group name when doing
#      query (i.e., files to be queried exactly match end of file group name).
#      Removed logic which used awk to extract file names from ftp listing -
#      this is no longer needed now that MVS CEMSCS machine has been retired.
# 2010-06-24  P. O'Reilly Modified to remove section that creates special
#      .netrc for the gp16.ssd.nesdis.noaa.gov system.  This system has been
#      retired.
# 2010-07-06  D. Keyser   Modified to pull files from either CCS machine via
#      sftp.  Will not expand filenames which include substitution characters
#      "?" and "*" in case files being sftp'd are on same CCS machine as that
#      in which the job is running.
# 2010-09-09  B. Katz     Modified to check to see if the lines returned by the
#      'ls' command include the full directory path (which is expected in
#      downstream processing).  If not, adds the directory in front of the
#      filename while creating $DIRFILE.  Otherwise, it passes the lines
#      through to $DIRFILE unchanged.  This is needed because, for the first
#      time, the 'ls' command returns only filenames when querying AMSR-E files
#      on machine ftp.misst.org.  One caveat: this may not work if the ftp 'ls'
#      request includes a directory containing wildcard characters.  Only the
#      filename may contain such characters.
#
# Script parameters: (1) remote_machine
#                    (2) directory_listing_file (output: path to file
#                        containing listing of all available files on remote
#                        unix machine)
#                    (3) file_group (partial remote filename to look for)
#
# Modules and files referenced:
#   scripts    : $DATA/postmsg
#   data cards : none
#   executables: none
#
# Remarks: Invoked by the ush script ingest_process_onetype_neworbits.
#
#   Condition codes:
#       0 - no problem encountered
#     > 0 - some problem encountered
#       Specifically:  1 - Query of file(s) failed
#
# Attributes:
#   Language: aix unix script
#   Machine:  NCEP CCS
####

set -au

echo
echo "#######################################################################"
echo "                        START INGEST_QUERY                             "
echo "#######################################################################"
echo

DEBUGSCRIPTS=${DEBUGSCRIPTS:-OFF}
if [ $DEBUGSCRIPTS = ON -o $DEBUGSCRIPTS = YES ] ; then
   set -x
fi

MACHINE=$1
DIRFILE="$2"
fname=$3

# If the file group name contains one or more embedded asterisks ("*" -
#  wildcard matching any string of 1 or more characters) or one or more
#  question marks ("?" - wildcard matching exactly 1 character), then DON'T
#  put an asterisk at the end of the file group name (i.e., the query will
#  look only for files whose ending characters exactly match the file group
#  name ending characters, including "?"'s).
# The file group name may have "?" as the last character but should never
#  have "*" as the last character.
# ---------------------------------------------------------------------------

# Test the input file group name ($fname); REMOTEDSNGRP is not yet set here
echo "$fname" | grep -Fe "*" -Fe "?"
iret=$?
if [ $iret -eq 0 ]; then
   REMOTEDSNGRP="$fname"
else
   REMOTEDSNGRP="$fname*"
fi

ftout=$DATA/ftout$$

# Get a listing of REMOTEDSNGRP files from the remote unix machine using ftp
# --------------------------------------------------------------------------

set +x
echo
echo "Time is now $(date)."
echo
[ $DEBUGSCRIPTS = ON -o $DEBUGSCRIPTS = YES ]  &&  set -x

ftperror=99
itries=1
while [ $ftperror -gt 0 -a $itries -le $ITRIES_MAX_QUERY ]; do
   [ -s $DATA/ftpquery.output$$ ]  &&  rm $DATA/ftpquery.output$$
   if [ $itries -gt 1 ]; then
      cwd=`pwd`
      cd $DATA
      msg="QUERY OF $3 FILES FAILED!!!! - SLEEP 30 SEC AND TRY AGAIN."
      $DATA/postmsg "$jlogfile" "$msg"
      cd $cwd
      sleep 30
   fi
   if [ $MACHINE = prodccs.ncep.noaa.gov -o $MACHINE = devccs.ncep.noaa.gov ]
   then
      echo
      echo "Use SFTP."
      echo
      sftp -v $MACHINE <<EOH_sftp > $ftout 2>&1
lls $REMOTEDSNGRP > $DATA/ftpquery.output$$
quit
EOH_sftp
   else
      echo
      echo "Use FTP."
      echo
      ftp -vi $MACHINE <<EOH_ftp > $ftout 2>&1
passive
ls $REMOTEDSNGRP $DATA/ftpquery.output$$
quit
EOH_ftp
   fi
### end ftp instruction input ###

   ftperror=$?

#  Cat out the standard output from the ftp and remove it
#  ------------------------------------------------------

   set +x
   echo
   cat $ftout
   echo
   [ $DEBUGSCRIPTS = ON -o $DEBUGSCRIPTS = YES ]  &&  set -x

   rm $ftout

   [ ! -s $DATA/ftpquery.output$$ ]  &&  ftperror=1
   itries=`expr $itries + 1`
done

itries=`expr $itries - 1`

set +x
echo
echo "Time is now $(date)."
echo
[ $DEBUGSCRIPTS = ON -o $DEBUGSCRIPTS = YES ]  &&  set -x

# If there was an error in the ftp (including if no listing was produced)
#  then exit w/ return code 1
# -----------------------------------------------------------------------

if [ $ftperror -ne 0 ]; then
   [ -s $DATA/ftpquery.output$$ ]  &&  rm $DATA/ftpquery.output$$
   cwd=`pwd`
   cd $DATA
   msg="Exiting with rc = 1 - query of $3 files on remote unix machine \
$MACHINE failed after $itries tries --> non-fatal"
   $DATA/postmsg "$jlogfile" "$msg"
   cd $cwd
   set +x
   echo
   echo " Query of file group $3 on remote unix machine $MACHINE failed \
after $itries tries. "
   echo
   exit 1
fi

# Continue on if no ftp problems
# ------------------------------

cwd=`pwd`
cd $DATA
msg="QUERY OF $3 FILES successful on try no. ${itries}."
$DATA/postmsg "$jlogfile" "$msg"
cd $cwd

# Check to see if the lines returned by the 'ls' command include the full
#  directory path - if not add the directory in front of the filename here
#  while creating $DIRFILE.
# ------------------------------------------------------------------------

echo `head -n1 $DATA/ftpquery.output$$` | grep /
err_grep=$?
if [ $err_grep -eq 0 ]; then
   cp $DATA/ftpquery.output$$ $DIRFILE
else
   direct=$(dirname $REMOTEDSNGRP)
   cat $DATA/ftpquery.output$$ | {
   read directfilename
   iret=$?
   while (( $iret == 0 )) ; do
      filename=$(basename $directfilename)
      if [[ $filename = $directfilename ]] ; then
         echo ${direct}/$filename
      else
         echo $directfilename
      fi
      read directfilename
      iret=$?
   done } > $DIRFILE
fi

rm $DATA/ftpquery.output$$

exit