Script to archive and zip files based on elements in the filename - HELP NE

Hi,

I have a shell script that is meant to check whether files of a particular type exist in a directory. The file names look like this:

1003_oxnn2_BSC48379.1.200603130500.768741

1003_oxnn2_BSC48379.2.200603130500.768742

1003_oxnn2_BSC48379.72.200603130500.768753

1003_oxnn2_BSC48379.79.200603130500.768758

1004_oxnn3_BSC48380.1.200603150510.768759

1004_oxnn3_BSC48380.2.200603150510.768760

1004_oxnn3_BSC48380.72.200603150510.768761

1004_oxnn3_BSC48380.79.200603150510.768762

1003_oxnn2_BSC48379.1.200603160700.768741

1003_oxnn2_BSC48379.2.200603160700.768742

1003_oxnn2_BSC48379.72.200603160700.768753

1003_oxnn2_BSC48379.79.200603160700.768758

These files could be represented as follows in the script:

<OSS_Instance>_<OSS_Name>_<BSC ID>.<counter no>.<timestamp>.<id>

where:

i) timestamp is in the format: YYYYMMDDHHMM

ii) counter no is one of 1, 2, 72 and 79 only
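
A minimal sketch of how one such name could be split into its fields using only shell parameter expansion. The function name `parse_name` and the `P_*` variable names are illustrative and not part of my script. It assumes a POSIX shell such as ksh (the legacy Bourne shell does not support these expansions) and that neither OSS_Instance nor OSS_Name contains an underscore or a dot.

```shell
#!/bin/sh
# parse_name FILE
# Splits <OSS_Instance>_<OSS_Name>_<BSC ID>.<counter no>.<timestamp>.<id>
# into the P_* variables below (hypothetical names, for illustration only).
parse_name() {
    name=$1
    P_ID=${name##*.}            # text after the last dot
    rest=${name%.*}             # drop ".<id>"
    P_TIMESTAMP=${rest##*.}     # YYYYMMDDHHMM
    rest=${rest%.*}             # drop ".<timestamp>"
    P_COUNTER=${rest##*.}       # 1, 2, 72 or 79
    rest=${rest%.*}             # <OSS_Instance>_<OSS_Name>_<BSC ID>
    P_BSC=${rest##*_}           # text after the last underscore
    rest=${rest%_*}             # <OSS_Instance>_<OSS_Name>
    P_OSS_NAME=${rest##*_}
    P_OSS_INSTANCE=${rest%_*}
}

parse_name 1003_oxnn2_BSC48379.1.200603130500.768741
echo "$P_OSS_INSTANCE $P_OSS_NAME $P_BSC $P_COUNTER $P_TIMESTAMP $P_ID"
```

This does the same job as the chain of awk calls in the script further down, without starting a new process for each field.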

Here's what the script is to do:

1) For all of these files located in the directory, it needs to group the files by their OSS_Instance, OSS_Name, BSC ID and timestamp, and check which counter_no are present in each group.

It has to ensure that the files for all 4 counter_no (1, 2, 72 and 79) of a specific OSS_Instance, OSS_Name, BSC ID and timestamp have arrived in the directory within the specified timeout period. If so, it archives and zips the 4 files of that group into a file with the following naming format:

<OSS_Instance>_<OSS_Name>_<BSC ID>.<timestamp>.tar.gz
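
As a rough illustration of this naming rule, the archive name can be derived from any member file name by dropping the counter no and the id. The helper name `archive_name` is hypothetical; the sketch assumes the timestamp is always exactly 12 digits and prints nothing for names that do not fit the pattern.

```shell
#!/bin/sh
# archive_name FILE
# Prints <OSS_Instance>_<OSS_Name>_<BSC ID>.<timestamp>.tar.gz for a valid
# input name, and nothing at all for a name that does not match.
archive_name() {
    printf '%s\n' "$1" |
        sed -n 's/^\(.*\)\.[0-9][0-9]*\.\([0-9]\{12\}\)\.[0-9][0-9]*$/\1.\2.tar.gz/p'
}

archive_name 1003_oxnn2_BSC48379.72.200603130500.768753
```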

As an example, the files listed below,

1003_oxnn2_BSC48379.1.200603130500.768741

1003_oxnn2_BSC48379.2.200603130500.768742

1003_oxnn2_BSC48379.72.200603130500.768753

1003_oxnn2_BSC48379.79.200603130500.768758

1004_oxnn3_BSC48380.1.200603150510.768759

1004_oxnn3_BSC48380.2.200603150510.768760

1004_oxnn3_BSC48380.72.200603150510.768761

1004_oxnn3_BSC48380.79.200603150510.768762

1003_oxnn2_BSC48379.1.200603160700.768741

1003_oxnn2_BSC48379.2.200603160700.768742

1003_oxnn2_BSC48379.72.200603160700.768753

1003_oxnn2_BSC48379.79.200603160700.768758

would be archived and zip-ed as:

1003_oxnn2_BSC48379.200603130500.tar.gz

[contents:

1003_oxnn2_BSC48379.1.200603130500.768741

1003_oxnn2_BSC48379.2.200603130500.768742

1003_oxnn2_BSC48379.72.200603130500.768753

1003_oxnn2_BSC48379.79.200603130500.768758

]

1004_oxnn3_BSC48380.200603150510.tar.gz

[ contents:

1004_oxnn3_BSC48380.1.200603150510.768759

1004_oxnn3_BSC48380.2.200603150510.768760

1004_oxnn3_BSC48380.72.200603150510.768761

1004_oxnn3_BSC48380.79.200603150510.768762

]

1003_oxnn2_BSC48379.200603160700.tar.gz

[ contents:

1003_oxnn2_BSC48379.1.200603160700.768741

1003_oxnn2_BSC48379.2.200603160700.768742

1003_oxnn2_BSC48379.72.200603160700.768753

1003_oxnn2_BSC48379.79.200603160700.768758

]
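
The grouping shown in this example can be sketched with a single awk pass over the file names. This is only an illustration: `group_files` is a made-up name, and it relies on the group key being the first and third dot-separated fields of each name.

```shell
#!/bin/sh
# group_files
# Reads file names on stdin and prints one line per group:
#   <archive name>: <counter_no values seen, in input order>
group_files() {
    awk -F. 'NF == 4 {
                 key = $1 "." $3                 # prefix + timestamp
                 seen[key] = seen[key] " " $2    # remember the counter no
             }
             END { for (k in seen) print k ".tar.gz:" seen[k] }' | sort
}

printf '%s\n' \
    1003_oxnn2_BSC48379.1.200603130500.768741 \
    1003_oxnn2_BSC48379.2.200603130500.768742 \
    1004_oxnn3_BSC48380.1.200603150510.768759 |
    group_files
```

A group is complete when its line lists all four of 1, 2, 72 and 79.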

The scenario above also applies if the files for all 4 counter_no of a specific OSS_Instance, OSS_Name, BSC ID and timestamp arrive before the timeout period expires.

If the files for some of the 4 counter_no (1, 2, 72 and 79) of a specific OSS_Instance, OSS_Name, BSC ID and timestamp have still not arrived in the directory after the specified timeout period, the script takes whichever files have already arrived and archives and zips them, using the same naming convention as above:

Example:

The files listed below are the only files that have arrived by the time the specified timeout period expires.

1003_oxnn2_BSC48379.1.200603130500.768741

1003_oxnn2_BSC48379.2.200603130500.768742

1003_oxnn2_BSC48379.72.200603130500.768753

1004_oxnn3_BSC48380.1.200603150510.768759

1004_oxnn3_BSC48380.2.200603150510.768760

1003_oxnn2_BSC48379.2.200603160700.768742

1003_oxnn2_BSC48379.72.200603160700.768753

Therefore, the script archives and zips these files respectively to the files:

1003_oxnn2_BSC48379.200603130500.tar.gz

[

contents:

1003_oxnn2_BSC48379.1.200603130500.768741

1003_oxnn2_BSC48379.2.200603130500.768742

1003_oxnn2_BSC48379.72.200603130500.768753

]

1004_oxnn3_BSC48380.200603150510.tar.gz

[

contents:

1004_oxnn3_BSC48380.1.200603150510.768759

1004_oxnn3_BSC48380.2.200603150510.768760

]

1003_oxnn2_BSC48379.200603160700.tar.gz

[

contents:

1003_oxnn2_BSC48379.2.200603160700.768742

1003_oxnn2_BSC48379.72.200603160700.768753

]

However, if the timeout period has not expired and not all files of the 4 counter_no for a specific OSS_Instance, OSS_Name, BSC ID and timestamp have arrived, the script waits for the remaining files of that group.
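
The three cases described so far can be summarised as a small decision function. This is a sketch of the intended rule only; the function name `group_action` and its three return words are invented for illustration.

```shell
#!/bin/sh
# group_action COUNTERS_PRESENT AGE_MINUTES TIMEOUT_MINUTES
# Prints what should happen to one group of files.
group_action() {
    if [ "$1" -ge 4 ]; then
        echo archive            # all of 1, 2, 72 and 79 have arrived
    elif [ "$2" -ge "$3" ]; then
        echo archive_partial    # timeout expired: archive whatever arrived
    else
        echo wait               # incomplete, but still within the timeout
    fi
}

group_action 4 1 5
group_action 3 7 5
group_action 3 2 5
```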

The timeout period referred to here is measured as the difference between the current time and the file modification time.
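
One way to compute that difference is to work in seconds since the epoch rather than subtracting day, hour and minute fields taken from `ls -l`, which can go wrong across month boundaries and stops showing a time at all for files older than about six months. The sketch below assumes GNU `stat` and a `date` that supports `+%s`; neither is guaranteed on every Unix, so treat it as an illustration rather than a drop-in fix. `file_age_minutes` is a hypothetical helper name.

```shell
#!/bin/sh
# file_age_minutes FILE
# Prints the whole number of minutes since FILE was last modified.
# Assumes GNU stat (-c %Y = mtime in epoch seconds) and date +%s.
file_age_minutes() {
    mtime=$(stat -c %Y "$1") || return 1
    now=$(date +%s)
    echo $(( (now - mtime) / 60 ))
}
```

The result could then be compared directly with the timeout, for example `[ "$(file_age_minutes "$file")" -ge "$TIMEOUT" ]`.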

The problem:

My script is unable to check whether the files of the 4 counter_no for a specific OSS_Instance, OSS_Name, BSC ID and timestamp have arrived, so it cannot archive and zip them grouped by these elements (counter_no, OSS_Instance, OSS_Name, BSC ID and timestamp). It should produce:

1003_oxnn2_BSC48379.200603160700.tar.gz for ANY or ALL of the files:

1003_oxnn2_BSC48379.1.200603160700.768741

1003_oxnn2_BSC48379.2.200603160700.768742

1003_oxnn2_BSC48379.72.200603160700.768753

1003_oxnn2_BSC48379.79.200603160700.768758

1003_oxnn2_BSC48379.200603130500.tar.gz for ANY or ALL of the files:

1003_oxnn2_BSC48379.1.200603130500.768741

1003_oxnn2_BSC48379.2.200603130500.768742

1003_oxnn2_BSC48379.72.200603130500.768753

1003_oxnn2_BSC48379.79.200603130500.768758

1004_oxnn3_BSC48380.200603150510.tar.gz for ANY or ALL of the files:

1004_oxnn3_BSC48380.1.200603150510.768759

1004_oxnn3_BSC48380.2.200603150510.768760

1004_oxnn3_BSC48380.72.200603150510.768761

1004_oxnn3_BSC48380.79.200603150510.768762

Below is the output captured from the script execution:

DATE is Mon Mar 27 22:12:34 MYT 2006

file is 1003_oxnn2_BSC48379.1.200603130507.768741

GID is 768741

COUNTER_GRP is 1

HERE ....

FDAY 27

FHOUR 19

FMINUTE 40

CDAY is 27

CHOUR is 22

CMINUTE is 12

TOTAL_HOURS away is 3

TOTAL_MINUTES is 152

TOTAL_MINUTES exceeds TIMEOUT

FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT

ERROR_TYPE is MISSING_FILES

flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err

ERROR_TYPE is TIMEOUT_EXPIRED

flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err

file is 1003_oxnn2_BSC48379.2.200603130513.768742

GID is 768742

COUNTER_GRP is 2

HERE ....

FDAY 27

FHOUR 21

FMINUTE 28

CDAY is 27

CHOUR is 22

CMINUTE is 12

TOTAL_HOURS away is 1

TOTAL_MINUTES is 44

TOTAL_MINUTES exceeds TIMEOUT

FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT

ERROR_TYPE is MISSING_FILES

flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err

ERROR_TYPE is TIMEOUT_EXPIRED

flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err

file is 1003_oxnn2_BSC48379.72.200603130511.768753

GID is 768753

COUNTER_GRP is 72

HERE ....

FDAY 27

FHOUR 21

FMINUTE 26

CDAY is 27

CHOUR is 22

CMINUTE is 12

TOTAL_HOURS away is 1

TOTAL_MINUTES is 46

TOTAL_MINUTES exceeds TIMEOUT

FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT

ERROR_TYPE is MISSING_FILES

flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err

ERROR_TYPE is TIMEOUT_EXPIRED

flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err

file is 1003_oxnn2_BSC48379.79.200603130512.768758

GID is 768758

COUNTER_GRP is 79

HERE ....

FDAY 27

FHOUR 21

FMINUTE 28

CDAY is 27

CHOUR is 22

CMINUTE is 12

TOTAL_HOURS away is 1

TOTAL_MINUTES is 44

TOTAL_MINUTES exceeds TIMEOUT

NUM_FILES IN DIR EQUALS NUMBER OF EXPECTED FILES

ERROR_TYPE is TIMEOUT_EXPIRED

flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err

file is 1004_oxnn3_BSC48380.1.200603130507.768741

GID is 768741

COUNTER_GRP is 1

HERE ....

FDAY 27

FHOUR 21

FMINUTE 31

CDAY is 27

CHOUR is 22

CMINUTE is 12

TOTAL_HOURS away is 1

TOTAL_MINUTES is 41

TOTAL_MINUTES exceeds TIMEOUT

FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT

ERROR_TYPE is MISSING_FILES

flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err

ERROR_TYPE is TIMEOUT_EXPIRED

flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err

file is 1004_oxnn3_BSC48380.2.200603130507.768741

GID is 768741

COUNTER_GRP is 2

HERE ....

FDAY 27

FHOUR 21

FMINUTE 31

CDAY is 27

CHOUR is 22

CMINUTE is 12

TOTAL_HOURS away is 1

TOTAL_MINUTES is 41

TOTAL_MINUTES exceeds TIMEOUT

FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT

ERROR_TYPE is MISSING_FILES

flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err

ERROR_TYPE is TIMEOUT_EXPIRED

flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err

/tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1003_oxnn2_BSC48379.1.200603130507.768741 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1003_oxnn2_BSC48379.2.200603130513.768742 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1003_oxnn2_BSC48379.72.200603130511.768753 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1003_oxnn2_BSC48379.79.200603130512.768758 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1004_oxnn3_BSC48380.1.200603130507.768741 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1004_oxnn3_BSC48380.2.200603130507.768741

-n Inside tarAndZip_file() ...

TAR_FILE_NAME is 1004_oxnn3_BSC48380.200603130507.tar

a /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1004_oxnn3_BSC48380.2.200603130507.768741 1K

The script is provided as follows:

#!/bin/sh

# Set the path for the Unix executables

AWK="/usr/bin/awk"

TAR="/usr/bin/tar"

GZIP="/usr/bin/gzip"

GREP="/usr/xpg4/bin/grep"

VENDOR_TECH="NOKIA-RAN"

FTYPE="RAN"

BASE_DIR="/tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client"

INPUT_DIR="$BASE_DIR/in"

OUTPUT_DIR="$BASE_DIR/out"

TEMP_DIR="$BASE_DIR/tmp"

LOG_DIR="$BASE_DIR/log"

DATE=`date`

echo DATE is $DATE

if [ ! -d "$INPUT_DIR" -o ! -d "$OUTPUT_DIR" -o ! -d "$TEMP_DIR" -o ! -d "$LOG_DIR" ] ; then

echo "Directories missing. Please check if input, output, temp and log directories are created."

exit 1

fi

# Timeout in minutes to wait the missing file

TIMEOUT=5

TOTAL_MINUTES=0

# Function to check the timeout for generating the TAR, even

# if a file is still missing

check_timeout() {

# File date/time

## FDAY, FHOUR and FMINUTE values are being passed by arguments into check_timeout

FDAY=$1

FHOUR=$2

FMINUTE=$3

echo FDAY $FDAY

echo FHOUR $FHOUR

echo FMINUTE $FMINUTE

# Current date/time

CDAY=`date +%d`

CHOUR=`date +%H`

CMINUTE=`date +%M`

echo CDAY is $CDAY

echo CHOUR is $CHOUR

echo CMINUTE is $CMINUTE

E_DAYS=`expr $CDAY - $FDAY`

#echo E_DAYS is $E_DAYS

E_HOURS=`expr $CHOUR - $FHOUR`

#echo E_HOURS is $E_HOURS

E_MINS=`expr $CMINUTE - $FMINUTE`

#echo E_MINS is $E_MINS

if [ $E_DAYS -gt 0 ] ; then

echo E_DAYS greater than 0

TOTAL_HOURS=`expr $E_DAYS \* 24 + $E_HOURS`

echo TOTAL_HOURS here is $TOTAL_HOURS

elif [ $E_DAYS -lt 0 ] ; then

echo E_DAYS less than 0

TOTAL_HOURS=`expr 24 + $E_HOURS`

echo TOTAL_HOURS there is $TOTAL_HOURS

else

#echo TOTAL_HOURS is E_HOURS

TOTAL_HOURS=$E_HOURS

echo TOTAL_HOURS away is $TOTAL_HOURS

fi

TOTAL_MINUTES=`expr $TOTAL_HOURS \* 60 + $E_MINS`

echo TOTAL_MINUTES is $TOTAL_MINUTES

}

# Error messages

MSG_TIMEOUT_EXPIRED="TIMEOUT waiting files for group has expired."

MSG_MISSING_FILES="The number of files is lower than expected. It will wait until TIMEOUT expires."

MSG_INVALID_FILE_NAME="The input filename is invalid. TAR file cannot be generated for this file."

MSG_PRG_EXECUTION_FAILED="The execution of this program has failed. Files have been moved to the TEMP directory."

# Result flag to check if it is ok to TAR the files

RESULT=""

ALL_PRESENT=""

NUM_FILES=0

NUM_EXPECTED_FILES=4

cd $INPUT_DIR

## file format: <OSSInstance>_<OSSName>_BSC<Nokia BSC ID>.<counter no>.<YYYYMMDDHHMM>.<gid>

NOKIA_BSC_ID=""

OSS_INSTANCE=""

OSS_NAME=""

TIMESTAMP=""

# Function to create the error log file

create_log() {

TS=`date +%Y%m%d%H%M`

#ID=$1

ERROR_TYPE=$2

echo ERROR_TYPE is $ERROR_TYPE

case $ERROR_TYPE in

TIMEOUT_EXPIRED) MSG=$MSG_TIMEOUT_EXPIRED ;;

MISSING_FILES) MSG=$MSG_MISSING_FILES ;;

INVALID_FILE_NAME) MSG=$MSG_INVALID_FILE_NAME ;;

PRG_EXECUTION_FAILED) MSG=$MSG_PRG_EXECUTION_FAILED ;;

esac

#flogname="$VENDOR_TECH.$FTYPE.$ERROR_TYPE.$ID.$TS.err"

flogname="$VENDOR_TECH.$FTYPE.$ERROR_TYPE.$TS.err"

echo flogName is $flogname

#echo "[`date`] ID: $ID - $MSG" > $TEMP_DIR/$flogname;

}

tarAndZip_file()

{

#cd $OUTPUT_DIR

cd $TEMP_DIR

OSS_INSTANCE=$1

OSS_NAME=$2

NOKIA_BSC_ID=$3

TIMESTAMP=$4

echo -n "Inside tarAndZip_file() ..."

TAR_FILE_NAME=${OSS_INSTANCE}_${OSS_NAME}_"BSC"${NOKIA_BSC_ID}.${TIMESTAMP}.tar

echo TAR_FILE_NAME is $TAR_FILE_NAME

$TAR -cvf $TAR_FILE_NAME $OUTPUT_DIR/${OSS_INSTANCE}_${OSS_NAME}_${BSC_VAL}.${COUNTER_GRP}.${TIMESTAMP}.${GID}

$GZIP $TAR_FILE_NAME

#echo $TAR_FILE_NAME is now $tarZipFile

}

## Loop all files in the directory to check if there's a match

for file in `ls $INPUT_DIR | $GREP -E '^[a-zA-Z0-9]+_[a-zA-Z0-9]+_[a-zA-Z0-9]+\.(1|2|72|79)\.[0-9]{12}\.[0-9]'`

do

echo file is $file

if [ -f $file ];then

OSS_INSTANCE=`echo $file | awk 'BEGIN { FS = "_"} {print $1}'` ##1003

#echo OSS_INSTANCE $OSS_INSTANCE

OSS_NAME=`echo $file | awk 'BEGIN { FS = "_"} {print $2}'` ## oxnn2

#echo OSS_NAME $OSS_NAME

TMP_VAR=`echo $file | awk 'BEGIN { FS = "_"} {print $3}'`

#echo TMP_VAR $TMP_VAR

BSC_VAL=`echo $TMP_VAR |awk 'BEGIN { FS = "."}{print $1}'` ## BSC34565

#echo BSC_VAL is $BSC_VAL

NOKIA_BSC_ID=`echo $BSC_VAL|cut -c4-`

#echo NOKIA_BSC_ID is $NOKIA_BSC_ID

TIMESTAMP=`echo $TMP_VAR |awk 'BEGIN { FS = "."}{print $3}'` ##200603130500

#echo TIMESTAMP is $TIMESTAMP

GID=`echo $TMP_VAR |awk 'BEGIN { FS = "."}{print $4}'`

echo GID is $GID

COUNTER_GRP=`echo $TMP_VAR |awk 'BEGIN { FS = "."}{print $2}'`

echo COUNTER_GRP is $COUNTER_GRP

GRP_DAY=`ls -l $file | $AWK 'BEGIN { FS = " "} { print $7 }'`

#echo GRP_DAY $GRP_DAY

hrmin=`ls -l $file | $AWK 'BEGIN { FS = " "} { print $8 }'` ##11:18

#echo hrmin $hrmin

GRP_HOUR=`echo $hrmin | $AWK 'BEGIN { FS = ":"} { print $1 }'`

#echo GRP_HOUR $GRP_HOUR

GRP_MIN=`echo $hrmin | $AWK 'BEGIN { FS = ":"} { print $2 }'`

#echo GRP_MIN $GRP_MIN

NUM_FILES=`expr $NUM_FILES + 1`

#echo NUM_FILES now is $NUM_FILES

if [ -z "$OSS_INSTANCE" -o -z "$OSS_NAME" -o -z "$NOKIA_BSC_ID" -o -z "$TIMESTAMP" ];then

create_log "$file" "INVALID_FILE_NAME"

rm $file

continue

else

## valid filename. Therefore, a potential candidate file to be tar-ed & gz-ed

## Check if all files are in the dir. Yes - check timeout period

#if [ `ls ${OSS_INSTANCE}_${OSS_NAME}_${BSC_VAL}.*.${TIMESTAMP}.* |wc -l` -eq 4 ];then

#TMP_NUM_FILES=`ls ${OSS_INSTANCE}_${OSS_NAME}_${BSC_VAL}.*.${TIMESTAMP}.* |wc -l`

#if [ $TMP_NUM_FILES -eq 4 ];then

echo "HERE ...."

RESULT="OK"

check_timeout $GRP_DAY $GRP_HOUR $GRP_MIN

if [ $TOTAL_MINUTES -ge $TIMEOUT ] ; then

echo TOTAL_MINUTES exceeds TIMEOUT

#cp $file $TEMP_DIR

if [ $NUM_FILES -eq $NUM_EXPECTED_FILES ] ; then

ALL_PRESENT="OK"

echo "NUM_FILES IN DIR EQUALS NUMBER OF EXPECTED FILES"

### NEED TO COPY ALL FILES, NOT JUST 1 FILE!!!

cp $file $OUTPUT_DIR

#tarAndZip_file $OSS_INSTANCE $OSS_NAME $NOKIA_BSC_ID $TIMESTAMP

else

echo FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT

create_log "$file" "MISSING_FILES"

cp $file $OUTPUT_DIR

#tarAndZip_file $OSS_INSTANCE $OSS_NAME $NOKIA_BSC_ID $TIMESTAMP

fi

create_log "$file" "TIMEOUT_EXPIRED"

## not timeout yet, check if all 4 files have arrived

## if all 4 files arrived, tar & zip them all up.

else

echo TOTAL_MINUTES below TIMEOUT

#create_log "$file" "MISSING_FILES"

if [ $NUM_FILES -eq $NUM_EXPECTED_FILES ] ; then

ALL_PRESENT="OK"

echo "NUM_FILES IN DIR EQUALS NUMBER OF EXPECTED FILES"

cp $file $OUTPUT_DIR

#tarAndZip_file $OSS_INSTANCE $OSS_NAME $NOKIA_BSC_ID $TIMESTAMP

break

fi

fi

fi

#fi

else

continue

fi

done

TAR_FILES=`find $OUTPUT_DIR -type f`

echo $TAR_FILES

if [ -n "$TAR_FILES" ] && { [ "$TOTAL_MINUTES" -ge "$TIMEOUT" ] || [ "$NUM_FILES" -eq "$NUM_EXPECTED_FILES" ]; } ;then

tarAndZip_file $OSS_INSTANCE $OSS_NAME $NOKIA_BSC_ID $TIMESTAMP

else

echo NO FILES COLLECTED. NOTHING TO TAR & ZIP

fi
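
For comparison, here is a sketch of how the loop might be restructured so that it iterates over groups rather than over individual files. It is not my working script: it assumes a POSIX shell plus GNU `stat` and `date +%s`, the function name `archive_groups` is invented, and it measures the timeout from the oldest file in each group, which is only one possible reading of the requirement.

```shell
#!/bin/sh
# Sketch only: archive each group once it is complete or its timeout expired.
# Assumes a POSIX shell plus GNU stat and date; names here are illustrative.

TIMEOUT=${TIMEOUT:-5}        # minutes, same default as the script above
COUNTERS="1 2 72 79"         # the four expected counter_no values

# archive_groups IN_DIR OUT_DIR   (OUT_DIR must be an absolute path)
archive_groups() {
    in_dir=$1
    out_dir=$2
    now=$(date +%s)
    (
        cd "$in_dir" || exit 1
        # Reduce every valid file name to its group key:
        # <OSS_Instance>_<OSS_Name>_<BSC ID>.<timestamp>
        ls | grep -E '^[A-Za-z0-9]+_[A-Za-z0-9]+_[A-Za-z0-9]+\.(1|2|72|79)\.[0-9]{12}\.[0-9]+$' |
            awk -F. '{ print $1 "." $3 }' | sort -u |
        while read -r key; do
            prefix=${key%.*}
            stamp=${key##*.}
            present=0          # how many of the four counters have arrived
            oldest=$now        # mtime of the oldest member (assumed reference)
            members=""
            for c in $COUNTERS; do
                found=0
                for f in "$prefix.$c.$stamp."*; do
                    [ -f "$f" ] || continue
                    found=1
                    members="$members $f"
                    m=$(stat -c %Y "$f")
                    [ "$m" -lt "$oldest" ] && oldest=$m
                done
                present=$((present + found))
            done
            [ -n "$members" ] || continue
            age=$(( (now - oldest) / 60 ))
            if [ "$present" -eq 4 ] || [ "$age" -ge "$TIMEOUT" ]; then
                # File names contain no whitespace, so $members is unquoted
                tar -cf "$out_dir/$key.tar" $members &&
                    gzip -f "$out_dir/$key.tar"
            fi
        done
    )
}
```

It would be called as `archive_groups "$INPUT_DIR" "$OUTPUT_DIR"`. The sketch leaves the input files in place, so a real script would still have to move or delete them after archiving to avoid picking them up again on the next run.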

Could anyone kindly help me with this problem by pointing out where I went wrong? I would greatly appreciate any examples.

Thanks

Danny

[18530 byte] By [Danny_Fang_Jing_Ling] at [2007-11-25 23:40:49]
# 1
First -- repost your script with proper indentation, using the "code" tags provided here. Make sure your code is readable before you ask someone to help!
implicate_order at 2007-7-5 18:48:33