Script to archive and zip files based on elements in the filename - HELP NE
Hi,
I have a shell script that is meant to check whether files of a particular type exist in a directory. The file names take the form:
1003_oxnn2_BSC48379.1.200603130500.768741
1003_oxnn2_BSC48379.2.200603130500.768742
1003_oxnn2_BSC48379.72.200603130500.768753
1003_oxnn2_BSC48379.79.200603130500.768758
1004_oxnn3_BSC48380.1.200603150510.768759
1004_oxnn3_BSC48380.2.200603150510.768760
1004_oxnn3_BSC48380.72.200603150510.768761
1004_oxnn3_BSC48380.79.200603150510.768762
1003_oxnn2_BSC48379.1.200603160700.768741
1003_oxnn2_BSC48379.2.200603160700.768742
1003_oxnn2_BSC48379.72.200603160700.768753
1003_oxnn2_BSC48379.79.200603160700.768758
These files could be represented as follows in the script:
<OSS_Instance>_<OSS_Name>_<BSC ID>.<counter no>.<timestamp>.<id>
where:
i) timestamp is in the format: YYYYMMDDHHMM
ii) counter no is one of 1, 2, 72 and 79 only
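To make the field layout concrete, here is a minimal sketch of splitting one file name into its parts with parameter expansion. parse_name is a hypothetical helper that is not part of the script, and it assumes a POSIX shell (the legacy Solaris /bin/sh does not support these expansions) and that OSS_Instance and OSS_Name never contain an underscore or a dot.

```shell
#!/bin/sh
# Split <OSS_Instance>_<OSS_Name>_<BSC ID>.<counter no>.<timestamp>.<id>
# into its fields. parse_name is a hypothetical helper for illustration.
parse_name() {
    name=$1
    prefix=${name%%.*}          # e.g. 1003_oxnn2_BSC48379
    rest=${name#*.}             # e.g. 72.200603130500.768753
    OSS_INSTANCE=${prefix%%_*}  # text before the first underscore
    tmp=${prefix#*_}            # e.g. oxnn2_BSC48379
    OSS_NAME=${tmp%%_*}
    BSC_VAL=${tmp#*_}
    COUNTER_GRP=${rest%%.*}     # first dot-separated field after the prefix
    rest=${rest#*.}             # e.g. 200603130500.768753
    TIMESTAMP=${rest%%.*}
    GID=${rest#*.}
}

parse_name "1003_oxnn2_BSC48379.72.200603130500.768753"
echo "$OSS_INSTANCE $OSS_NAME $BSC_VAL $COUNTER_GRP $TIMESTAMP $GID"
# prints: 1003 oxnn2 BSC48379 72 200603130500 768753
```

This avoids starting one awk process per field, but the awk calls in the script do the same job where parameter expansion is not available.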
Here's what the script is to do:
1) For all of these files located in the directory, it needs to group them by their OSS_Instance, OSS_Name, BSC ID and timestamp, and check which counter_no files each group contains.
It has to ensure that all 4 counter_no files, i.e. 1, 2, 72 and 79, for a specific OSS_Instance, OSS_Name, BSC ID and timestamp have arrived in the directory within the specified timeout period. If so, it archives and zips the 4 files of that group into a file with the following naming format:
<OSS_Instance>_<OSS_Name>_<BSC ID>.<timestamp>.tar.gz
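As a small illustration of this naming rule, the archive name can be derived from any member file name by dropping the counter and id fields. archive_name is a hypothetical helper, and the sketch assumes the timestamp is always exactly 12 digits.

```shell
#!/bin/sh
# archive_name FILE: print the .tar.gz name of the group FILE belongs to.
# Hypothetical helper; keeps the prefix and timestamp, drops counter and id.
archive_name() {
    echo "$1" | sed 's/^\([^.]*\)\.[0-9]*\.\([0-9]\{12\}\)\..*$/\1.\2.tar.gz/'
}

archive_name "1003_oxnn2_BSC48379.79.200603130500.768758"
# prints: 1003_oxnn2_BSC48379.200603130500.tar.gz
```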
As an example, the files listed below,
1003_oxnn2_BSC48379.1.200603130500.768741
1003_oxnn2_BSC48379.2.200603130500.768742
1003_oxnn2_BSC48379.72.200603130500.768753
1003_oxnn2_BSC48379.79.200603130500.768758
1004_oxnn3_BSC48380.1.200603150510.768759
1004_oxnn3_BSC48380.2.200603150510.768760
1004_oxnn3_BSC48380.72.200603150510.768761
1004_oxnn3_BSC48380.79.200603150510.768762
1003_oxnn2_BSC48379.1.200603160700.768741
1003_oxnn2_BSC48379.2.200603160700.768742
1003_oxnn2_BSC48379.72.200603160700.768753
1003_oxnn2_BSC48379.79.200603160700.768758
would be archived and zipped as:
1003_oxnn2_BSC48379.200603130500.tar.gz
[ contents:
1003_oxnn2_BSC48379.1.200603130500.768741
1003_oxnn2_BSC48379.2.200603130500.768742
1003_oxnn2_BSC48379.72.200603130500.768753
1003_oxnn2_BSC48379.79.200603130500.768758
]
1004_oxnn3_BSC48380.200603150510.tar.gz
[ contents:
1004_oxnn3_BSC48380.1.200603150510.768759
1004_oxnn3_BSC48380.2.200603150510.768760
1004_oxnn3_BSC48380.72.200603150510.768761
1004_oxnn3_BSC48380.79.200603150510.768762
]
1003_oxnn2_BSC48379.200603160700.tar.gz
[ contents:
1003_oxnn2_BSC48379.1.200603160700.768741
1003_oxnn2_BSC48379.2.200603160700.768742
1003_oxnn2_BSC48379.72.200603160700.768753
1003_oxnn2_BSC48379.79.200603160700.768758
]
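The grouping shown in this example could be discovered from a directory listing roughly as follows. list_groups is a hypothetical helper; the sketch accepts any numeric counter rather than only 1, 2, 72 and 79, and assumes file names contain no whitespace.

```shell
#!/bin/sh
# list_groups DIR: print one "<prefix>.<timestamp>" line per group found in DIR.
# Hypothetical helper; names that do not fit the pattern are ignored.
list_groups() {
    ls "$1" |
    sed -n 's/^\([A-Za-z0-9]*_[A-Za-z0-9]*_[A-Za-z0-9]*\)\.[0-9][0-9]*\.\([0-9]\{12\}\)\.[0-9][0-9]*$/\1.\2/p' |
    sort -u
}

# Demo in a scratch directory (mktemp -d is widely available but not POSIX).
demo=$(mktemp -d)
for n in 1003_oxnn2_BSC48379.1.200603130500.768741 \
         1003_oxnn2_BSC48379.2.200603130500.768742 \
         1004_oxnn3_BSC48380.1.200603150510.768759 \
         1003_oxnn2_BSC48379.1.200603160700.768741; do
    : > "$demo/$n"
done
list_groups "$demo"   # three groups for the four files above
rm -rf "$demo"
```

Each printed line is also the base name of the corresponding .tar.gz file.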
The scenario above is also applicable if all files of the 4 counter_no for the specific
OSS_Instance, OSS_Name, BSC ID and timestamp have arrived even before the timeout period.
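A minimal sketch of the "all 4 counter_no have arrived" test for one group, assuming the group is identified by its <OSS_Instance>_<OSS_Name>_<BSC ID> prefix and its timestamp. group_complete is a hypothetical helper, not something the script currently has.

```shell
#!/bin/sh
# group_complete DIR PREFIX TIMESTAMP
# Succeeds only when a file exists for each of the counters 1, 2, 72 and 79.
group_complete() {
    dir=$1
    prefix=$2
    ts=$3
    for c in 1 2 72 79; do
        found=no
        # An unmatched glob stays literal, so the -f test filters it out.
        for f in "$dir/$prefix.$c.$ts."*; do
            if [ -f "$f" ]; then
                found=yes
            fi
        done
        if [ "$found" != yes ]; then
            return 1
        fi
    done
    return 0
}

# Demo: a group with counter 79 still missing is reported as incomplete.
demo=$(mktemp -d)
for c in 1 2 72; do
    : > "$demo/1003_oxnn2_BSC48379.$c.200603130500.76874$c"
done
if group_complete "$demo" 1003_oxnn2_BSC48379 200603130500; then
    echo "complete"
else
    echo "incomplete"
fi
rm -rf "$demo"
```

Checking each counter by name, rather than counting files, keeps two copies of the same counter from being mistaken for a complete group.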
If some files for any of the 4 counter_no, i.e. 1, 2, 72 and 79, for a specific OSS_Instance, OSS_Name, BSC ID and timestamp have still not arrived in the directory after the specified timeout period, the script takes whatever files of that group have already arrived, and archives and zips them using the same naming convention mentioned above.
Example:
The files listed below are the only files that have arrived by the time the specified timeout period has expired.
1003_oxnn2_BSC48379.1.200603130500.768741
1003_oxnn2_BSC48379.2.200603130500.768742
1003_oxnn2_BSC48379.72.200603130500.768753
1004_oxnn3_BSC48380.1.200603150510.768759
1004_oxnn3_BSC48380.2.200603150510.768760
1003_oxnn2_BSC48379.2.200603160700.768742
1003_oxnn2_BSC48379.72.200603160700.768753
Therefore, the script archives and zips these files respectively to the files:
1003_oxnn2_BSC48379.200603130500.tar.gz
[
contents:
1003_oxnn2_BSC48379.1.200603130500.768741
1003_oxnn2_BSC48379.2.200603130500.768742
1003_oxnn2_BSC48379.72.200603130500.768753
]
1004_oxnn3_BSC48380.200603150510.tar.gz
[
contents:
1004_oxnn3_BSC48380.1.200603150510.768759
1004_oxnn3_BSC48380.2.200603150510.768760
]
1003_oxnn2_BSC48379.200603160700.tar.gz
[
contents:
1003_oxnn2_BSC48379.2.200603160700.768742
1003_oxnn2_BSC48379.72.200603160700.768753
]
However, if the timeout period has not expired and not all files of the 4 counter_no for a specific OSS_Instance, OSS_Name, BSC ID and timestamp have arrived, the script waits until all of them have arrived.
The timeout period referred to here is measured as the difference between the current time and the file modification time.
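The rule above (archive when the group is complete or the timeout has expired, otherwise wait) could be sketched as follows. This assumes a find that supports -mmin, which GNU and BSD find do but POSIX does not require; timed_out and decide are hypothetical helpers, and which file's modification time should represent a group is left open here.

```shell
#!/bin/sh
TIMEOUT=5   # minutes, the same value the script uses

# timed_out FILE MINUTES: succeeds when FILE was last modified more than
# roughly MINUTES minutes ago. Relies on the non-POSIX -mmin test of find.
timed_out() {
    [ -n "$(find "$1" -prune -mmin +"$2" 2>/dev/null)" ]
}

# decide PRESENT EXPECTED FILE: print "archive" when the group is complete
# or FILE is older than TIMEOUT, otherwise print "wait".
decide() {
    if [ "$1" -ge "$2" ] || timed_out "$3" "$TIMEOUT"; then
        echo archive
    else
        echo wait
    fi
}

# Demo: a freshly created file with only 2 of 4 counters present.
demo=$(mktemp -d)
: > "$demo/sample"
decide 2 4 "$demo/sample"
rm -rf "$demo"
```

Letting find compare the modification time avoids parsing `ls -l` output and the day, hour and minute arithmetic, which is awkward across month boundaries.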
The problem:
My script is unable to check whether the files of all 4 counter_no for a specific OSS_Instance, OSS_Name, BSC ID and timestamp have arrived, so it cannot archive and zip them grouped by these elements, i.e.
1003_oxnn2_BSC48379.200603160700.tar.gz for ANY or ALL of the files:
1003_oxnn2_BSC48379.1.200603160700.768741
1003_oxnn2_BSC48379.2.200603160700.768742
1003_oxnn2_BSC48379.72.200603160700.768753
1003_oxnn2_BSC48379.79.200603160700.768758
1003_oxnn2_BSC48379.200603130500.tar.gz for ANY or ALL of the files:
1003_oxnn2_BSC48379.1.200603130500.768741
1003_oxnn2_BSC48379.2.200603130500.768742
1003_oxnn2_BSC48379.72.200603130500.768753
1003_oxnn2_BSC48379.79.200603130500.768758
1004_oxnn3_BSC48380.200603150510.tar.gz for ANY or ALL of the files:
1004_oxnn3_BSC48380.1.200603150510.768759
1004_oxnn3_BSC48380.2.200603150510.768760
1004_oxnn3_BSC48380.72.200603150510.768761
1004_oxnn3_BSC48380.79.200603150510.768762
Below is the output captured from the script execution:
DATE is Mon Mar 27 22:12:34 MYT 2006
file is 1003_oxnn2_BSC48379.1.200603130507.768741
GID is 768741
COUNTER_GRP is 1
HERE ....
FDAY 27
FHOUR 19
FMINUTE 40
CDAY is 27
CHOUR is 22
CMINUTE is 12
TOTAL_HOURS away is 3
TOTAL_MINUTES is 152
TOTAL_MINUTES exceeds TIMEOUT
FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT
ERROR_TYPE is MISSING_FILES
flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err
ERROR_TYPE is TIMEOUT_EXPIRED
flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err
file is 1003_oxnn2_BSC48379.2.200603130513.768742
GID is 768742
COUNTER_GRP is 2
HERE ....
FDAY 27
FHOUR 21
FMINUTE 28
CDAY is 27
CHOUR is 22
CMINUTE is 12
TOTAL_HOURS away is 1
TOTAL_MINUTES is 44
TOTAL_MINUTES exceeds TIMEOUT
FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT
ERROR_TYPE is MISSING_FILES
flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err
ERROR_TYPE is TIMEOUT_EXPIRED
flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err
file is 1003_oxnn2_BSC48379.72.200603130511.768753
GID is 768753
COUNTER_GRP is 72
HERE ....
FDAY 27
FHOUR 21
FMINUTE 26
CDAY is 27
CHOUR is 22
CMINUTE is 12
TOTAL_HOURS away is 1
TOTAL_MINUTES is 46
TOTAL_MINUTES exceeds TIMEOUT
FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT
ERROR_TYPE is MISSING_FILES
flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err
ERROR_TYPE is TIMEOUT_EXPIRED
flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err
file is 1003_oxnn2_BSC48379.79.200603130512.768758
GID is 768758
COUNTER_GRP is 79
HERE ....
FDAY 27
FHOUR 21
FMINUTE 28
CDAY is 27
CHOUR is 22
CMINUTE is 12
TOTAL_HOURS away is 1
TOTAL_MINUTES is 44
TOTAL_MINUTES exceeds TIMEOUT
NUM_FILES IN DIR EQUALS NUMBER OF EXPECTED FILES
ERROR_TYPE is TIMEOUT_EXPIRED
flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err
file is 1004_oxnn3_BSC48380.1.200603130507.768741
GID is 768741
COUNTER_GRP is 1
HERE ....
FDAY 27
FHOUR 21
FMINUTE 31
CDAY is 27
CHOUR is 22
CMINUTE is 12
TOTAL_HOURS away is 1
TOTAL_MINUTES is 41
TOTAL_MINUTES exceeds TIMEOUT
FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT
ERROR_TYPE is MISSING_FILES
flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err
ERROR_TYPE is TIMEOUT_EXPIRED
flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err
file is 1004_oxnn3_BSC48380.2.200603130507.768741
GID is 768741
COUNTER_GRP is 2
HERE ....
FDAY 27
FHOUR 21
FMINUTE 31
CDAY is 27
CHOUR is 22
CMINUTE is 12
TOTAL_HOURS away is 1
TOTAL_MINUTES is 41
TOTAL_MINUTES exceeds TIMEOUT
FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT
ERROR_TYPE is MISSING_FILES
flogName is NOKIA-RAN.RAN.MISSING_FILES.200603272212.err
ERROR_TYPE is TIMEOUT_EXPIRED
flogName is NOKIA-RAN.RAN.TIMEOUT_EXPIRED.200603272212.err
/tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1003_oxnn2_BSC48379.1.200603130507.768741 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1003_oxnn2_BSC48379.2.200603130513.768742 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1003_oxnn2_BSC48379.72.200603130511.768753 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1003_oxnn2_BSC48379.79.200603130512.768758 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1004_oxnn3_BSC48380.1.200603130507.768741 /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1004_oxnn3_BSC48380.2.200603130507.768741
-n Inside tarAndZip_file() ...
TAR_FILE_NAME is 1004_oxnn3_BSC48380.200603130507.tar
a /tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client/out/1004_oxnn3_BSC48380.2.200603130507.768741 1K
The script is provided as follows:
#!/bin/sh
# Set the path for the Unix executables
AWK="/usr/bin/awk"
TAR="/usr/bin/tar"
GZIP="/usr/bin/gzip"
GREP="/usr/xpg4/bin/grep"
VENDOR_TECH="NOKIA-RAN"
FTYPE="RAN"
BASE_DIR="/tmp/osspkg2s/uma/umabass/cs_bsspm_aif/client"
INPUT_DIR="$BASE_DIR/in"
OUTPUT_DIR="$BASE_DIR/out"
TEMP_DIR="$BASE_DIR/tmp"
LOG_DIR="$BASE_DIR/log"
DATE=`date`
echo DATE is $DATE
if [ ! -d "$INPUT_DIR" -o ! -d "$OUTPUT_DIR" -o ! -d "$TEMP_DIR" -o ! -d "$LOG_DIR" ] ; then
    echo "Directories missing. Please check if input, output, temp and log directories are created."
    exit 1
fi
# Timeout in minutes to wait for a missing file
TIMEOUT=5
TOTAL_MINUTES=0
# Function to check the timeout for generating the TAR, even
# if a file is still missing
check_timeout() {
    # File date/time
    ## FDAY, FHOUR and FMINUTE values are being passed by arguments into check_timeout
    FDAY=$1
    FHOUR=$2
    FMINUTE=$3
    echo FDAY $FDAY
    echo FHOUR $FHOUR
    echo FMINUTE $FMINUTE
    # Current date/time
    CDAY=`date +%d`
    CHOUR=`date +%H`
    CMINUTE=`date +%M`
    echo CDAY is $CDAY
    echo CHOUR is $CHOUR
    echo CMINUTE is $CMINUTE
    E_DAYS=`expr $CDAY - $FDAY`
    #echo E_DAYS is $E_DAYS
    E_HOURS=`expr $CHOUR - $FHOUR`
    #echo E_HOURS is $E_HOURS
    E_MINS=`expr $CMINUTE - $FMINUTE`
    #echo E_MINS is $E_MINS
    if [ $E_DAYS -gt 0 ] ; then
        echo E_DAYS greater than 0
        TOTAL_HOURS=`expr $E_DAYS \* 24 + $E_HOURS`
        echo TOTAL_HOURS here is $TOTAL_HOURS
    elif [ $E_DAYS -lt 0 ] ; then
        echo E_DAYS less than 0
        TOTAL_HOURS=`expr 24 + $E_HOURS`
        echo TOTAL_HOURS there is $TOTAL_HOURS
    else
        #echo TOTAL_HOURS is E_HOURS
        TOTAL_HOURS=$E_HOURS
        echo TOTAL_HOURS away is $TOTAL_HOURS
    fi
    TOTAL_MINUTES=`expr $TOTAL_HOURS \* 60 + $E_MINS`
    echo TOTAL_MINUTES is $TOTAL_MINUTES
}
# Error messages
MSG_TIMEOUT_EXPIRED="TIMEOUT waiting files for group has expired."
MSG_MISSING_FILES="The number of files is lower than expected. It will wait until TIMEOUT expires."
MSG_INVALID_FILE_NAME="The input filename is invalid. TAR file cannot be generated for this file."
MSG_PRG_EXECUTION_FAILED="The execution of this program has failed. Files have been moved to the TEMP directory."
# Result flag to check if it is ok to TAR the files
RESULT=""
ALL_PRESENT=""
NUM_FILES=0
NUM_EXPECTED_FILES=4
cd $INPUT_DIR
## file format: <OSSInstance>_<OSSName>_BSC<Nokia BSC ID>.<counter no>.<YYYYMMDDHHMM>.<gid>
NOKIA_BSC_ID=""
OSS_INSTANCE=""
OSS_NAME=""
TIMESTAMP=""
# Function to create the error log file
create_log() {
    TS=`date +%Y%m%d%H%M`
    #ID=$1
    ERROR_TYPE=$2
    echo ERROR_TYPE is $ERROR_TYPE
    case $ERROR_TYPE in
        TIMEOUT_EXPIRED) MSG=$MSG_TIMEOUT_EXPIRED ;;
        MISSING_FILES) MSG=$MSG_MISSING_FILES ;;
        INVALID_FILE_NAME) MSG=$MSG_INVALID_FILE_NAME ;;
        PRG_EXECUTION_FAILED) MSG=$MSG_PRG_EXECUTION_FAILED ;;
    esac
    #flogname="$VENDOR_TECH.$FTYPE.$ERROR_TYPE.$ID.$TS.err"
    flogname="$VENDOR_TECH.$FTYPE.$ERROR_TYPE.$TS.err"
    echo flogName is $flogname
    #echo "[`date`] ID: $ID - $MSG" > $TEMP_DIR/$flogname;
}
tarAndZip_file()
{
    #cd $OUTPUT_DIR
    cd $TEMP_DIR
    OSS_INSTANCE=$1
    OSS_NAME=$2
    NOKIA_BSC_ID=$3
    TIMESTAMP=$4
    echo -n "Inside tarAndZip_file() ..."
    TAR_FILE_NAME=${OSS_INSTANCE}_${OSS_NAME}_"BSC"${NOKIA_BSC_ID}.${TIMESTAMP}.tar
    echo TAR_FILE_NAME is $TAR_FILE_NAME
    $TAR -cvf $TAR_FILE_NAME $OUTPUT_DIR/${OSS_INSTANCE}_${OSS_NAME}_${BSC_VAL}.${COUNTER_GRP}.${TIMESTAMP}.${GID}
    $GZIP $TAR_FILE_NAME
    #echo $TAR_FILE_NAME is now $tarZipFile
}
## Loop all files in the directory to check if there's a match
for file in `ls $INPUT_DIR | $GREP -E '^[a-zA-Z0-9]+_[a-zA-Z0-9]+_[a-zA-Z0-9]+\.(1|2|72|79)\.[0-9]{12}\.[0-9]'`
do
    echo file is $file
    if [ -f $file ];then
        OSS_INSTANCE=`echo $file | awk 'BEGIN { FS = "_"} {print $1}'` ##1003
        #echo OSS_INSTANCE $OSS_INSTANCE
        OSS_NAME=`echo $file | awk 'BEGIN { FS = "_"} {print $2}'` ## oxnn2
        #echo OSS_NAME $OSS_NAME
        TMP_VAR=`echo $file | awk 'BEGIN { FS = "_"} {print $3}'`
        #echo TMP_VAR $TMP_VAR
        BSC_VAL=`echo $TMP_VAR |awk 'BEGIN { FS = "."}{print $1}'` ## BSC34565
        #echo BSC_VAL is $BSC_VAL
        NOKIA_BSC_ID=`echo $BSC_VAL|cut -c4-`
        #echo NOKIA_BSC_ID is $NOKIA_BSC_ID
        TIMESTAMP=`echo $TMP_VAR |awk 'BEGIN { FS = "."}{print $3}'` ##200603130500
        #echo TIMESTAMP is $TIMESTAMP
        GID=`echo $TMP_VAR |awk 'BEGIN { FS = "."}{print $4}'`
        echo GID is $GID
        COUNTER_GRP=`echo $TMP_VAR |awk 'BEGIN { FS = "."}{print $2}'`
        echo COUNTER_GRP is $COUNTER_GRP
        GRP_DAY=`ls -l $file | $AWK 'BEGIN { FS = " "} { print $7 }'`
        #echo GRP_DAY $GRP_DAY
        hrmin=`ls -l $file | $AWK 'BEGIN { FS = " "} { print $8 }'` ##11:18
        #echo hrmin $hrmin
        GRP_HOUR=`echo $hrmin | $AWK 'BEGIN { FS = ":"} { print $1 }'`
        #echo GRP_HOUR $GRP_HOUR
        GRP_MIN=`echo $hrmin | $AWK 'BEGIN { FS = ":"} { print $2 }'`
        #echo GRP_MIN $GRP_MIN
        NUM_FILES=`expr $NUM_FILES + 1`
        #echo NUM_FILES now is $NUM_FILES
        if [ -z $OSS_INSTANCE -o -z $OSS_NAME -o -z $NOKIA_BSC_ID -o -z $TIMESTAMP ];then
            create_log "$file" "INVALID_FILE_NAME"
            rm $file
            continue
        else
            ## valid filename. Therefore, a potential candidate file to be tar-ed & gz-ed
            ## Check if all files are in the dir. Yes - check timeout period
            #if [ `ls ${OSS_INSTANCE}_${OSS_NAME}_${BSC_VAL}.*.${TIMESTAMP}.* |wc -l` -eq 4 ];then
            #TMP_NUM_FILES=`ls ${OSS_INSTANCE}_${OSS_NAME}_${BSC_VAL}.*.${TIMESTAMP}.* |wc -l`
            #if [ $TMP_NUM_FILES -eq 4 ];then
            echo "HERE ...."
            RESULT="OK"
            check_timeout $GRP_DAY $GRP_HOUR $GRP_MIN
            if [ $TOTAL_MINUTES -ge $TIMEOUT ] ; then
                echo TOTAL_MINUTES exceeds TIMEOUT
                #cp $file $TEMP_DIR
                if [ $NUM_FILES -eq $NUM_EXPECTED_FILES ] ; then
                    ALL_PRESENT="OK"
                    echo "NUM_FILES IN DIR EQUALS NUMBER OF EXPECTED FILES"
                    ### NEED TO COPY ALL FILES, NOT JUST 1 FILE!!!
                    cp $file $OUTPUT_DIR
                    #tarAndZip_file $OSS_INSTANCE $OSS_NAME $NOKIA_BSC_ID $TIMESTAMP
                else
                    echo FILES MISSING OR DID NOT ARRIVE AFTER TIMEOUT
                    create_log "$file" "MISSING_FILES"
                    cp $file $OUTPUT_DIR
                    #tarAndZip_file $OSS_INSTANCE $OSS_NAME $NOKIA_BSC_ID $TIMESTAMP
                fi
                create_log "$file" "TIMEOUT_EXPIRED"
            ## not timeout yet, check if all 4 files have arrived
            ## if all 4 files arrived, tar & zip them all up.
            else
                echo TOTAL_MINUTES below TIMEOUT
                #create_log "$file" "MISSING_FILES"
                if [ $NUM_FILES -eq $NUM_EXPECTED_FILES ] ; then
                    ALL_PRESENT="OK"
                    echo "NUM_FILES IN DIR EQUALS NUMBER OF EXPECTED FILES"
                    cp $file $OUTPUT_DIR
                    #tarAndZip_file $OSS_INSTANCE $OSS_NAME $NOKIA_BSC_ID $TIMESTAMP
                    break
                fi
            fi
        fi
        #fi
    else
        continue
    fi
done
TAR_FILES=`find $OUTPUT_DIR -type f`
echo $TAR_FILES
if [ -n "$TAR_FILES" ] && { [ "$TOTAL_MINUTES" -ge "$TIMEOUT" ] || [ "$NUM_FILES" -eq "$NUM_EXPECTED_FILES" ]; } ; then
    tarAndZip_file $OSS_INSTANCE $OSS_NAME $NOKIA_BSC_ID $TIMESTAMP
else
    echo "NO FILES COLLECTED. NOTHING TO TAR & ZIP"
fi
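The captured output shows tar adding only one file to the archive. A minimal sketch of archiving every file of one group instead is given below. It assumes the group is identified by its <OSS_Instance>_<OSS_Name>_<BSC ID> prefix and timestamp, that the destination is an absolute path, and that the archive should hold bare file names; archive_group is a hypothetical helper, not part of the script above.

```shell
#!/bin/sh
# archive_group SRC_DIR DEST_DIR PREFIX TIMESTAMP
# Tars and gzips every file of one group into DEST_DIR/PREFIX.TIMESTAMP.tar.gz.
# DEST_DIR must be an absolute path because tar runs from inside SRC_DIR.
archive_group() {
    src=$1
    dest=$2
    prefix=$3
    ts=$4
    tarfile="$dest/$prefix.$ts.tar"
    # The glob expands to all counter files of this group that have arrived;
    # if none match, tar fails on the literal pattern and we return non-zero.
    ( cd "$src" && tar -cf "$tarfile" "$prefix".*."$ts".* ) || return 1
    gzip -f "$tarfile"
}

# Demo: three of the four counter files are present and get archived together.
in_dir=$(mktemp -d)
out_dir=$(mktemp -d)
for c in 1 2 72; do
    : > "$in_dir/1003_oxnn2_BSC48379.$c.200603130500.76874$c"
done
archive_group "$in_dir" "$out_dir" 1003_oxnn2_BSC48379 200603130500
gzip -dc "$out_dir/1003_oxnn2_BSC48379.200603130500.tar.gz" | tar -tf -
rm -rf "$in_dir" "$out_dir"
```

Passing the group key to the function, rather than relying on COUNTER_GRP and GID left over from the last loop iteration, is what lets the glob pick up all members.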
Could anyone kindly help me out with this problem by pointing out where I went wrong? I would greatly appreciate any examples.
Thanks
Danny

