Script for finding XML errors
- Subject: Script for finding XML errors
- From: Nathan Hadfield <email@hidden>
- Date: Thu, 1 Nov 2007 17:16:45 -0600
In case it helps anyone else, here's a little bash script I wrote to
find HTML files with XML errors and to print summary totals. (It calls
tidy, which is included in Leopard.) Since we have a ton of files that
need to be validated for proper XML, I'm using this to get a high-
level view of how many files are OK and how many still have errors.
You use it like this:
xmlcheck [directory]
It will find all .html files under the directory and identify any with
XML formatting problems.
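
As a rough sketch of the idea behind the check (not the script itself): tidy
exits with status 0 for clean input, 1 when there are only warnings, and 2
when there are errors, so each file can be sorted into a bucket from the exit
status alone. The function name classify_status is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: map a tidy exit status to the label used in the report.
classify_status() {
  case "$1" in
    0) echo "OK" ;;        # no problems reported
    1) echo "WARNINGS" ;;  # warnings only
    2) echo "ERRORS" ;;    # at least one error
    *) echo "UNKNOWN" ;;   # anything else (e.g. tidy not installed)
  esac
}

# Example use, assuming tidy is on the PATH and page.html exists:
#   tidy -xml -e -q "page.html" >/dev/null 2>&1
#   classify_status $?
```
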
To see other options (show errors/warnings only, show verbose
messages), use
xmlcheck -h
for help.
-Nathan
(Source below. Also attached.)
Attachment:
xmlcheck
Description: Binary data
-----------------------< copy and paste >-----------------------
#!/bin/bash
# Uses 'tidy' to find XML errors in HTML files
# Separate tokens by newline only (needed so the 'find' loop works with
# filenames having spaces)
export IFS=$'\n'
TOTAL_COUNT=0
ERR_COUNT=0
WARN_COUNT=0
SHOW_WARNINGS=1
SHOW_OK=1
VERBOSE=0
FILES=
while getopts 'ewvh' OPTION; do
case $OPTION in
e) SHOW_WARNINGS=0
SHOW_OK=0
;;
w) SHOW_WARNINGS=1
SHOW_OK=0
;;
v) VERBOSE=1
;;
h|?) printf "Usage: %s [-e|w] [-v] [-h] [root directory]\n\n" "$(basename "$0")" >&2
printf " Uses 'tidy' command to find XML formatting errors in .html files.\n\n" >&2
printf " -e Show only files with errors\n" >&2
printf " -w Show only files with errors or warnings\n" >&2
printf " -v Verbose; show error and warning messages\n" >&2
printf " -h Help\n\n" >&2
exit 2
;;
esac
done
shift $(($OPTIND - 1))
FILES=$*
if [[ "$FILES" == "" ]]; then
FILES="."
fi
for i in $(find $FILES -name "*.html"); do
(( TOTAL_COUNT += 1 ))
TIDY_OUT=$(tidy -xml -e -q "$i" 2>&1)
ERROR_CODE=$?
if [[ "$ERROR_CODE" == "2" ]]; then
(( ERR_COUNT += 1))
echo "ERRORS! $i" >&2
if [[ $VERBOSE == 1 ]]; then
echo "$TIDY_OUT" >&2 # quoted so multi-line messages keep their line breaks
echo >&2
fi
elif [[ "$ERROR_CODE" == "1" ]]; then
(( WARN_COUNT += 1 ))
if [[ $SHOW_WARNINGS == 1 ]]; then
echo "WARNINGS! $i" >&2
if [[ $VERBOSE == 1 ]]; then
echo "$TIDY_OUT" >&2 # quoted so multi-line messages keep their line breaks
echo >&2
fi
fi
else
if [[ $SHOW_OK == 1 ]]; then
echo "OK $i"
fi
fi
done
OK_COUNT=$(( TOTAL_COUNT - ERR_COUNT - WARN_COUNT ))
# Avoid dividing by zero when no .html files were found
DIVISOR=$(( TOTAL_COUNT > 0 ? TOTAL_COUNT : 1 ))
PERCENT_ERR=$(( ERR_COUNT * 100 / DIVISOR ))
PERCENT_WARN=$(( WARN_COUNT * 100 / DIVISOR ))
PERCENT_OK=$(( OK_COUNT * 100 / DIVISOR ))
echo
printf " Total files: %4d\n" $TOTAL_COUNT
echo
printf " OK files: %4d %3d%%\n" $OK_COUNT $PERCENT_OK
printf " Files with errors: %4d %3d%%\n" $ERR_COUNT $PERCENT_ERR
printf " Files with warnings only: %4d %3d%%\n" $WARN_COUNT $PERCENT_WARN
echo
-----------------------< copy and paste >-----------------------
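
The script above sets IFS to a newline so that filenames with spaces survive
the 'find' loop. A possible alternative, shown here only as a sketch, is to
have find emit NUL-delimited names and read them one at a time, which also
copes with newlines in filenames. The function name count_html is
hypothetical, and the sketch assumes bash (it uses 'read -d' and process
substitution).

```shell
#!/bin/bash
# Hypothetical helper: count the .html files under a directory (default ".").
count_html() {
  local root=${1:-.} count=0 f
  # -print0 separates names with NUL bytes; read -d '' consumes one per loop.
  while IFS= read -r -d '' f; do
    (( count += 1 ))
  done < <(find "$root" -type f -name '*.html' -print0)
  echo "$count"
}
```

Inside such a loop, "$f" would take the place of $i in the script, for
example: tidy -xml -e -q "$f".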
--
Nathan Hadfield
email@hidden