Tests for and removes UTF-8 BOMs.
#!/bin/bash
for F in $1
do
    if [[ -f $F && $(head -c 3 "$F") == $'\xef\xbb\xbf' ]]; then
        # file exists and has UTF-8 BOM
        # (quoted so names containing spaces survive)
        mv "$F" "$F.bak"
        tail -c +4 "$F.bak" > "$F"
        echo "removed BOM from $F"
    fi
done
USAGE: ./unbom *.txt
The magic is tail -c +4, which prints the file starting at its 4th byte
and so strips the first 3 bytes (the BOM).
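To see the byte arithmetic in isolation, here is a minimal sketch. The helper name drop_first_3_bytes and the temporary file are made up for this illustration and are not part of the original script; the BOM is written with octal escapes so that printf behaves the same in bash and plain sh.

```shell
#!/bin/bash
# Hypothetical helper (illustration only): print a file without its first 3 bytes.
drop_first_3_bytes() {
    # "+4" means "start output at byte 4", so bytes 1-3 are skipped.
    tail -c +4 "$1"
}

demo=$(mktemp)
printf '\357\273\277hello\n' > "$demo"   # EF BB BF in octal, followed by "hello"
wc -c < "$demo"                           # 9 bytes: 3 for the BOM + 6 for "hello\n"
drop_first_3_bytes "$demo" | wc -c        # 6 bytes: only "hello\n" is left
rm -f "$demo"
```

Note that tail -c +3 would be off by one: it starts at byte 3 and leaves the last BOM byte in place.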
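Good, and thanks for sharing. As written, the loop only sees $1, the first argument, so ./unbom *.txt handles just the first matching file. You may want to change: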
for F in $1
do
to something like:
while [[ x$1 != x ]]
do
F="$1"
shift
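Put together, a complete version with that loop might look like the sketch below. The function name unbom_files is hypothetical, and two details are my own assumptions rather than part of the suggestion above: the loop tests $# instead of x$1 so that an empty-string argument does not end it early, and the BOM is built with printf and compared with [ ] so the sketch also runs under plain sh.

```shell
#!/bin/bash
# Sketch only: unbom_files is a made-up name wrapping the original logic.
unbom_files() {
    local bom f
    bom=$(printf '\357\273\277')          # the 3 BOM bytes (EF BB BF) in octal
    while [ "$#" -gt 0 ]                  # loop until every argument is consumed
    do
        f="$1"
        shift
        if [ -f "$f" ] && [ "$(head -c 3 "$f")" = "$bom" ]; then
            # file exists and has UTF-8 BOM: keep a backup, then rewrite without it
            mv "$f" "$f.bak"
            tail -c +4 "$f.bak" > "$f"
            echo "removed BOM from $f"
        fi
    done
}

unbom_files "$@"
```

Saved as unbom, this keeps the same usage (./unbom *.txt) but now visits every file the glob expands to.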
Even more terse (GNU sed):
sed -i '1s/^\xEF\xBB\xBF//' "$1"
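The same one-liner can cover several files at once. The sketch below assumes GNU sed, whose regex syntax accepts \xHH byte escapes and whose -i option treats each file separately, so the address 1 means the first line of every file. The function name unbom_sed is hypothetical, and the .bak suffix is my addition to mirror the backup kept by the original script.

```shell
#!/bin/bash
# Sketch only, assuming GNU sed; unbom_sed is a made-up name.
unbom_sed() {
    # Without file arguments sed would wait on stdin, so do nothing instead.
    [ "$#" -gt 0 ] || return 0
    # -i.bak edits in place and keeps the original as FILE.bak;
    # the substitution only touches line 1 of each file.
    sed -i.bak '1s/^\xEF\xBB\xBF//' "$@"
}

# Usage: unbom_sed *.txt
```

Unlike the head/tail version, this writes a .bak copy for every file it is given, whether or not that file started with a BOM.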