Batch file and command line aficionados

Posted: November 2nd, 2012, 3:00 pm
by Dchall_San_Antonio
I have a problem. I get about 30,000 documents per month in an oddball format. Here is an example:
12345.001
12345.002
12346.001
12346.002
12346.003

The code is documentNumber.pageNumber. These are all TIFF files. If you change the extension to .tif and double click, any photo viewer software will open them. What I want them to be is PDF files where all the pages are in one file called 12345.pdf (with 2 pages) and 12346.pdf (with 3 pages). The 30,000 oddball documents should boil down to 10,000 pdf documents when the process is finished.

I had this identical problem in 2008 with the same stream of documents. Since then the provider changed to a pdf format but about a month ago they reverted to the prehistoric system. The provider cannot read the documents without an expensive reader provided by their contractor. My approach to the problem was to write a bat file to do the conversion. Unfortunately I don't know squat about writing batch files. All I knew was there had to be an approach that worked. Turned out I was right.

I went online to some forum and found some really helpful people. Here is the end result we (THEY) came up with.
@echo off & setlocal EnableExtensions ENableDelayedExpansion
set oldpath=%PATH%
set PATH=%PATH%;c:\bin;
set SRC=c:\Test\SRC\
set DST=c:\Test\DST\
:: Commands are only echoed until %DEB% ist set to nothing
set DBG=ECHO/
::set "DBG="
pushd %SRC%
for /F "tokens=*" %%A in ('dir /B/A-D/ONE "*.001"') do (
set "PG="
for /F "tokens=*" %%B in (
'dir /B/A-D/ONE "%%~nA.0*"') do set PG=!PG! %%~nxB
tiffcp -c lzw !PG! %DST%%%~nA.TIF
tiff2pdf -o %DST%%%~nA.PDF %DST%%%~nA.TIF
DEL %DST%%%~nA.TIF
)
POPD
set PATH=%oldpath%
That worked great in 2008. For some reason, it is not working today. You must do some setup before using this file. You need a folder called c:\Test\SRC and one called c:\Test\DST. I have downloaded tiffcp, tiff2pdf and the necessary .dll files and installed them. I put the .dll files into a c:\bin folder along with tiffcp and tiff2pdf.

When I run the bat file from the command prompt, I get a long series of error messages (I did in 2008, too). When the command prompt comes back there is a list of PDF files, properly numbered, in the DST folder. They are all 1 KB in size and they give an error when I try to open them. Then I get a message that GNUWin32 has crashed. Then I get a message that tiff2pdf has crashed. I suspect there are .dll incompatibilities. I might have them in the wrong place or they might be different now than they were then. Or I might not have all I need.

Is there someone here at BL/ATY who can help me to understand what the batch file does? And can someone help me figure out what isn't working?

Re: Batch file and command line aficionados

Posted: November 2nd, 2012, 3:32 pm
by bpgreen
I'd have to see the errors to figure out what the problem is, but here's a link that might help.

Re: Batch file and command line aficionados

Posted: November 3rd, 2012, 12:16 am
by crabgrass
Sounds like it is scrolling away before you can see the offending error.

What is the exact command to invoke the batch file? We can redirect the output to a text file, so that we can see the exact error.

I would think something like "filename.bat > c:\output.txt" would generate a file c:\output.txt that would have the errors. Then just copy and paste the contents to this thread.
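
One caveat: a plain redirect only captures normal output, and command-line tools usually write their error messages to a separate error stream. A minimal sketch that captures both, typed at the command prompt, assuming the batch file is saved as convert.bat (a made-up name, substitute the real one):

```batch
rem Run this from the folder that holds the batch file.
rem convert.bat is a hypothetical name; use the real file name.
rem The trailing part sends error output, stream 2, to the same file as normal output.
convert.bat > c:\output.txt 2>&1

rem Page through the captured output.
more c:\output.txt
```

If Windows refuses to write to the root of C:, point the redirect at a folder you own instead, such as %TEMP%\output.txt.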

Re: Batch file and command line aficionados

Posted: November 3rd, 2012, 12:57 am
by bpgreen
Another comment is that newer versions of Windows have much better scripting capabilities than DOS batch files. You can do things using WSH (Windows Script Host) that used to be possible only in UNIX.

Do you know VB? WSH lets you write scripts in VBS, which is basically a dialect of VB.

Re: Batch file and command line aficionados

Posted: November 3rd, 2012, 10:20 am
by andy10917
David, I sent you a PM...

Re: Batch file and command line aficionados

Posted: November 3rd, 2012, 10:09 pm
by Dchall_San_Antonio
Before you can convert a tiff file to pdf, your file has to have the tiff extension. Mine are starting out life with .001, .002, etc., extensions.
Before I even want to convert a tiff to pdf, I want all the pages concatenated, in order, into one tiff file with one distinct file name for the appropriate pages. The longest document I have seen recently is about 30 pages. The longest one I've ever seen was well over 500 pages. It would be interesting to run this algorithm on a really big document.

Google was my friend back in 2008 when I found tiff2pdf.exe and used it in that batch file.

The command I use to invoke the batch file is to double click on the batch file with the mouse pointer. If I want to see all the comments, I go to the command line, enter the directory where the batch file resides, and invoke the name of the file. I still get one error per page but the result is a successfully concatenated pdf file with the proper name.

Back in 2008 there were several different approaches tried using various languages. The batch file above was the only one that worked.

In breaking news: I tried running the same batch file on another computer and it worked. I am going to try reloading all the tiff software files and dlls from the working computer to the unworking computer and see what happens. It could be there is a corrupt file on the problem computer.

Thank you all for your interest in this.

Re: Batch file and command line aficionados

Posted: November 3rd, 2012, 10:59 pm
by turf_toes
I'm not a DOS guy (I use the unix/OS X terminal), but changing the file extension ought to be a trivial task. In Unix/Mac OS X, it would be something like this

for f in *.tiff; do base=$(basename "$f" .tiff); mv "$f" "$base.tif"; done
 
I'm sure it's possible to write something equally simple in DOS.
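
For what it's worth, a minimal sketch of the Windows equivalent, using the built-in ren command with wildcards. It assumes the files sit in the current folder and really do end in .tiff, as in the Unix example above:

```batch
rem Rename every .tiff file in the current folder to .tif.
ren *.tiff *.tif
```

The numbered pages in this thread are a different case: 12345.001 and 12345.002 cannot both become 12345.tif, so a blanket rename would collide. The batch file earlier in the thread avoids the issue by handing the original names straight to tiffcp.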


Re: Batch file and command line aficionados

Posted: November 3rd, 2012, 11:23 pm
by crabgrass
Good news about the other PC.

What is the "one error per page" (explicitly)?

Reloading the DLLs, etc. certainly might work, but my hunch is a PATH issue...
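
A quick way to test the PATH hunch, sketched under a couple of assumptions: the tools are in c:\bin as described earlier in the thread, and the PC runs Windows Vista or later, where the where command is available. Run these lines at a command prompt on both the working and the non-working PC and compare the results:

```batch
rem Add c:\bin to the end of PATH, the same way the batch file does.
set PATH=%PATH%;c:\bin;

rem List every copy of each tool that can be found, in search order.
rem The first line printed is normally the copy that actually runs.
where tiffcp
where tiff2pdf

rem List the DLLs sitting next to the tools.
dir /B c:\bin\*.dll
```

Because c:\bin is added at the end of PATH, an older copy of a tool somewhere earlier on PATH would be found first, which is one way two PCs with the same c:\bin folder could behave differently.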

Re: Batch file and command line aficionados

Posted: November 7th, 2012, 12:13 pm
by crabgrass
Dchall - sorry for the delay, lots going on. Not sure if this is of any value to you, but I documented your file, step-by-step:

:: Do NOT print out commands, enable Command Extensions, and
:: enable delayed expansion, so that !VAR! is expanded when each
:: command runs rather than when the whole block is parsed.
:: This is needed because PG is built up inside the loop below.
:: This program will only run under the "cmd" shell, not "command";
:: only "cmd" supports command extensions.
@echo off & setlocal EnableExtensions ENableDelayedExpansion

:: Create variable "oldpath" and load it with the current system PATH
set oldpath=%PATH%

:: Append "c:\bin" to system PATH
set PATH=%PATH%;c:\bin;

:: Create "SRC" and "DST" variables
set SRC=c:\Test\SRC\
set DST=c:\Test\DST\

:: Commands are only echoed until %DEB% ist set to nothing
:: Variable in comment above looks like a typo: %DEB% should be %DBG%,
:: presumably a debugger flag
set DBG=ECHO/

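:: Intentionally commented out. Looks like either this or the above
:: command may be used. Uncommenting it would clear DBG so that
:: commands run for real instead of being echoed. Note that nothing
:: below actually uses %DBG%, so neither line has any effect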
::set "DBG="

:: Switch to SRC directory
pushd %SRC%

:: Parse all items, tokenized by blank space delimiters,
:: one file at a time for all files with ".001" extension
:: This can be up to 61 tokens by using the ASCII codes from range:
:: ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^
:: _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z {
for /F "tokens=*" %%A in ('dir /B/A-D/ONE "*.001"') do (
rem Initialize variable "PG" to empty
set "PG="

rem Loop over every page of this document, sorted by name.
rem %%~nA is the name of %%A without its extension, for example 12345,
rem so the pattern "%%~nA.0*" matches 12345.001, 12345.002 and so on.
rem Each page's file name and extension is appended to PG,
rem so PG ends up holding multiple file names, not just one
for /F "tokens=*" %%B in (
'dir /B/A-D/ONE "%%~nA.0*"') do set PG=!PG! %%~nxB

rem Use tiffcp to combine the pages listed in PG into one
rem multi-page, LZW-compressed TIFF in the DST folder
tiffcp -c lzw !PG! %DST%%%~nA.TIF

rem Convert the multi-page TIFF to a PDF
tiff2pdf -o %DST%%%~nA.PDF %DST%%%~nA.TIF

rem Delete the intermediate TIFF
DEL %DST%%%~nA.TIF
)

:: Revert to previous directory
POPD

:: Revert to previous path
set PATH=%oldpath%
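
A note on the "tokens" comments above, since they may read as more complicated than what the script does. As far as I can tell, "tokens=*" does not split the line at all: the whole line goes into the one loop variable, with leading spaces dropped, so the multi-variable range listed in the comment never comes into play here. A small sketch that shows the difference, using made-up sample text rather than real file names:

```batch
@echo off
rem Sketch only: the sample strings below are made up for illustration.

rem tokens=* puts the whole line into the single loop variable.
for /F "tokens=*" %%A in ("12345.001 extra words") do echo [%%A]

rem tokens=1,2 splits on spaces and tabs: first word, then second word.
for /F "tokens=1,2" %%A in ("12345.001 extra words") do echo [%%A] [%%B]

rem delims=. splits on the dot instead: document number, then page number.
for /F "tokens=1,2 delims=." %%A in ("12345.001") do echo doc=%%A page=%%B
```

Saved as a .bat file and run, this should print [12345.001 extra words], then [12345.001] [extra], then doc=12345 page=001.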

Re: Batch file and command line aficionados

Posted: November 20th, 2019, 2:06 pm
by Dchall_San_Antonio
I'm so glad I posted this many years ago. But, now, since I've slept so many times since then, I have a question for Crabgrass. I think I need more explanation about the following commented text.
:: Parse all items, tokenized by blank space delimiters,
:: one file at a time for all files with ".001" extension
:: This can be up to 61 tokens by using the ASCII codes from range:
:: ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^
:: _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z {