Wouldn't it be easier to fix one of your scripts so that it handles file and directory names containing spaces, and then run it in your shell? That said, constructing the actual search-and-replace regexp is going to be a challenge if the script you want to strip is slightly different on each page, as you mentioned.

(Unless you want to remove all JavaScript, in which case it's trivial.)
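
For what it's worth, here is a rough sketch of the "remove all JavaScript" case, in Python rather than shell since that sidesteps the spaces-in-filenames problem entirely. The function names and the `*.html` pattern are my own assumptions, not anything from your setup, and the regexp only removes whole `<script>...</script>` elements (it won't touch inline handlers like `onclick`). Try it on a copy first.

```python
import re
from pathlib import Path

# Matches a whole <script ...>...</script> element, across lines and in any case.
# Works on bytes so the files' encoding does not matter.
SCRIPT_RE = re.compile(rb"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(html: bytes) -> bytes:
    """Return html with every <script> element removed."""
    return SCRIPT_RE.sub(b"", html)


def strip_scripts_in_tree(root: str) -> int:
    """Rewrite every *.html file under root in place; return how many changed."""
    changed = 0
    # Path objects carry the name as a single value, so spaces need no quoting.
    for path in Path(root).rglob("*.html"):
        if not path.is_file():
            continue
        original = path.read_bytes()
        cleaned = strip_scripts(original)
        if cleaned != original:
            path.write_bytes(cleaned)
            changed += 1
    return changed
```

Calling `strip_scripts_in_tree("path/to/your mirror")` (a made-up path) would walk the whole tree. If you only want to drop one particular script, you would have to tighten the pattern to match something specific to it, which is where the per-page differences make things hard.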
Also, I'm wondering why you're downloading the whole 7 gigs (including pictures) when you could download a tar containing only the HTML files...
Entreri.