Download .html and .cgi pages?
Question: I would like to save copies of threads from a forum that has .html and .cgi pages. I could use the wget utility, and while I am aware there is a Windows version, I reviewed the 12 steps needed to set it up and don't consider it "user friendly". Nor do I want to install Linux and use wget from a Linux box. In the past I have used freeware programs; they were cumbersome and didn't target the files I asked for. The threads are in .html and .cgi (Perl 5), and I do not have access to a web server, so I cannot use PHP/cURL functions such as fopen(), fgets(), file(), http_get() and so forth. I tried to create a .bat and a .cmd file with those functions and it just flashed a cmd screen. Because these threads are a flat file database, I'd like to use a kiddie script without going so far as to compile and later install a program. I have been searching for answers for weeks and I am far more confused than when I started. That's odd, because with the freeware program I was able to download .cgi files, though not always the files I wanted. *sigh*

Colanth & Schimdty, any chance either of you can answer the question? I don't want to use a freeware program to download the WEB PAGES because I inadvertently downloaded system files, private messages, and log files with MD5 hashes that can be converted to plain-text passwords. I tried to use a screen scraper to pull the HTML, but on the .cgi pages it again downloaded (saved) sensitive information. If you cannot provide possible solutions, why bother responding? I want to target specific pages in order to bypass file structures that link to system files.

Martinth, rather aggressive, with no actual constructive suggestions. I have been searching for weeks. An earlier half of this .html forum with a flat file database has been put up on the Internet Archive. The HTML tags on the thread pages are the same, but those pages sit on a UBB.classic forum with .cgi extensions, which explains the need for Perl. UBB won't help with converters unless you upgrade to UBB.threads. I noted I could use a Windows version of wget in 12 not-so-simple steps: http://www.christopherlewis.com/WGet/WGetFiles.htm And here is a somewhat plausible VBScript for the task: http://www.example-code.com/vbscript/mht_extractHtmlObjects1.asp There is also the option of looking into a converter that will take UBB.classic threads and convert them to phpBB. I have asked on numerous forums over the last few weeks and no successful converters have surfaced. So how about some viable options? Any clearer now?
Best Answers: Download .html and .cgi pages?
Unfortunately you can't get hold of the .cgi files themselves, for security reasons (unless the owner of the site or someone working there gives you FTP access). This is so that people can't look at the code and figure out how to hack the website. There might be a few hacker tools available, but they are usually very tedious to use, simply because they are trying to perform an action that the web server does not want them to perform. Not that I would ever consider hacking, cough cough. It's illegal.
Changing an .html file to "Open" with Notepad will prevent your browser from being invoked. You want to change the "Edit" operation. Assuming you're using Windows XP:
1. In Windows Explorer (My Computer), click the Tools->Folder Options menu item.
2. Click the File Types tab.
3. Scroll down to the HTML file extension and select it.
4. Click the Advanced button.
5. Select Edit in the Actions list box and click the Edit button.
6. In the Application Used to Perform Action text box, substitute the path to Notepad. Make sure the entire path is enclosed in double quotes.
7. You might want to turn off the Use DDE checkbox.
8. OK out of everything.
Hope that helps.
Don't complain about our answers; your "question" is rambling and vague! You spent WEEKS already? Why don't you just use your browser and save the pages manually? That wouldn't have taken WEEKS!?! The fact that you don't know how to solve the "flashing cmd screen" indicates to me that you are in WAY over your head. "Log files" and "system files" have NOTHING to do with html pages. Please clarify EXACTLY what you want to do. (A URL for us to look at would help IMMENSELY)
Single space 12 pt: two typed to one printed. Double space 12 pt: one typed to one printed. In general. Just so you know, publishers and editors think in word count, not page numbers. When they get a manuscript, they take the word count, and, by knowing the number of words per page for the different layouts they have available (hardcover, large soft cover, supermarket paperback) they figure how many pages the final book will be. Doing this will average out pages with lots of dialog and therefore lots of white space. The shortest novels you'll see on the shelf are about 40,000 words, or just over 150 pages. This is for the Young Adult market. Adult books are usually 60,000 to 100,000, but of course some are very much longer (but usually not over 200,000 words - about the size of Harry Potter and the Deathly Hallows). More info than you need at this point, but the more you know about how publishing works the better off you'll be. Have fun writing!
You can save the html. You can't "save" the cgi without ftp access to the site, because all you see of it is the html it creates. And you can't "see" the database without ftp access either.
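Since what the server returns for a .cgi URL is just the HTML it generates, one way to save only the threads you name (and nothing they link to) is a small script that fetches an explicit list of URLs and follows no links. The following is a minimal sketch in Python, not a tested tool for this particular forum: the function names `filename_for`, `fetch` and `save_pages` are made up for illustration, and it assumes the pages are publicly readable without logging in.

```python
import re
import urllib.request
from pathlib import Path


def filename_for(url):
    """Turn a URL (including any query string) into a safe local .html filename."""
    # Drop the scheme, then replace every run of unsafe characters with "_".
    name = re.sub(r"^[a-z]+://", "", url, flags=re.IGNORECASE)
    name = re.sub(r"[^A-Za-z0-9.-]+", "_", name).strip("_")
    return name if name.endswith(".html") else name + ".html"


def fetch(url):
    """Return the bytes the server sends for url: the generated HTML, never the script."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def save_pages(urls, out_dir, fetcher=fetch):
    """Save only the listed URLs into out_dir; links inside the pages are not followed."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = out / filename_for(url)
        target.write_bytes(fetcher(url))
        saved.append(target)
    return saved
```

To use it, call something like `save_pages(["http://forum.example.com/cgi-bin/ultimatebb.cgi?ubb=get_topic;f=1;t=000123"], "saved_threads")`, where the URL is a placeholder to be replaced with the real thread addresses copied from the browser. Because the list is explicit, the script cannot wander into log files or private messages the way a recursive downloader can.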
Microsoft applications carry meta data when you copy and paste links (or table cells) - if you are looking to put the link in a Microsoft Word document, Outlook e-mail message, et cetera, you should be able to copy it from its source in Internet Explorer. Non-Microsoft applications generally do not support this functionality. (Could be considered a good thing or a bad thing, depending upon what you intend to do and how you feel about large amounts of meta data becoming embedded in your documents)