Detecting Wget Connections
Posted: Thu Sep 06, 2007 12:34 am
by falcon2424
For one of my coding projects, I'm looking at how content changes on blogs and am using Wget to download some pages.
Obviously, some server admins aren't terribly happy about bots spidering their pages and have set up blocks. For example, I apparently tripped one over at Slashdot and now have an IP ban against my shell account's server. While that's not a particularly big deal, it does tell me that I've got at least one error in my code, and I'm trying to work out whether there are any detection methods I haven't guarded against.
First of all, I've changed my user-agent string to claim Mozilla on a Windows box. I've also set my recursion depth to 1, with no-clobber, and am only going after either main pages or section pages. This means that I shouldn't be, for example, loading a ton of user profiles or other rarely visited pages.
The next thing I've changed is that I've set my wait time to be a random value between 0 and 6 seconds (not having done this earlier was an oversight and might have been how the /. admins noticed my bot's activities).
Is there anything else that I'm forgetting that might let a server admin differentiate my scripts from a normal user?
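For reference, here is a rough sketch of the settings described above, written as a small Python helper that only assembles the wget argument list and picks the random pause. The user-agent string, the function name and the example URL are placeholders I made up for illustration, not the values actually used.

```python
import random

# Hypothetical browser-like identifier; substitute whatever string you actually send.
USER_AGENT = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US)"


def build_wget_command(url, max_wait=6.0, rng=random):
    """Return (argv, delay): a wget argument list plus a random pre-request pause."""
    delay = rng.uniform(0, max_wait)  # random pause in seconds, between 0 and max_wait
    argv = [
        "wget",
        "--user-agent=" + USER_AGENT,  # present a browser-style user agent
        "--recursive",
        "--level=1",                   # follow links one level deep only
        "--no-clobber",                # skip files that already exist locally
        url,
    ]
    return argv, delay


if __name__ == "__main__":
    argv, delay = build_wget_command("http://example.com/")
    print("sleep %.1fs, then run: %s" % (delay, " ".join(argv)))
```

The helper deliberately does not run anything; the list could be handed to subprocess.run after sleeping for the chosen delay. Wget also has its own --wait=SECONDS and --random-wait options, which vary the pause between 0.5 and 1.5 times the wait value, so that may be a simpler route than scripting the delay.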
Re: Detecting Wget Connections
Posted: Thu Sep 06, 2007 6:54 pm
by jack krauser
sorry dude i don't know your programme

Re: Detecting Wget Connections
Posted: Fri Sep 07, 2007 1:20 am
by Captain Segfault
jack krauser wrote: sorry dude i don't know your programme

You don't know wget? What kind of hacker are you? Any real hacker who hasn't been asleep for the past decade has at least heard of it.
You should check it out.
Posted: Fri Sep 07, 2007 3:45 am
by Hacksign
wget ...
emmmmmmm
it is of no use under windows
but useful under linux/unix~~
Posted: Fri Sep 07, 2007 3:48 am
by falcon2424
Hacksign wrote: wget ...
emmmmmmm
it is of no use under windows
but useful under linux/unix~~
Why isn't it useful in Windows? You can certainly download and run a Windows version.
Posted: Fri Sep 07, 2007 3:55 am
by Hacksign
But who would do this?
So many people download files from the internet just by clicking a link, and the download begins. Where I live, they use a tool called 'Thunder'.
No one uses wget~
Posted: Fri Sep 07, 2007 4:20 am
by falcon2424
Hacksign wrote: But who would do this?
So many people download files from the internet just by clicking a link, and the download begins. Where I live, they use a tool called 'Thunder'.
No one uses wget~
You can do the same thing in Linux: when I want a single file, I might just download it from Firefox.
Wget is used for downloading all of a website. For example, the command 'wget -r -l 0 hacker.org' will download hacker.org and, with no depth limit, every file it links to on the hacker.org server.
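To show what the depth setting changes, here is a toy Python sketch of a breadth-first walk over a link graph, treating a depth of 0 as "no limit" the way '-l 0' does. The site layout and the function name are invented for the example, and this is a simplified model of the idea rather than wget's actual code.

```python
from collections import deque


def crawl_order(links, start, max_depth):
    """Breadth-first walk of a link graph; max_depth 0 means no limit."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if max_depth and depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


# Hypothetical site: the front page links to two sections, one of which links deeper.
SITE = {"/": ["/news", "/about"], "/news": ["/news/1"], "/about": [], "/news/1": []}

print(crawl_order(SITE, "/", 1))  # front page plus the pages it links to directly
print(crawl_order(SITE, "/", 0))  # everything reachable from the front page
```

With a depth of 1 the walk stops at the two section pages; with 0 it also reaches '/news/1'.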
Also, you can use wget when you're downloading a really big file, because wget can resume downloads that were disconnected. So, when I'm downloading my copy of an Ubuntu DVD and my internet connection dies 2GB in, I don't have to re-download the whole file.
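In wget the resume behaviour is the -c (--continue) option. Over HTTP, resuming works by asking the server for only the bytes that are still missing, via a Range header. Here is a minimal Python sketch of that idea; the URL and the function name are made up, the partial file is a stand-in, and nothing is actually sent over the network.

```python
import os
import tempfile
import urllib.request


def resume_request(url, partial_path):
    """Build a request for the bytes that are still missing from partial_path."""
    have = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if have:
        # Ask the server to start at byte offset `have`; it must support ranges.
        request.add_header("Range", "bytes=%d-" % have)
    return request, have


# Demonstration with a stand-in partial file; no network traffic happens here.
with tempfile.NamedTemporaryFile(delete=False) as partial:
    partial.write(b"x" * 2048)
request, have = resume_request("http://example.com/ubuntu.iso", partial.name)
print(have, request.get_header("Range"))
os.remove(partial.name)
```

A real resume would then open the request and append the response body to the partial file, after checking that the server answered with 206 Partial Content rather than sending the whole file again.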
Posted: Fri Sep 07, 2007 8:06 am
by Hacksign
yeah, but when I download a single file I usually download it from Firefox
same as you

Re: Detecting Wget Connections
Posted: Fri Sep 07, 2007 8:31 am
by jack krauser
Captain Segfault wrote: sorry dude i don't know your programme

You don't know wget? What kind of hacker are you? Any real hacker who hasn't been asleep for the past decade has at least heard of it.
You should check it out.
i have windows and i don't use it

if i had linux maybe

Posted: Sun Sep 09, 2007 7:24 pm
by memesmith
of course you can use wget in windows. what with grep, sed, awk and the rest of the gnuwin32 ports it's damned useful
Posted: Sun Sep 09, 2007 8:37 pm
by jack krauser
memesmith wrote:of course you can use wget in windows. what with grep, sed, awk and the rest of the gnuwin32 ports it's damned useful
well i will try but...
