A completely useless but fun-to-write program for checking webpage existence
http://writequit.org/blog/2007/12/14/a-completely-useless-but-fun-to-write-program-for-checking-webpage-existence/
Fri, 14 Dec 2007 23:09:02 +0000

Code:

#!/usr/bin/env ruby

# Shuffle the array in place (Fisher-Yates)
def fisher_yates_shuffle(a)
  (a.size - 1).downto(1) do |i|
    j = rand(i + 1)
    a[i], a[j] = a[j], a[i] if i != j
  end
end

abort("usage: #{$0} http://example.com") if ARGV[0].nil?

words = File.readlines('/usr/share/dict/words').map { |w| w.chomp }
fisher_yates_shuffle(words)

words.each do |word|
  puts "trying #{word}..."
  # Array form of system avoids shell interpolation of the word
  %w[html htm php].each do |ext|
    system('wget', '-q', "#{ARGV[0]}/#{word}.#{ext}")
  end
  sleep(1)
end
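As an aside, Ruby 1.8.7 and later ship Array#shuffle, which does the same job as the hand-rolled Fisher-Yates above; a minimal sketch:

```ruby
# Array#shuffle performs an unbiased (Fisher-Yates-style) shuffle in one call.
words = %w[alpha bravo charlie delta echo]
shuffled = words.shuffle

# A shuffle is just a permutation: same elements, (probably) different order.
puts shuffled.sort == words.sort  # true
```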

(The “sleep(1)” is there so you don’t hammer the server with traffic; remove it if you like.)

Should be pretty self-explanatory: go through all the words in /usr/share/dict/words and try to fetch a webpage for each one. At my current dictionary size, it would take about 65 hours to complete (at 1 second per word) :D
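The 65-hour figure is easy to sanity-check; assuming a dictionary of roughly 234,000 words (the exact count varies by system) at one second each:

```ruby
# Hypothetical word count -- the real number depends on your /usr/share/dict/words.
word_count = 234_000
seconds_per_word = 1  # the sleep(1) dominates each iteration

hours = word_count * seconds_per_word / 3600.0
puts hours.round  # 65
```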

A smart person would replace “/usr/share/dict/words” in the script with a better list of likely website page names, if they actually wanted to use this :)
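One illustrative heuristic for building such a list (my assumption, not anything from the original script): real page names tend to be short and lowercase, so the dictionary could be thinned before trying it:

```ruby
# Keep only short, all-lowercase words, which are likelier page names.
# The sample words here are hypothetical stand-ins for dictionary entries.
words = %w[index About rurigenous faq Contact mastochondroma blog]
likely = words.select { |w| w == w.downcase && w.length <= 8 }
puts likely.inspect  # ["index", "faq", "blog"]
```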

You know, I’ve always wondered if servers had a “rurigenous.html” or a “mastochondroma.php” webpage on their site…
