Some way to detect a 404

Question

How about wget?

Three examples: one for a page that does not exist, one for an existing page that I am not allowed to download, and one for a page that works.

wget https://askubuntu.com/testfor404

--2014-05-09 22:06:20--  https://askubuntu.com/testfor404
Resolving askubuntu.com (askubuntu.com)... 198.252.206.24
Connecting to askubuntu.com (askubuntu.com)|198.252.206.24|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-05-09 22:06:21 ERROR 404: Not Found.

wget https://askubuntu.com/reputation

--2014-05-09 22:07:11--  https://askubuntu.com/reputation
Resolving askubuntu.com (askubuntu.com)... 198.252.206.24
Connecting to askubuntu.com (askubuntu.com)|198.252.206.24|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2014-05-09 22:07:11 ERROR 403: Forbidden.

wget http://askubuntu.com

--2014-05-09 22:07:36--  https://askubuntu.com/
Resolving askubuntu.com (askubuntu.com)... 198.252.206.24
Connecting to askubuntu.com (askubuntu.com)|198.252.206.24|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 69629 (68K) [text/html]
Saving to: ‘index.html’

100%[======================================>] 69.629       257KB/s   in 0,3s   

2014-05-09 22:07:36 (257 KB/s) - ‘index.html’ saved [69629/69629]

If the output contains "ERROR 404: Not Found", the command can be extended to print "true" or "false".

The --delete-after option deletes index.html after it has been downloaded. The --spider flag checks the page headers/status without downloading the page.
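
The "true"/"false" extension mentioned above could look roughly like the sketch below. The function name is_404 is made up for illustration and is not part of wget. It only searches text for the "ERROR 404: Not Found" line shown in the first transcript, so it assumes wget's messages are in English (hence LC_ALL=C in the usage comment).

```shell
#!/bin/sh
# is_404 (hypothetical helper): reads wget's log output on stdin and prints
# "true" if it contains the "ERROR 404: Not Found" line, otherwise "false".
is_404() {
    if grep -q 'ERROR 404: Not Found'; then
        echo true
    else
        echo false
    fi
}

# Usage sketch (needs network access, so it is left commented out).
# wget writes its log to stderr, so it is redirected to stdout for the pipe;
# LC_ALL=C keeps the messages untranslated so the pattern can match.
#
#   LC_ALL=C wget --delete-after https://askubuntu.com/testfor404 2>&1 | is_404
```

A 403 or any other failure prints "false" here, since only the 404 line is matched.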

Answer 1