How do I make `wget` use GET to retrieve page requisites?

I have a simple command to fetch a login page and all of its dependencies:

wget --post-data='user=user&password=password' --page-requisites https://…/login

The server log shows the following (abbreviated for obvious reasons):

  1. POST /login 302
  2. GET /account 200
  3. POST /robots.txt 200 (should have been a GET, but it succeeded, so no problem)
  4. POST /favicon.ico 200 (same as above)
  5. POST /[looong PageSpeed URL] 500 (for every CSS, JavaScript, and image file on the page)

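GETting these files works fine, so the URLs are correct, but PageSpeed does not seem to like clients POSTing. How do I make `wget` use GET for everything except the initial request?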

I am using GNU Wget 1.18.


Update: a bug report has been filed.

Answer 1

From `man wget`:

           This example shows how to log in to a server using POST and then proceed to download the desired pages, presumably only accessible to authorized
       users:

               # Log in to the server.  This can be done only once.
               wget --save-cookies cookies.txt \
                    --post-data 'user=foo&password=bar' \
                    http://example.com/auth.php

               # Now grab the page or pages we care about.
               wget --load-cookies cookies.txt \
                    -p http://example.com/interesting/article.php

       If the server is using session cookies to track user authentication, the above will not work because --save-cookies will not save them (and neither
       will browsers) and the cookies.txt file will be empty.  In that case use --keep-session-cookies along with --save-cookies to force saving of session
       cookies.
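Applied to the command in the question, the two-step approach from the man page might look like the sketch below. It assumes the login is tracked with a session cookie, which is why `--keep-session-cookies` is included. The URLs, the `cookies.txt` file name, and the function name `fetch_with_login` are placeholders for illustration and do not come from the original post.

```shell
#!/bin/sh
# Placeholder values (assumptions): substitute the real login URL,
# the page you want after logging in, and your own credentials.
LOGIN_URL='https://example.com/login'
PAGE_URL='https://example.com/account'
COOKIES='cookies.txt'

fetch_with_login() {
    # Step 1: send the credentials with POST exactly once and save
    # the cookies, including session cookies, to a file.
    wget --save-cookies "$COOKIES" --keep-session-cookies \
         --post-data 'user=user&password=password' \
         "$LOGIN_URL" || return 1

    # Step 2: no --post-data here, so wget issues ordinary GET requests
    # for the page and every requisite, sending the saved cookies.
    wget --load-cookies "$COOKIES" \
         --page-requisites \
         "$PAGE_URL"
}

# Usage: fetch_with_login
```

Because the second invocation carries no `--post-data`, the requests for robots.txt, favicon.ico, and the CSS, JavaScript, and image files should all go out as GET, which is what the PageSpeed URLs in the question appear to need.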
