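I'm writing a script that separates the registrar information from a domain's whois output. It works well enough so far, but I'd like to strip out a few things to make it cleaner. It works for most domains. Here is my code: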
#!/bin/bash
reg=$(whois "stackoverflow.com" | egrep -i 'Registrar|Sponsoring Registrar|Registrant|!internic')
printf "Below is my best attempt at finding the Registrar info:\n"
printf "$reg\n"
Here is its output:
Below is my best attempt at finding the Registrar info:
with many different competing registrars. Go to http://www.internic.net
Registrar: NAME.COM, INC.
Sponsoring Registrar IANA ID: 625
registrar's sponsorship of the domain name registration in the registry is
date of the domain name registrant's agreement with the sponsoring
registrar. Users may consult the sponsoring registrar's Whois database to
view the registrar's reported date of expiration for this registration.
Registrars.
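I added some pseudocode to the grep to try to exclude the string "internic", so that the first line gets cut. I'd also like to find a way to remove the secondary "registrar's sponsorship..." lines and the like.
Is it possible to detect a string and exclude the line that contains it? Thanks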
Answer 1
Another option is to be more specific about what you're looking for. For example:
whois stackoverflow.com | grep -E '^[[:space:]]*(Registr(ar|ant|y)|Sponsoring).*: '
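This extracts only the lines that begin with optional whitespace followed by "Registrar", "Registrant", "Registry", or "Sponsoring", followed by any number (zero or more) of any character, followed by a colon and a space.
(By the way, this uses grep -E rather than the obsolete and deprecated egrep. They do the same thing.)
Output: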
Registrar: NAME.COM, INC.
Sponsoring Registrar IANA ID: 625
Registry Domain ID: 108907621_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.name.com
Registrar URL: http://www.name.com
Registrar Registration Expiration Date: 2016-12-26T19:18:07Z
Registrar: Name.com, Inc.
Registrar IANA ID: 625
Registry Registrant ID:
Registrant Name: Sysadmin Team
Registrant Organization: Stack Exchange, Inc.
Registrant Street: 110 William St , Floor 28
Registrant City: New York
Registrant State/Province: NY
Registrant Postal Code: 10038
Registrant Country: US
Registrant Phone: +1.2122328280
Registrant Email: [email protected]
Registry Admin ID:
Registry Tech ID:
Registrar Abuse Contact Email: [email protected]
Registrar Abuse Contact Phone: +1.1 7203101849
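To see why the anchored pattern drops the legal boilerplate, here is a minimal sketch that runs the same grep against a handful of lines copied from the outputs above, so no network access is needed. The variable names sample and matches are only for illustration.

```shell
# A few lines taken from the whois output shown in this thread.
sample='with many different competing registrars. Go to http://www.internic.net
   Registrar: NAME.COM, INC.
   Sponsoring Registrar IANA ID: 625
Registrant Name: Sysadmin Team
Registrars.'

# Same pattern as in the answer: optional leading whitespace, one of the
# keywords, then anything, then a colon followed by a space.
matches=$(printf '%s\n' "$sample" |
    grep -E '^[[:space:]]*(Registr(ar|ant|y)|Sponsoring).*: ')

printf '%s\n' "$matches"
```

The prose line and the bare "Registrars." line are dropped because neither starts with a keyword followed later by a colon and a space.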
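By the way, when testing any kind of text processing (including regular expressions) on text that comes from a slow source (such as a database query, or a remote source like a whois or HTTP server), run the slow command once, redirect its output to a file, and test against that file. Once you have what you want, make sure it works the same way on fresh data piped in directly.
For example: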
whois stackoverflow.com > so.txt
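The whole workflow might look roughly like the sketch below. It assumes a stand-in function, slow_source, in place of the real whois call so that it runs offline; the function, the temporary file, and the variable names are hypothetical and not part of the original answer.

```shell
# Stand-in for the slow command. In real use this would be:
#   whois stackoverflow.com
slow_source() {
    printf '%s\n' \
        '   Registrar: NAME.COM, INC.' \
        'Registrars.' \
        'Registrant Name: Sysadmin Team'
}

cache=$(mktemp)            # temporary file playing the role of so.txt
slow_source > "$cache"     # run the slow command only once

pattern='^[[:space:]]*(Registr(ar|ant|y)|Sponsoring).*: '

# Iterate on the pattern against the cached copy ...
from_file=$(grep -E "$pattern" "$cache")

# ... then confirm that the direct pipeline gives the same result.
from_pipe=$(slow_source | grep -E "$pattern")

if [ "$from_file" = "$from_pipe" ]; then
    result=same
else
    result=different
fi
echo "$result"

rm -f "$cache"
```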
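Other useful things to do with the output of whois:
Extract the domain block at the start of the whois output (the lines in that block start with 4 spaces and have a field name ending in a colon):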
grep -Ei '^[[:blank:]]+.*:[[:blank:]]' so.txt
Output:
Domain Name: STACKOVERFLOW.COM
Registrar: NAME.COM, INC.
Sponsoring Registrar IANA ID: 625
Whois Server: whois.name.com
Referral URL: http://www.name.com
Name Server: CF-DNS01.STACKOVERFLOW.COM
Name Server: CF-DNS02.STACKOVERFLOW.COM
Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
Updated Date: 26-nov-2015
Creation Date: 26-dec-2003
Expiration Date: 26-dec-2016
Extract the registrant block, which starts with the "Domain Name" field and ends with the "Registrar Abuse Contact Phone" field:
sed -n -e '/^Domain Name:/,/^Registrar Abuse Contact Phone:/p' so.txt
Both of the above together:
sed -n -E -e '/^Domain Name:/,/^Registrar Abuse Contact Phone:/p' -e '/^[[:blank:]]+.*:[[:blank:]]/p' so.txt
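As a rough illustration of combining a range address with a second pattern in a single sed call, here is a sketch that runs on a small hand-written sample instead of real whois output. The sample is only loosely based on the output above (the value on the unindented "Domain Name:" line is an assumption), and -E is used so that + acts as a repetition operator.

```shell
demo=$(mktemp)   # hypothetical stand-in for so.txt

# Hand-written sample: an indented domain block, some prose,
# then an unindented registrant block.
cat > "$demo" <<'EOF'
with many different competing registrars. Go to http://www.internic.net
    Domain Name: STACKOVERFLOW.COM
    Registrar: NAME.COM, INC.
Registrars.
Domain Name: stackoverflow.com
Registrant Name: Sysadmin Team
Registrar Abuse Contact Phone: +1.1 7203101849
EOF

# First -e: print the range from "Domain Name:" to the abuse phone line.
# Second -e: print indented "field: value" lines.
block=$(sed -n -E \
    -e '/^Domain Name:/,/^Registrar Abuse Contact Phone:/p' \
    -e '/^[[:blank:]]+.*:[[:blank:]]/p' "$demo")

printf '%s\n' "$block"
rm -f "$demo"
```

Lines are printed in input order, so the indented domain block comes out first, followed by the registrant block, and the prose lines are skipped.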
The output of all of the above can easily be processed further with awk or any other text-processing tool that can use a colon (:) character as a field separator.
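For instance, a minimal awk sketch for pulling single values out of such "field: value" lines could look like this. It assumes a colon followed by a space as the separator and uses a few unindented lines copied from the output above; the variable names are illustrative.

```shell
fields='Registrar: Name.com, Inc.
Registrar IANA ID: 625
Registrant Name: Sysadmin Team
Registrant Country: US'

# -F': ' splits each line at ": ", so $1 is the field name and $2 the value.
# A value that itself contains ": " would be split further; this sketch
# does not handle that case.
registrar=$(printf '%s\n' "$fields" |
    awk -F': ' '$1 == "Registrar" { print $2 }')
country=$(printf '%s\n' "$fields" |
    awk -F': ' '$1 == "Registrant Country" { print $2 }')

printf 'registrar=%s country=%s\n' "$registrar" "$country"
```

Comparing $1 exactly, rather than with a regex, keeps "Registrar" from also matching "Registrar IANA ID".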
Answer 2
Use the -v flag:
reg=$(whois stackoverflow.com | grep -Ei 'Registrar|Sponsoring Registrar|Registrant' | grep -v internic)
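A small offline sketch of what the -v filter does, using lines copied from the question's output (the variable names lines and kept are only for illustration, and the keyword list is shortened since "Sponsoring Registrar" is already covered by "Registrar"):

```shell
lines='with many different competing registrars. Go to http://www.internic.net
   Registrar: NAME.COM, INC.
   Sponsoring Registrar IANA ID: 625
Registrars.'

# The first grep keeps lines that mention the keywords (case-insensitive);
# the second grep, with -v, drops any of those that contain "internic".
kept=$(printf '%s\n' "$lines" |
    grep -Ei 'Registrar|Registrant' |
    grep -v internic)

printf '%s\n' "$kept"
```

Note that -v removes only the internic line here; other prose lines such as "Registrars." still get through, which is where a more specific pattern helps.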