Find and replace or delete an HTML tag using sed on Linux

Question

Assuming that the document fragment is part of a well-formed XHTML file, you can delete all <li> nodes that contain an <a> node with an href attribute whose value is exactly https://forward.global.ssl.fastly.net/contributoragreements/ using the following xmlstarlet command:

xmlstarlet ed --delete '//li[a/@href = "https://forward.global.ssl.fastly.net/contributoragreements/"]' file.xhtml

If the document is not a well-formed XHTML document, you can try to recover it first:

xmlstarlet fo --recover --html file.html |
xmlstarlet ed --delete '//li[a/@href = "https://forward.global.ssl.fastly.net/contributoragreements/"]'
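
Before deleting anything, it may help to check how many nodes the expression actually matches. The following is a minimal sketch, not part of the original answer: count_matching_li is a hypothetical helper name, and it assumes the same recovery step and XPath expression as the pipeline above, using xmlstarlet sel with the XPath count() function.

```shell
# Hypothetical helper (name chosen for illustration): print how many <li>
# nodes in the given HTML file would be deleted by the XPath expression.
count_matching_li() {
    # Recover the HTML into well-formed XML first; parser warnings go to stderr
    xmlstarlet fo --recover --html "$1" 2>/dev/null |
    # Evaluate count() over the same node set that "ed --delete" would remove
    xmlstarlet sel -t -v 'count(//li[a/@href = "https://forward.global.ssl.fastly.net/contributoragreements/"])'
}

# Usage (assumes file.html exists in the current directory):
#   count_matching_li file.html
```

A count of 0 would suggest that the href value or the document structure differs from what the expression expects.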

To run this on all index.html files in a directory structure rooted at top-dir, call xmlstarlet from find like so:

find top-dir -type f -name index.html -exec sh -c '
    tmpfile=$(mktemp)
    for pathname do
        cp "$pathname" "$tmpfile"
        xmlstarlet fo --recover --html "$tmpfile" |
        xmlstarlet ed --delete "//li[a/@href = \"https://forward.global.ssl.fastly.net/contributoragreements/\"]" >"$pathname.new"
    done
    rm -f "$tmpfile"' sh {} +

The above would create a new file called index.html.new for each index.html file found. You should look at these files and decide whether they look OK before running the command again with .new removed, so that the output overwrites the original files.
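
One possible way to inspect the generated files is to diff each one against its original. This is only a sketch under assumptions: review_new_files is a hypothetical helper name, and it expects every index.html.new to sit next to the index.html it was produced from, as the find command above arranges.

```shell
# Hypothetical helper (name chosen for illustration): print a unified diff
# between every index.html.new under the given directory and its original.
review_new_files() {
    find "$1" -type f -name 'index.html.new' -exec sh -c '
        for newfile do
            # Strip the .new suffix to get the pathname of the original file
            orig=${newfile%.new}
            # diff exits with status 1 when the files differ; that is expected here
            diff -u "$orig" "$newfile" || true
        done' sh {} +
}

# Usage (top-dir is the same root directory as in the find command above):
#   review_new_files top-dir | less
```

Since the recovery step reformats the whole document, the diffs may contain more changes than just the removed list items.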

Obviously, you should run this on a copy of your backed-up data while testing.

Answer 1