grep/awk를 통해 간단한 html 파일에서 값 추출

grep/awk를 통해 간단한 html 파일에서 값 추출

아래 첨부된 HTML 코드에 또 다른 grep/awk/sed 문제가 있습니다.

내부에 테이블이 포함된 간단한 HTML 파일이 제공됩니다. HTML은 스마트 계량기(가정용 에너지 계량기)에 의해 생성됩니다. html 테이블 안에는 나에게 중요한 두 가지 값인 Pplus와 Pminus가 표시됩니다. 이것은 GRID의 실제 전력과 태양광 발전소에서 나오는 실제 전력입니다.

오류 발생 가능성을 줄이기 위해 가능한 한 "안정적/안전한" 방식으로 이 두 값을 별도로 잡고 싶습니다. 내 이해는 html 구조가 결코 변하지 않는다는 것입니다. 값을 찾기 위한 출발점으로 18.000에너지 그리드의 현재 전력이 18W이고 0.00000현재 태양광 발전소(야간)에서 생산되는 0W를 의미합니다.

나로서는 올바른 위치를 잡는 데 도움이 될 수 있는 구조를 찾는 것이 거의 불가능합니다. 전문가의 견해와 이 작업이 가능하다면 정말 감사드립니다.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<HTML>
<HEAD>
<TITLE>NK-FW graphic by Lingg & Janke</TITLE>
<LINK REL="apple-touch-icon" HREF="/NKFW_icon57.png">
<META NAME="description" CONTENT="NK-FW graphic by Lingg & Janke">
<META HTTP-EQUIV="cache-control" CONTENT="no-cache">

<BASE TARGET="_top">

<STYLE TYPE="text/css">

BODY  {margin-left:0; margin-right:0; margin-top:0;}

A     {font-family:Arial; font-size:18px; font-weight:bold; color:#FFFFFF; }
TABLE {font-family:Arial; font-size:18px; font-weight:bold; color:#FFFFFF; }

INPUT {font-family:Arial; font-size:18px; font-weight:bold; color:#000000; }
SELECT{font-family:Arial; font-size:18px; font-weight:bold; color:#000000; }

.inputwidth  {width:190px;}
.keywidth    {width:190px;}

#idLJheadDiv {
               position:absolute;
               width:960px;
               height:50px;
               top:10px;
               left:0px;
               padding:0px;
               margin:0px;
               border:0px;
               background-color:#0074B2;
             }
#idLJheadTd1 {
               font-size:30px;
               font-family:Arial;
               font-weight:bold;
               font-style:normal;
               color:#FFFFFF;

               height:50px;
               text-align:left;
               vertical-align:middle;
             }
#idLJheadTd2 {
               font-size:20px;
               font-family:Arial;
               font-weight:bold;
               font-style:normal;
               color:#FFFFFF;

               height:50px;
               text-align:center;
               vertical-align:middle;
             }
#idLJheadTd3 {
               font-size:30px;
               font-family:Arial;
               font-weight:bold;
               font-style:italic;
               color:#FFFFFF;

               height:50px;
               text-align:right;
               vertical-align:middle;
             }


#idLJfootDiv {
               position:absolute;
               width:960px;
               height:50px;
               top:600px;
               left:0px;
               padding:0px;
               margin:0px;
               border:0px;
               background-color:#0074B2;
             }
#idLJfootTd  {
               font-size:30px;
               font-family:Arial;
               font-weight:bold;
               font-style:italic;
               color:#FFFFFF;

               height:50px;
               text-align:left;
               vertical-align:middle;
             }


#idButtonDiv {
               position:absolute;
               width:228px;
               height:50px;
               padding:0px;
               margin:0px;
               border:0px;
               background-color:#2f2f2f;
             }

#idButtonTd  {
               height:50px;
               vertical-align:middle;
             }

</STYLE>
</HEAD>

<!-- ************************** -->

<BODY SCROLL="auto"
      onResize="DoReposition();"
      onLoad="DoReposition();"
      BGCOLOR="#000000"
      TOPMARGIN=0
      LEFTMARGIN=0
      LINK=#ffffff
      VLINK=#ffffff
      ALINK=#ffffff >

<!-- ************************** -->

<SCRIPT LANGUAGE="JavaScript">
<!--
function HOffset()
{
   var window_width = window.innerWidth ? window.innerWidth : (document.body.clientWidth ? document.body.clientWidth : 0);
   return Math.max( 0, Math.floor( (window_width - 960) / 2 ) - 0 ).toString();
}

function VOffset()
{
   var window_height = window.innerHeight ? window.innerHeight : (document.body.clientHeight ? document.body.clientHeight : 0);
   return 0;
}

document.write( "<DIV ID='IDAlignPage' style='position:absolute; top:" + VOffset() + "px; left:" + HOffset() + "px;'>&nbsp;" );

function DoReposition()
 {var o='IDAlignPage';
   if(is_dom2&&document.getElementById(o))
    {var e=document.getElementById(o);e.style.left=HOffset()+'px';e.style.top=VOffset()+'px';}
   else if(is_ie&&is_major>=4&&eval('document.all.'+o))
    {var e=eval('document.all.'+o);e.style.left=HOffset()+'px';e.style.top=VOffset()+'px';}
   else if(is_nav&&is_major>=4&&eval('document.'+o))
    {var e=eval('document.'+o);e.left=HOffset();e.top=VOffset();}
 }

window.onresize=DoReposition;
window.onload=DoReposition;

//-->
</SCRIPT>








<!-- ************************** -->
<!-- *** upper + lower bar  *** -->

<DIV ID="idLJheadDiv">
  <TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0" WIDTH=100%><TR>
    <TD ID="idLJheadTd1">&nbsp;&nbsp;Smart Metering</TD>
    <TD ID="idLJheadTd2">
      SA
      &nbsp;&nbsp;
      21.06.2014
      &nbsp;&nbsp;
      21:57:02
      &nbsp;&nbsp;
      KW25
    </TD>
    <TD ID="idLJheadTd3">Lingg &amp; Janke&nbsp;&nbsp;</TD>
  </TR></TABLE>
</DIV>

<DIV ID="idLJfootDiv">
  <TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0" WIDTH=100%><TR>
    <TD ID="idLJfootTd">&nbsp;&nbsp;Energy Analyzer</TD>
  </TR></TABLE>
</DIV>


<!-- ************************** -->
<!-- *** 1. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:78px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
P&#043; in Watt

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:78px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:78px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:78px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 2. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:143px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
18.000

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:143px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g1.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="MykWh">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:143px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g2.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="Supply">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:143px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 3. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:208px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
P- in Watt

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:208px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g3.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 3">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:208px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g4.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 4">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:208px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 4. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:273px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
0.00000

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:273px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g5.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 5">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:273px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g6.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 6">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:273px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
P&#043;

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 5. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:338px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:338px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g7.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 7">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:338px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
<form action="g8.htm" method="GET">
<input type="submit" class="keywidth" name="A" value="G 8">
</form>
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:338px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 6. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:403px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:403px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:403px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:403px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 7. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:468px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:468px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:468px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:468px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- ************************** -->
<!-- *** 8. row *************** -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:533px; left:0px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

<form action="/index.htm" method="GET">
<input type="submit" class="keywidth" value="ZURCK" name="A">
</form>

</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" STYLE="top:533px; left:244px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:533px; left:488px;" ALIGN="CENTER">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">
</TD></TR></TABLE>
</DIV>
<!-- -->

<!-- -->
<DIV ID="idButtonDiv" style="top:533px; left:732px;" ALIGN="CENTER" >
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"><TR><TD ID="idButtonTd">

<form action="/mainset/mainset.htm" method="GET">
<input type="submit" class="keywidth" value="EINRICHTEN" name="A">
</form>

</TD></TR></TABLE>
</DIV>
<!-- -->


<!-- ************************** -->

</BODY>
</HTML>

위에서 언급한 명령을 시도했지만 작동하지 않는 것 같습니다.

Python에 대한 아래 설명은 어떻게 작동할까요? 누군가 제게 도움을 주실 수 있나요? 선호하는 바는 없지만 모든 면(속도, 효율성 등)에서 최첨단으로 보이는 최상의 솔루션을 선호합니다.

게시된 HTML은 변경되지 않으며, 물론 값만 10초마다 업데이트됩니다.

답변1

> awk '/ID="idButtonTd"/ {printline=1; next;}; 
   printline==1 && /^[0-9]+\.[0-9]+$/ { print $0; }; { printline=0; }' file
18.000
0.00000

답변2

html 구조가 실제로 일정하다면 다음이 작동합니다.

totalValues=$(grep -A1 "idButtonTd" yourfile | grep -v "idButtonTd" | grep -v "\-\-" | grep "^[0-9][0-9]*")
Pplus=$(echo $totalValues | awk '{ print $1 }')
Pminus=$(echo $totalValues | awk '{ print $2 }')
echo "Pplus = $Pplus"
echo "Pminus = $Pminus"

관련 정보