Why do df and df -h show different values? How does df -h perform its calculation?

2024-6-10

disk-usage coreutils


How exactly does df -h work? If I run df, I get:

Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/simfs      41943040 7659828  34283212  19% /

If I run df -h, I get:

Filesystem      Size  Used Avail Use% Mounted on
/dev/simfs       40G  7.4G   33G  19% /

The question is how to arrive at the same numbers.

41943040 / 1024 / 1024 = 40

OK, so let's divide the other numbers by 1024 as well.

7659828 / 1024 / 1024 = 7.304981

Maybe by 1000, then?

7659828 / 1000 / 1000 = 7.659828

How did df -h arrive at 7.4G?

34283212 / 1024 / 1024 = 32.695, which is roughly 33G

Since df is open source, I cloned the repository and checked the code. This is what I found:

for (col = 0; col < ncolumns; col++)
    {
      char *cell = NULL;
      char const *header = _(columns[col]->caption);

      if (columns[col]->field == SIZE_FIELD
          && (header_mode == DEFAULT_MODE
              || (header_mode == OUTPUT_MODE
                  && !(human_output_opts & human_autoscale))))
        {
          char buf[LONGEST_HUMAN_READABLE + 1];

          int opts = (human_suppress_point_zero
                      | human_autoscale | human_SI
                      | (human_output_opts
                         & (human_group_digits | human_base_1024 | human_B)));

          /* Prefer the base that makes the human-readable value more exact,
             if there is a difference.  */

          uintmax_t q1000 = output_block_size;
          uintmax_t q1024 = output_block_size;
          bool divisible_by_1000;
          bool divisible_by_1024;

          do
            {
              divisible_by_1000 = q1000 % 1000 == 0;  q1000 /= 1000;
              divisible_by_1024 = q1024 % 1024 == 0;  q1024 /= 1024;
            }
          while (divisible_by_1000 & divisible_by_1024);

          if (divisible_by_1000 < divisible_by_1024)
            opts |= human_base_1024;
          if (divisible_by_1024 < divisible_by_1000)
            opts &= ~human_base_1024;
          if (! (opts & human_base_1024))
            opts |= human_B;

          char *num = human_readable (output_block_size, buf, opts, 1, 1);

          /* Reset the header back to the default in OUTPUT_MODE.  */
          header = _("blocks");

          /* TRANSLATORS: this is the "1K-blocks" header in "df" output.  */
          if (asprintf (&cell, _("%s-%s"), num, header) == -1)
            cell = NULL;
        }
      else if (header_mode == POSIX_MODE && columns[col]->field == SIZE_FIELD)
        {
          char buf[INT_BUFSIZE_BOUND (uintmax_t)];
          char *num = umaxtostr (output_block_size, buf);

          /* TRANSLATORS: this is the "1024-blocks" header in "df -P".  */
          if (asprintf (&cell, _("%s-%s"), num, header) == -1)
            cell = NULL;
        }
      else
        cell = strdup (header);

      if (!cell)
        xalloc_die ();

      hide_problematic_chars (cell);

      table[nrows - 1][col] = cell;

      columns[col]->width = MAX (columns[col]->width, mbswidth (cell, 0));
    }

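I have no experience with this language, but as I understand it, the code checks whether the value in each column is divisible by 1024 or by 1000 and tries to pick whichever is better for rendering the -h output. But I don't get the same values whether I divide by 1000 or by 1024. Why?

I think I know why. It checks divisibility by 1000 or 1024 at each division step.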

          if (divisible_by_1000 < divisible_by_1024)
            opts |= human_base_1024;
          if (divisible_by_1024 < divisible_by_1000)
            opts &= ~human_base_1024;
          if (! (opts & human_base_1024))
            opts |= human_B;

Let's work through 7659828 / 1024 / 1024 = 7.304981, for which -h reported 7.4G.

7659828 / 1024 = 7480.xxx
7659828 / 1000 = 7659.xxx

Since 7659 is greater than 7480, divide by 1024.

It is still a large number, so let's continue:

7659828 / 1024 / 1024 = 7.xxx  (7.3049..)
7659828 / 1024 / 1000 = 7.xxx  (7.4803..)

It takes the division by 1000, which gives 7.48, and I believe somewhere in the code the value is rounded down, on the principle of "better to report less than more": you can store 7.4G of data, but not 7.5G.

The same goes for 33.4G:

34283212 / 1024 / 1000 = 33.47...

So it becomes 33G.

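Answer 1

The code you posted is from the function "get_header", which generates the text of the first line. In your case it applies to the heading "1K-blocks" (call df -B1023 to see the difference).

Important note: "1K" refers to 1024-byte blocks, not 1000-byte blocks (those are indicated by "1kB-blocks"; see df -B1000).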
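To make the header logic concrete, here is a minimal sketch that isolates the divisibility loop from the quoted get_header code. The enum header_base and the function pick_header_base are hypothetical names introduced for illustration only; df itself sets or clears the human_base_1024 bit in an options word rather than returning a value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not part of coreutils. */
enum header_base { BASE_UNCHANGED, BASE_1000, BASE_1024 };

/* Mirrors the loop in the quoted get_header code: strip common factors
   of 1000 and 1024 until at least one of them stops dividing evenly.
   block_size must be nonzero, otherwise the loop never terminates. */
enum header_base pick_header_base(uintmax_t block_size)
{
    uintmax_t q1000 = block_size;
    uintmax_t q1024 = block_size;
    bool divisible_by_1000;
    bool divisible_by_1024;

    do {
        divisible_by_1000 = q1000 % 1000 == 0;
        q1000 /= 1000;
        divisible_by_1024 = q1024 % 1024 == 0;
        q1024 /= 1024;
    } while (divisible_by_1000 && divisible_by_1024);

    if (divisible_by_1024 && !divisible_by_1000)
        return BASE_1024;   /* e.g. 1024: header uses powers of 1024 */
    if (divisible_by_1000 && !divisible_by_1024)
        return BASE_1000;   /* e.g. 1000: header uses powers of 1000 */
    return BASE_UNCHANGED;  /* e.g. 1023: neither base fits better */
}
```

Under these assumptions, pick_header_base(1024) yields BASE_1024 and pick_header_base(1000) yields BASE_1000, while 1023 fits neither base. Note that this choice only affects how the block size in the header is written, not how the values in the columns are scaled.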

The calculation of the numbers in human-readable format is handled by the function "human_readable" (human.c:153). At df.c:1571 you can find the options that are used when df is called with the -h flag:

case 'h':
    human_output_opts = human_autoscale | human_SI | human_base_1024;
    output_block_size = 1;
    break;

All calculations for the human-readable format ("-h") are done in base 1024. In addition to the human_output_opts shown, there is a default setting that applies here (see the enum declaration in human.h):

/* The following three options are mutually exclusive.  */
/* Round to plus infinity (default).  */
human_ceiling = 0,
/* Round to nearest, ties to even.  */
human_round_to_nearest = 1,
/* Round to minus infinity.  */
human_floor = 2,

Since human_output_opts contains neither human_round_to_nearest nor human_floor, the default, human_ceiling, is used. All calculated values are therefore rounded up.
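The "default because the bits are absent" idea can be sketched in a few lines. The three rounding constants are copied from the enum quoted above; the SKETCH_* flag values and the function rounding_mode are placeholders invented for this illustration, and the real flag values should be looked up in human.h.

```c
/* Rounding constants as quoted from human.h above. */
enum { human_ceiling = 0, human_round_to_nearest = 1, human_floor = 2 };

/* Placeholder bits standing in for human_autoscale, human_base_1024 and
   human_SI; assumed values for this sketch only. */
enum { SKETCH_AUTOSCALE = 16, SKETCH_BASE_1024 = 32, SKETCH_SI = 128 };

/* Extract the rounding mode from an options word. If neither rounding
   bit is set, the result is 0, which is human_ceiling. */
int rounding_mode(int opts)
{
    return opts & (human_round_to_nearest | human_floor);
}
```

With an options word built only from the three placeholder flags, as in the 'h' case, rounding_mode returns human_ceiling, because 0 is what remains when no rounding bit was set.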

To verify this, we can try computing the human-readable values from the 1K-blocks reported by df:

Size = ceil(41943040/1024/1024) = ceil(40) = 40
Used = ceil(7659828/1024/1024) = ceil(7.305) = 7.4 (rounded up at the first decimal place, which is kept for values below 10)
Available = ceil(34283212/1024/1024) = ceil(32.695) = 33

This matches the output of df -h.
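The calculation above can be reproduced with integer arithmetic. The following is a minimal sketch, not the gnulib implementation: human_ceil_1024 is a hypothetical helper that assumes base 1024, ceiling rounding, one decimal place for values below 10 and none otherwise. It takes a byte count, so the 1K-block figures from df must be multiplied by 1024 first.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper for illustration; not part of coreutils or gnulib.
   Formats a byte count in base 1024, always rounding up.
   Simplification: a value that rounds up to exactly 1024 of a unit is
   not carried over into the next larger unit. */
void human_ceil_1024(uint64_t bytes, char *buf, size_t buflen)
{
    static const char *const units[] = { "", "K", "M", "G", "T", "P" };
    uint64_t divisor = 1;
    int u = 0;

    /* Pick the largest unit that keeps the value below 1024. */
    while (u < 5 && bytes / divisor >= 1024) {
        divisor *= 1024;
        u++;
    }

    uint64_t whole = bytes / divisor;
    uint64_t rem = bytes % divisor;

    if (u == 0) {                      /* plain bytes, nothing to round */
        snprintf(buf, buflen, "%" PRIu64, whole);
        return;
    }

    if (whole < 10) {
        /* Keep one decimal place, rounding the tenths upward. */
        uint64_t tenths = (rem * 10 + divisor - 1) / divisor;
        if (tenths == 10) {
            whole++;
            tenths = 0;
        }
        if (whole < 10) {
            snprintf(buf, buflen, "%" PRIu64 ".%" PRIu64 "%s",
                     whole, tenths, units[u]);
            return;
        }
        rem = 0;                       /* already rounded up to 10 */
    }

    if (rem != 0)                      /* any remainder rounds up */
        whole++;
    snprintf(buf, buflen, "%" PRIu64 "%s", whole, units[u]);
}
```

Called with 41943040, 7659828 and 34283212 blocks, each multiplied by 1024, the helper produces 40G, 7.4G and 33G, the same figures as the df -h output in the question.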

(... if you prefer the format based on powers of 1000, you can simply call df -H.)

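Answer 2

Neither the FreeBSD df program (which is where df -h originally comes from) nor the Solaris df implementation behaves this way.

Since the Solaris sources are open source, you could check whether you can compile its df on your OS.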
