Merge two binary image files by boolean OR (ddrescue output filename mistake)
I made a silly mistake by using the wrong output filename when resuming a ddrescue. This is what happened:
ddrescue -b 2048 -d -v /dev/sr1 IDTa.img IDTa.ddrescue.log
Then the computer crashed and I mistakenly resumed with:
ddrescue -b 2048 -d -v /dev/sr1 IDTa.iso IDTa.ddrescue.log
I gather that both image files will start off all zeroed, so I guess that if I were to boolean OR both files together then the result would be what ddrescue would have output if I had not made the mistake?
The files are not continuations of one another (as in "How can I merge two ddrescue images?"), since I had already run ddrescue -n previously, which completed successfully. That is, IDTa.img contains most of the data, while IDTa.iso contains scattered blocks from all over the image (and those blocks would be zero in IDTa.img).
Is there a simple CLI way to do this? I could probably do this in C, but I'm very rusty! It might also be a nice first exercise in Python, which I've never got round to learning! Nevertheless, I don't particularly want to reinvent the wheel if something already exists. I'm not too fussed about performance.
Update: (apologies if this is the wrong place to put a reply to an answer. The 'comment' option seems to allow too few characters, so I'm replying here!)
I have also tried ddrescue with '--fill-mode=?' as a solution to the above, but it did not work. This is what I did:
ddrescue --generate-mode -b 2048 -v /dev/sr1 IDTa.img IDTa.img.log
cp IDTa.img IDTa.img.backup
ddrescue '--fill-mode=?' -b 2048 -v IDTa.iso IDTa.img IDTa.img.log
To check, I looked for the first position that IDTa.iso has data:
hexdump -C IDTa.iso |less
the output was:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
001da800 00 00 01 ba 21 00 79 f3 09 80 10 69 00 00 01 e0 |....!.y....i....|
...
001db000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
...
I looked up 001da800 in IDTa.img:
hexdump -C IDTa.img |less
/001da800
Output:
001da800 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
001db000 00 00 01 ba 21 00 7b 00 bf 80 10 69 00 00 01 e0 |....!.{....i....|
...
So, the data at position 001da800 has not been copied over from IDTa.iso to IDTa.img?
Checking IDTa.img.log:
# Mapfile. Created by GNU ddrescue version 1.22
# Command line: ddrescue --fill-mode=? -b 2048 -v IDTa.iso IDTa.img IDTa.img.log
# Start time: 2021-06-28 13:52:39
# Current time: 2021-06-28 13:52:46
# Finished
# current_pos current_status current_pass
0x299F2000 + 1
# pos size status
0x00000000 0x00008000 ?
0x00008000 0x001D2800 +
0x001DA800 0x00000800 ?
0x001DB000 0x00049000 +
...
and a reality check:
diff -q IDTa.img IDTa.img.backup
returns no difference.
Update 2:
@Kamil edited the solution (see below) by dropping the --fill-mode=? argument. Appears to work!
Answer 1
I think this can be done with ddrescue itself. You need --generate-mode.
When ddrescue is invoked with the option --generate-mode it operates in "generate mode", which is different from the default "rescue mode". That is, in "generate mode" ddrescue does not rescue anything. It only tries to generate a mapfile for later use.
[…]
ddrescue can in some cases generate an approximate mapfile, from infile and the (partial) copy in outfile, that is almost as good as an exact mapfile. It makes this by simply assuming that sectors containing all zeros were not rescued.
[…]
ddrescue --generate-mode infile outfile mapfile
(source)
Make copies of the two images, just in case. If your filesystem supports CoW-copy then use cp --reflink=always for each image to make copies virtually instantly.
You need to make sure the two images are of equal size. If one of them is smaller then it should be enlarged, i.e. zeros (possibly sparse zeros) should be appended. This code will do this automatically (truncate is required):
( f1=IDTa.img
  f2=IDTa.iso
  # sizes of both images in bytes
  s1="$(wc -c <"$f1")"
  s2="$(wc -c <"$f2")"
  # enlarge the smaller image to the size of the larger one
  if [ "$s2" -gt "$s1" ]; then
    truncate -s "$s2" "$f1"
  else
    truncate -s "$s1" "$f2"
  fi
)
(I used a subshell so variables die with it and the main shell is unaffected.)
Now let the tool analyze your first image and find out which sectors were probably not rescued:
ddrescue --generate-mode -b 2048 -v /dev/sr1 IDTa.img new_mapfile
Note new_mapfile here is a new file, not your IDTa.ddrescue.log. Do not touch IDTa.ddrescue.log.
After new_mapfile is generated, lines in it should show status + or ?, depending on whether the corresponding fragment was considered "rescued" or "non-tried".
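To get a feel for how much data the next step will copy, the sizes of the ? areas can be summed. This is only a minimal sketch, assuming the three-column pos/size/status layout seen in the mapfile excerpt in the question; unknown_bytes is a hypothetical helper name, not part of ddrescue:

```shell
# Hypothetical helper (not part of ddrescue): print the total number of
# bytes in areas marked "?" in a mapfile.
# Assumes data lines of the form "pos size status" with hexadecimal numbers.
# The function body is a subshell, so its variables do not leak.
unknown_bytes() (
  total=0
  while read -r pos size status rest; do
    case "$pos" in
      '#'*|'') continue ;;   # skip comment lines and empty lines
    esac
    # The current_pos line has a number in the third field, so it never matches.
    if [ "$status" = "?" ]; then
      total=$((total + $size))
    fi
  done < "$1"
  printf '%s\n' "$total"
)
```

Running unknown_bytes new_mapfile then prints the byte count of everything --generate-mode classified as "non-tried".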
Now you want to fill the allegedly "non-tried" blocks of IDTa.img with data from IDTa.iso. The next command will modify IDTa.img.
Rescue the allegedly "non-tried" blocks of IDTa.img by reading data from IDTa.iso:
ddrescue -b 2048 -v IDTa.iso IDTa.img new_mapfile
Now the modified IDTa.img along with the untouched IDTa.ddrescue.log should be as good as if you didn't make the mistake.
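As a sanity check, one can confirm that every non-zero byte of IDTa.iso is now present in IDTa.img. The following is a rough sketch using cmp -l, not something taken from the ddrescue manual; missing_bytes is a hypothetical helper name, and the check relies on the question's assumption that each image holds zeros wherever the other holds data. It also assumes both files have equal size, as arranged earlier:

```shell
# Hypothetical helper: count byte positions where the first file is
# non-zero but the second file holds a different value.
# "cmp -l" prints one line per differing byte: offset, then the two
# byte values in octal. Lines where the first value is 0 are expected
# (zeros in the partial image, data in the merged one) and are ignored.
missing_bytes() {
  cmp -l "$1" "$2" | awk '$2 != 0 { n++ } END { print n + 0 }'
}
```

With the file names from the question, missing_bytes IDTa.iso IDTa.img should print 0 if the merge picked up everything. Because cmp -l emits a line for every differing byte, this can be slow on large images.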
Notes:
- It may have happened that some sectors containing all zeros were actually rescued. --generate-mode will classify them as ?. They will be filled with data taken from IDTa.iso "in vain". This doesn't matter for the ultimate result because they are all zeros in this other file as well.
- The result should be the same if you interchange IDTa.iso and IDTa.img in the entire procedure (but keep in mind if you do this then the result will be in IDTa.iso). So there's a choice. With --generate-mode I would use the file in which I expect fewer sectors containing all zeros, because this should minimize the amount of work for the last command.
- The method works for regular files IDTa.iso and IDTa.img. If instead of any of them you had a block device, its "random" content from before your work with ddrescue would interfere and spoil the result (so there's no point in solving a potential problem with different sizes in the first place, where truncate doesn't help).
- I tested the procedure after replicating your mistake while trying to rescue a flaky device.