tar excludes files too late to stop hardlink detection
Harald van Dijk
harald at gigawatt.nl
Sat Jul 24 18:22:58 UTC 2021
On 27/06/2021 15:15, Harald van Dijk wrote:
> On 26/06/2021 00:36, Harald van Dijk wrote:
>> Hi,
>>
>> tar --exclude results in bad archives when hardlinks are used.
>> Consider the following:
>>
>> $ mkdir tartest
>> $ echo hello > tartest/a
>> $ ln tartest/a tartest/b
>> $ busybox tar cf - tartest | tar tvf -
>> drwxr-xr-x harald/harald 0 2021-06-26 00:25 tartest/
>> -rw-r--r-- harald/harald 6 2021-06-26 00:25 tartest/b
>> hrw-r--r-- harald/harald 0 2021-06-26 00:25 tartest/a link to tartest/b
>>
>> This is okay. tar may either pick up a first and then detect b as a
>> hardlink to a, or pick up b first and then detect a as a hardlink to
>> b. On my system, it picks up b first. You can adjust the below
>> accordingly if on your system a is picked up first. Now, exclude b:
>>
>> $ busybox tar cf - --exclude=b tartest | tar tvf -
>> drwxr-xr-x harald/harald 0 2021-06-26 00:25 tartest/
>> hrw-r--r-- harald/harald 0 2021-06-26 00:25 tartest/a link to tartest/b
>>
>> This resulted in an archive where the contents of tartest/a are
>> missing. Extracting the archive results in an attempt to hardlink
>> tartest/a to tartest/b, which may or may not exist in the target
>> directory. GNU tar does not do this; it stores the contents of the
>> file instead, which seems like a better idea to me. Can busybox be
>> modified to do that as well?
>>
>> Tested with busybox 1.33.1.
>
> It seems like the fix is trivial; please see the attached patch.
ping
> Cheers,
> Harald van Dijk
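For anyone who wants to check a given tar without caring which name it picks up first, here is a minimal sketch of the reproduction above. It is not part of the patch. It excludes each of the two names in turn, extracts the archive into an empty directory, and checks that the remaining name still has its contents. The TAR variable and the check_exclude function are names made up for this sketch; set TAR="busybox tar" to test busybox.

```shell
#!/bin/sh
# Sketch only: TAR and check_exclude are hypothetical names for this example.
# TAR is deliberately expanded unquoted so that TAR="busybox tar" works.
TAR=${TAR:-tar}

# check_exclude EXCLUDED SURVIVOR
# Archive a directory holding two hardlinked names while excluding one of
# them, extract into an empty directory, and verify that the other name
# still carries the file contents.
check_exclude() {
    work=$(mktemp -d) || return 2
    mkdir "$work/src" "$work/dst" "$work/src/tartest"
    echo hello > "$work/src/tartest/a"
    ln "$work/src/tartest/a" "$work/src/tartest/b"

    (cd "$work/src" && $TAR cf - --exclude="$1" tartest) > "$work/t.tar"
    # Extraction errors are expected in the bad case, so hide them and
    # judge only by what ends up on disk.
    (cd "$work/dst" && $TAR xf "$work/t.tar") 2>/dev/null

    if [ "$(cat "$work/dst/tartest/$2" 2>/dev/null)" = hello ]; then
        echo "exclude=$1: ok"
        rc=0
    else
        echo "exclude=$1: BAD (contents of $2 lost)"
        rc=1
    fi
    rm -rf "$work"
    return $rc
}

check_exclude b a
check_exclude a b
```

A tar that stores the file contents when the other name is excluded, as GNU tar is described as doing above, should report ok for both cases. A tar with the problem described here would be expected to report BAD for whichever name it picks up second.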