* fix: resolve clippy warning in nightly
* wip: major rework of cde location
* wip: rework CDE lookup
* refactor: magic finder, eocd lookup retry
* wip: handle empty zips
* fix: satisfy tests, add documentation
* chore: remove unused dependencies
* feat: support both zip32 and zip64 comments
* feat: add zip64 comment functions to ZipWriter
* fix: first pass on maintainer comments
* fix: continue searching for EOCD when the central directory is invalid
* chore: satisfy clippy lints
* chore: satisfy style_and_docs
* feat: support both directions in MagicFinder, correctly find first CDFH
* fix: more checks to EOCD parsing, move comment size error from parse to write
* fix: use saturating add when checking eocd64 record_size upper bound
* fix: correctly handle mid window offsets in forward mode
* fix: compare maximum possible comment length against file size, not search region end
* feat: handle zip64 detection as a hint
* fix: detect oversized central directories when locating EOCD64
* fix: oopsie
---------
Signed-off-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
Co-authored-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
* CI: Add -Zminimal-versions job
* Bump anyhow dev-dep to fix build with -Zminimal-versions
* Relax dependency bounds
These relaxed bounds don't impact existing builds as they're all SemVer
compatible. Specifying lower bounds allows projects with dependencies
that pin these to lower versions to build without version resolution
conflicts.
* Cargo.toml: elide .0 patch versions
---------
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
* Use the tempfile crate instead of the tempdir crate (which is deprecated)
https://github.com/rust-lang-deprecated/tempdir?tab=readme-ov-file#deprecation-note
* perf: Add benchmark that measures the rejection speed of a large non-zip file
* perf: Speed up non-zip rejection by increasing END_WINDOW_SIZE
I tested several END_WINDOW_SIZEs across 2 machines:
Machine 1: macOS 15.0.1, aarch64 (apfs /tmp)
512: test parse_large_non_zip ... bench: 30,450,608 ns/iter (+/- 673,910)
4096: test parse_large_non_zip ... bench: 7,741,366 ns/iter (+/- 521,101)
8192: test parse_large_non_zip ... bench: 5,807,443 ns/iter (+/- 546,227)
16384: test parse_large_non_zip ... bench: 4,794,314 ns/iter (+/- 419,114)
32768: test parse_large_non_zip ... bench: 4,262,897 ns/iter (+/- 397,582)
65536: test parse_large_non_zip ... bench: 4,060,847 ns/iter (+/- 280,964)
Machine 2: Debian testing, x86_64 (tmpfs /tmp)
512: test parse_large_non_zip ... bench: 65,132,581 ns/iter (+/- 7,429,976)
4096: test parse_large_non_zip ... bench: 14,109,503 ns/iter (+/- 2,892,086)
8192: test parse_large_non_zip ... bench: 9,942,500 ns/iter (+/- 1,886,063)
16384: test parse_large_non_zip ... bench: 8,205,851 ns/iter (+/- 2,902,041)
32768: test parse_large_non_zip ... bench: 7,012,011 ns/iter (+/- 2,222,879)
65536: test parse_large_non_zip ... bench: 6,577,275 ns/iter (+/- 881,546)
In both cases END_WINDOW_SIZE=8192 was roughly 5x to 6.5x faster than 512,
and sizes above 8192 didn't make much of a difference on top of that.
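To illustrate why the window size matters, here is a minimal sketch of a
backward, windowed signature search. It is not the crate's actual code:
`find_signature_backward` and `SIGNATURE` are hypothetical names, and only
END_WINDOW_SIZE comes from this change. Each window costs one seek and one
read, so a larger window means fewer I/O calls when scanning a large non-zip
file.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// End-of-central-directory signature, "PK\x05\x06", as stored on disk.
const SIGNATURE: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];
/// Bytes read per step; 8192 is the value benchmarked above.
const END_WINDOW_SIZE: usize = 8192;

/// Hypothetical helper: scans `reader` from the end towards the start and
/// returns the offset of the last occurrence of `SIGNATURE`, if any.
fn find_signature_backward<R: Read + Seek>(reader: &mut R) -> io::Result<Option<u64>> {
    let file_len = reader.seek(SeekFrom::End(0))?;
    // Consecutive windows overlap by 3 bytes so that a signature straddling
    // a window boundary is still seen whole by the next window.
    let overlap = (SIGNATURE.len() - 1) as u64;
    let mut buf = vec![0u8; END_WINDOW_SIZE];
    let mut window_end = file_len;

    while window_end >= SIGNATURE.len() as u64 {
        let window_start = window_end.saturating_sub(END_WINDOW_SIZE as u64);
        let len = (window_end - window_start) as usize;
        reader.seek(SeekFrom::Start(window_start))?;
        reader.read_exact(&mut buf[..len])?;

        // Search this window from its end, since the EOCD sits near the
        // end of a well-formed archive.
        let hit = buf[..len]
            .windows(SIGNATURE.len())
            .rposition(|w| w == &SIGNATURE[..]);
        if let Some(pos) = hit {
            return Ok(Some(window_start + pos as u64));
        }
        if window_start == 0 {
            break;
        }
        window_end = window_start + overlap;
    }
    Ok(None)
}
```

With a 512-byte window the same scan issues 16 times as many seek/read pairs
as with 8192, which is consistent with the direction of the numbers above.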
* perf: Speed up non-zip rejection by limiting search for EOCDR.
I benchmarked several search sizes across 2 machines
(these benches are using an 8192 END_WINDOW_SIZE):
Machine 1: macOS 15.0.1, aarch64 (apfs /tmp)
whole file: test parse_large_non_zip ... bench: 5,773,801 ns/iter (+/- 411,277)
last 128k: test parse_large_non_zip ... bench: 54,402 ns/iter (+/- 4,126)
last 66,000: test parse_large_non_zip ... bench: 36,152 ns/iter (+/- 4,293)
Machine 2: Debian testing, x86_64 (tmpfs /tmp)
whole file: test parse_large_non_zip ... bench: 9,942,306 ns/iter (+/- 1,963,522)
last 128k: test parse_large_non_zip ... bench: 73,604 ns/iter (+/- 16,662)
last 66,000: test parse_large_non_zip ... bench: 41,349 ns/iter (+/- 16,812)
As you might expect, these limits significantly increase the rejection speed
for large non-zip files (by more than 100x on both machines).
66,000 was the number previously used by zip-rs. It was changed to zero in
7a55945743.
128K is what Info-Zip uses[1]. This seems like a reasonable (non-zero)
choice for compatibility reasons.
[1] Info-Zip is extremely old and does not have an official git repo to
link to. However, an unofficial fork can be found here:
bb0c4755d4/zipfile.c (L4073)
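The idea of bounding the search can be sketched as follows. This is an
in-memory illustration under stated assumptions, not the crate's
implementation: `find_eocd_offset` and `MAX_EOCD_SEARCH` are hypothetical
names, and the 128 KiB value mirrors the Info-Zip limit mentioned above.

```rust
/// End-of-central-directory signature, "PK\x05\x06", as stored on disk.
const EOCD_SIGNATURE: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];
/// Hypothetical constant: only the last 128 KiB of the file are searched.
const MAX_EOCD_SEARCH: usize = 128 * 1024;

/// Returns the offset of the last EOCD signature found within the final
/// `max_search` bytes of `data`, or `None` if there is none in that region.
fn find_eocd_offset(data: &[u8], max_search: usize) -> Option<usize> {
    // Never look further back than `max_search` bytes from the end.
    let lower_bound = data.len().saturating_sub(max_search);
    data[lower_bound..]
        .windows(EOCD_SIGNATURE.len())
        .rposition(|w| w == &EOCD_SIGNATURE[..])
        .map(|pos| lower_bound + pos)
}
```

A large non-zip file is then rejected after examining at most
MAX_EOCD_SEARCH bytes rather than the whole file. The trade-off is that a
signature located before the search region is not found.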
---------
Co-authored-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
* test(fuzz): Migrate to afl++ for fuzzing
* build: Exclude new fuzz binaries
* chore: Fix new warning
* ci: Use cargo action for format check
* deps: Update constant_time_eq and flate2
* ci: Bug fix for file paths
* ci: Bug fix: working directory is parent of repository root
* ci: Bug fix: remove stray `cd` commands
* ci: Bug fix? Make paths explicitly descend from workspace root
* ci: Bug fix? Assume github.workspace is the repo root
* test(fuzz): Commit files that were previously missing
* ci(fuzz): Bug fix for fuzz_write_with_no_features
* ci(fuzz): Bug fix: no -V arg for cmin
* ci(fuzz): Bug fix: no -a arg for cmin
* Bug fix: replace colons with dashes in filenames
* style: Fix 2 clippy warnings
* style: Fix another clippy warning in some configs
* ci(fuzz): Enable renaming in all fuzz jobs
* ci(fuzz): Fix: need to rename files in multiple dirs
Signed-off-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
* ci(fuzz): Install `rename` tool
* ci(fuzz): Fix redundant steps and too-late install of `rename`
* ci(fuzz): fix? replace multiple colons
---------
Signed-off-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
* fix: Rare combination of settings could lead to writing a corrupt archive with overlength extra data
* fix: Previous fix was breaking alignment
* style: cargo fmt --all
* fix: ZIP64 header was being written twice
* style: cargo fmt --all
* ci(fuzz): Add check that file-creation options are individually valid
* fix: Need to update extra_data_start in deep_copy_file
* style: cargo fmt --all
* test(fuzz): fix bug in Arbitrary impl
* fix: Cursor-position bugs when merging archives or opening for append
* fix: unintended feature dependency
* style: cargo fmt --all
* fix: merge_contents was miscalculating new start positions for absorbed archive's files
* fix: shallow_copy_file needs to reset CDE location since the CDE is copied
* fix: ZIP64 header was being written after AES header location was already calculated
* fix: ZIP64 header was being counted twice when writing extra-field length
* fix: deep_copy_file was positioning cursor incorrectly
* test(fuzz): Reimplement Debug so that it prints the method calls actually made
* test(fuzz): Fix issues with `Option<&mut Formatter>`
* chore: Partial debug
* chore: Revert: `merge_contents` already adjusts header_start and data_start
* chore: Revert unused `mut`
* style: cargo fmt --all
* refactor: eliminate a magic number for CDE block size
* chore: WIP: fix bugs
* refactor: Minor refactors
* refactor: eliminate a magic number for CDE block size
* refactor: Minor refactors
* refactor: Can use cde_start_pos to locate ZIP64 end locator
* chore: Fix import that can no longer be feature-gated
* chore: Fix import that can no longer be feature-gated
* refactor: Confusing variable name
* style: cargo fmt --all and fix Clippy warnings
* style: fix another Clippy warning
---------
Signed-off-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>
* refactor: eliminate a magic number for CDE block size
* refactor: Minor refactors
* refactor: Can use cde_start_pos to locate ZIP64 end locator
* chore: Fix import that can no longer be feature-gated
* chore: Fix import that can no longer be feature-gated
Commit messages in PRs no longer need to follow Conventional Commits (ConCom), since we now squash-merge PRs.
Signed-off-by: Chris Hennick <4961925+Pr0methean@users.noreply.github.com>