c3c/lib/std/io/file_mmap.c3
Manu Linares eae7d0c4a1 stdlib: std::compression::zip and std::compression::deflate (#2930)
2026-02-20 20:41:34 +01:00


module std::io::file::mmap @if(env::LIBC &&& env::POSIX);
import std::core::mem::vm, std::io::file;

struct FileMmap
{
	File file;
	VirtualMemory vm;
	usz offset;
	usz len;
}

<*
 Returns the requested byte range of the mapping, skipping the extra bytes that were mapped because the offset and/or size were not page-aligned.
 @return "Slice of the mapped range whose first byte is the file's byte at the offset passed to file::mmap_file() or file::mmap_open()"
*>
fn char[] FileMmap.bytes(&self)
{
	return self.vm.ptr[self.offset:self.len];
}

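<*
 Destroys the underlying VirtualMemory object (i.e. calls munmap()) and closes the file held by this FileMmap, if any.
*>
fn void? FileMmap.destroy(&self) @maydiscard
{
	// Attempt both steps even if the first one fails, then report the first error.
	fault err1 = @catch(self.file.close());
	fault err2 = @catch(self.vm.destroy());
	if (err1) return err1~;
	if (err2) return err2~;
}
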
module std::io::file @if(env::LIBC &&& env::POSIX);
<*
 Maps a region of an already-opened file into memory.
 @param file : "Already opened file, owned by the caller"
 @param offset : "Byte offset in the file; the mapping itself starts at this offset rounded down to a page boundary"
 @param len : "Size in bytes to map starting from offset; the mapping itself is rounded up to a whole number of pages. 0 maps from offset to the end of the file"
 @param access : "Requested access to the mapped memory, READ by default"
 @param shared : "Whether to request a shared mapping instead of a private one"
 @return? mem::OUT_OF_MEMORY, vm::ACCESS_DENIED, vm::RANGE_OVERFLOW, vm::INVALID_ARGS, vm::UNKNOWN_ERROR, io::NO_PERMISSION, io::FILE_NOT_VALID, io::WOULD_BLOCK, io::FILE_NOT_FOUND
 @return "Memory mapped region. Must be released with FileMmap.destroy(). The provided File will not be closed"
*>
fn FileMmap? mmap_file(File file, usz offset = 0, usz len = 0, VirtualMemoryAccess access = READ, bool shared = false)
{
	if (len == 0)
	{
		// A zero length means "map from offset to the end of the file".
		ulong new_len = file.size()! - offset;
		if (new_len > (ulong)isz.max) return mem::OUT_OF_MEMORY~;
		len = (usz)new_len;
	}
	// Get the page size.
	usz page_size = vm::aligned_alloc_size(0);
	// Split the user's offset (which may be unaligned) into a page-aligned base and a remainder.
	usz page_offset = offset & (page_size - 1);
	usz map_offset = offset - page_offset;
	// Adjust the mapped length: both the start and the end of the region may be unaligned.
	usz map_len = len + page_offset; // covers an unaligned region start
	map_len = vm::aligned_alloc_size(map_len); // rounds an unaligned region end up to a full page
	void* ptr = vm::mmap_file(file.fd(), map_len, map_offset, access, shared)!;
	// FileMmap does not own the supplied file, so its file member is left empty.
	return {{}, {ptr, map_len, access}, page_offset, len};
}
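
// Worked example of the alignment arithmetic in mmap_file() above.
// Assumption for illustration only: a 4096-byte page size (the real value
// comes from vm::aligned_alloc_size(0) and may differ per platform).
//
//   offset = 5000, len = 100
//   page_offset = 5000 & (4096 - 1)      = 904
//   map_offset  = 5000 - 904             = 4096   (page-aligned)
//   map_len     = 100 + 904              = 1004
//   map_len     = rounded up to a page   = 4096
//
// The kernel is asked to map 4096 bytes starting at file offset 4096, and
// FileMmap.bytes() later returns vm.ptr[904:100], i.e. the 100 bytes that
// start at file offset 5000.
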
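<*
 Opens the given file and maps a region of it into memory.
 @param filename : "File path"
 @param mode : "File opening mode"
 @param offset : "Byte offset in the file; the mapping itself starts at this offset rounded down to a page boundary"
 @param len : "Size in bytes to map starting from offset; the mapping itself is rounded up to a whole number of pages. 0 maps from offset to the end of the file"
 @param access : "Requested access to the mapped memory, READ by default"
 @param shared : "Whether to request a shared mapping instead of a private one"
 @return? mem::OUT_OF_MEMORY, vm::ACCESS_DENIED, vm::RANGE_OVERFLOW, vm::INVALID_ARGS, vm::UNKNOWN_ERROR, io::NO_PERMISSION, io::FILE_NOT_VALID, io::WOULD_BLOCK, io::FILE_NOT_FOUND
 @return "Memory mapped region. Must be released with FileMmap.destroy(), which also closes the file"
*>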
fn FileMmap? mmap_open(String filename, String mode, usz offset = 0, usz len = 0, VirtualMemoryAccess access = READ, bool shared = false)
{
	File file = open(filename, mode)!;
	defer catch (void)file.close();
	FileMmap mm = mmap_file(file, offset, len, access, shared)!;
	// FileMmap owns the file and it will close it on destroy()
	mm.file = file;
	return mm;
}
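
// Usage sketch for mmap_open(), kept as a comment so it is not compiled into
// the library. It is illustrative only: `print_mapped_size` and its `path`
// argument are hypothetical names, and the caller is assumed to have
// `import std::io;` in scope.
//
//   fn void? print_mapped_size(String path)
//   {
//       // Map the whole file read-only (offset = 0, len = 0, access = READ).
//       FileMmap mm = file::mmap_open(path, "rb")!;
//       // destroy() unmaps the region and closes the file opened above.
//       defer mm.destroy();
//       char[] data = mm.bytes();
//       io::printfn("mapped %d bytes", data.len);
//   }
//
// With mmap_file() the caller keeps ownership of the File instead and must
// close it separately after calling destroy().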