c3c/lib/std/io/stream/bytewriter.c3
Manu Linares eae7d0c4a1 stdlib: std::compression::zip and std::compression::deflate (#2930)
* stdlib: implement `std::compression::zip` and `std::compression::deflate`

- C3 implementation of DEFLATE (RFC 1951) and ZIP archive handling.
- Support for reading and writing archives using STORE and DEFLATE
methods.
- Decompression supports both fixed and dynamic Huffman blocks.
- Compression using greedy LZ77 matching.
- Zero dependencies on libc.
- Stream-based entry reading and writing.
- Full unit test coverage.
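For illustration, the greedy LZ77 matching mentioned above can be sketched as follows. This is a simplified Python model of the idea, not the C3 implementation; the window and match bounds mirror DEFLATE's 32 KiB window and 258-byte match limit, and the naive linear search here is exactly what the later hash-based commit replaces:

```python
def lz77_greedy(data: bytes, window: int = 32768,
                min_match: int = 3, max_match: int = 258):
    """Greedy LZ77: at each position, take the longest match found in the
    sliding window, else emit a literal. Overlapping matches (distance <
    length) work because the comparison runs over the raw buffer."""
    out = []  # list of ('lit', byte) or ('match', distance, length)
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            out.append(('match', best_dist, best_len))
            i += best_len
        else:
            out.append(('lit', data[i]))
            i += 1
    return out
```

Greedy means the matcher commits to the longest match at the current position; lazy matching (listed under future work) would also try position `i + 1` before committing.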

NOTE: This is an initial implementation. Future improvements could be:

- Optimization of the LZ77 matching (lazy matching).
- Support for dynamic Huffman blocks in compression.
- ZIP64 support for large files/archives.
- Support for encryption and additional compression methods.

* optimizations+refactoring

deflate:
- replace linear search with hash-based match finding.
- implement support for dynamic Huffman blocks using the Package-Merge
algorithm.
- add streaming decompression.
- add buffered StreamBitReader.
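DEFLATE bit streams are consumed least-significant-bit first (RFC 1951 §3.1.1), which is what a buffered bit reader like the StreamBitReader above has to respect. A minimal Python sketch of the technique, not the C3 API:

```python
class BitReader:
    """Reads bits LSB-first from a byte buffer, as DEFLATE requires.
    Bytes are shifted into a bit buffer whose LSB is the next bit out."""
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0       # next byte index
        self.bitbuf = 0    # buffered bits, LSB = next bit
        self.bitcount = 0  # number of valid bits in bitbuf

    def read_bits(self, n: int) -> int:
        # Refill the buffer a byte at a time until n bits are available.
        while self.bitcount < n:
            self.bitbuf |= self.data[self.pos] << self.bitcount
            self.pos += 1
            self.bitcount += 8
        value = self.bitbuf & ((1 << n) - 1)
        self.bitbuf >>= n
        self.bitcount -= n
        return value
```

A real streaming reader would refill from an InStream instead of a fixed buffer, but the bit accounting is the same.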

zip:
- add ZIP64 support.
- add CP437 and UTF-8 filename encoding detection.
- add DOS date/time conversion and timestamp preservation.
- add ZipEntryReader for streaming entry reads.
- implement ZipArchive.extract and ZipArchive.recover helpers.
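The DOS date/time fields mentioned above are two 16-bit values with 2-second resolution and a 1980 epoch, packed as bit fields. A hedged Python sketch of the conversion (the field layout is from the ZIP specification; the function names are illustrative, not the C3 API):

```python
def to_dos_datetime(year, month, day, hour, minute, second):
    """Pack a timestamp into ZIP's 16-bit DOS date and time fields.
    Years are offset from 1980; seconds are stored halved."""
    dos_date = ((year - 1980) << 9) | (month << 5) | day
    dos_time = (hour << 11) | (minute << 5) | (second // 2)
    return dos_date, dos_time

def from_dos_datetime(dos_date, dos_time):
    """Unpack the two fields back into calendar components."""
    return (1980 + (dos_date >> 9), (dos_date >> 5) & 0xF, dos_date & 0x1F,
            dos_time >> 11, (dos_time >> 5) & 0x3F, (dos_time & 0x1F) * 2)
```

Note the 2-second granularity: odd seconds cannot round-trip, which is why modern archivers often also store an extended (Unix) timestamp field.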

other:
- Add `set_modified_time` to std::io.
- Add benchmarks and a few more unit tests.

* zip: add archive comment support

add tests

* forgot to rename the benchmark :(

* detect UTF-8 names on nonstandard zips

fix method not passed to open_writer

* another edge case where directory doesn't end with /

* testing utilities

- detect encrypted zip
- `ZipArchive.open_writer` default to DEFLATE

* fix zip64 creation, add tests

* fix ZIP header endianness for big-endian compatibility

Update ZipLFH, ZipCDH, ZipEOCD, Zip64EOCD, and Zip64Locator structs to
use little-endian bitstruct types from std::core::bitorder
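All ZIP header fields are defined as little-endian regardless of host byte order, so decoding them explicitly (rather than overlaying a host-order struct) is what keeps big-endian targets correct. A small Python illustration of the idea, checking the local file header signature; the helper name is hypothetical:

```python
import struct

ZIP_LFH_SIGNATURE = 0x04034B50  # the bytes "PK\x03\x04" read as little-endian

def read_lfh_signature(buf: bytes) -> bool:
    """Decode the first four bytes explicitly as little-endian ('<I'),
    so the check behaves identically on big-endian hosts."""
    (sig,) = struct.unpack_from('<I', buf, 0)
    return sig == ZIP_LFH_SIGNATURE
```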

* fix ZipEntryReader position tracking and seek logic for ZIP_METHOD_STORE

added a test to cover this

* add package-merge algorithm attribution

Thanks @konimarti
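Package-merge builds optimal length-limited Huffman code lengths by repeatedly "packaging" adjacent pairs of the cheapest items, merging the packages back with the leaves, and finally counting how often each leaf appears among the first 2n-2 items. A compact Python sketch of the boundary variant (an illustration under the assumption that all frequencies are positive and n ≤ 2^max_len, not the C3 code):

```python
def package_merge(freqs, max_len):
    """Return Huffman code lengths, each <= max_len, minimizing the
    total encoded size. Requires len(freqs) <= 2**max_len."""
    n = len(freqs)
    if n == 1:
        return [1]
    # Each item is (weight, tuple of the leaf symbols it contains).
    leaves = sorted((f, (sym,)) for sym, f in enumerate(freqs))
    items = list(leaves)
    for _ in range(max_len - 1):
        # Package: combine adjacent pairs; an odd trailing item is dropped.
        packages = [(items[i][0] + items[i + 1][0],
                     items[i][1] + items[i + 1][1])
                    for i in range(0, len(items) - 1, 2)]
        # Merge the packages with a fresh copy of the leaves.
        items = sorted(leaves + packages)
    # A symbol's code length is how many of the first 2n-2 items contain it.
    lengths = [0] * n
    for _, syms in items[:2 * n - 2]:
        for sym in syms:
            lengths[sym] += 1
    return lengths
```

The resulting lengths always satisfy the Kraft inequality, so a canonical code can be assembled from them directly, which is what a dynamic-Huffman DEFLATE block needs.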

* standalone deflate_benchmark.c3 against `miniz`

* fix integer overflows, leaks and improve safety

* a few safety fixes for 32-bit systems, plus tests

* deflate compress optimization

* improve match finding, hash updates, and buffer usage

* use ulong for zip offsets

* style changes (#18)

* style changes

* update tests

* style changes in `deflate.c3`

* fix typo

* Allocator first. Some changes to deflate to use `copy_to`

* Fix missing conversion on 32-bit targets.

* Fix deflate stream. Formatting. Prefer switch over if-elseif

* - Stream functions now use long/ulong rather than isz/usz for seek/available.
- `instream.seek` is replaced by `set_cursor` and `cursor`.
- `instream.available`, `cursor`, etc. are long/ulong rather than isz/usz to be correct on 32-bit.

* Update to constdef

* Fix test

---------

Co-authored-by: Book-reader <thevoid@outlook.co.nz>
Co-authored-by: Christoffer Lerno <christoffer@aegik.com>
2026-02-20 20:41:34 +01:00


module std::io;
import std::math;

struct ByteWriter (OutStream)
{
	char[] bytes;
	usz index;
	Allocator allocator;
}

<*
 @param [&inout] self
 @param [&inout] allocator
 @require self.bytes.len == 0 : "Init may not run on already initialized data"
 @ensure (bool)allocator, self.index == 0
*>
fn ByteWriter* ByteWriter.init(&self, Allocator allocator)
{
	*self = { .bytes = {}, .allocator = allocator };
	return self;
}

<*
 @param [&inout] self
 @require self.bytes.len == 0 : "Init may not run on already initialized data"
 @ensure self.index == 0
*>
fn ByteWriter* ByteWriter.tinit(&self)
{
	return self.init(tmem) @inline;
}

fn ByteWriter* ByteWriter.init_with_buffer(&self, char[] data)
{
	*self = { .bytes = data, .allocator = null };
	return self;
}

fn void? ByteWriter.destroy(&self) @dynamic
{
	if (!self.allocator) return;
	if (void* ptr = self.bytes.ptr) allocator::free(self.allocator, ptr);
	*self = { };
}

fn char[] ByteWriter.array_view(self) @inline
{
	return self.bytes[:self.index];
}

fn String ByteWriter.str_view(&self) @inline
{
	return (String)self.bytes[:self.index];
}

fn void? ByteWriter.ensure_capacity(&self, usz len) @inline
{
	// An exact fit is enough, hence >= rather than >.
	if (self.bytes.len >= len) return;
	if (!self.allocator) return OUT_OF_SPACE?;
	if (len < 16) len = 16;
	usz new_capacity = math::next_power_of_2(len);
	char* new_ptr = allocator::realloc_try(self.allocator, self.bytes.ptr, new_capacity)!;
	self.bytes = new_ptr[:new_capacity];
}

fn usz? ByteWriter.write(&self, char[] bytes) @dynamic
{
	self.ensure_capacity(self.index + bytes.len)!;
	mem::copy(&self.bytes[self.index], bytes.ptr, bytes.len);
	self.index += bytes.len;
	return bytes.len;
}

fn void? ByteWriter.write_byte(&self, char c) @dynamic
{
	self.ensure_capacity(self.index + 1)!;
	self.bytes[self.index++] = c;
}

<*
 @param [&inout] self
 @param reader
*>
fn usz? ByteWriter.read_from(&self, InStream reader) @dynamic
{
	usz start_index = self.index;
	if (&reader.available)
	{
		while (ulong available = reader.available()!)
		{
			if (available > usz.max) return OUT_OF_SPACE?;
			self.ensure_capacity(self.index + (usz)available)!;
			usz read = reader.read(self.bytes[self.index..])!;
			self.index += read;
		}
		return self.index - start_index;
	}
	if (self.bytes.len == 0)
	{
		self.ensure_capacity(256)!;
	}
	while (true)
	{
		// See how much we can read.
		usz len_to_read = self.bytes.len - self.index;
		// Less than 16 bytes? Double the capacity.
		if (len_to_read < 16)
		{
			self.ensure_capacity(self.bytes.len * 2)!;
			len_to_read = self.bytes.len - self.index;
		}
		// Read into the rest of the buffer.
		usz read = reader.read(self.bytes[self.index..])!;
		self.index += read;
		// A short read means we reached the end.
		if (read < len_to_read) return self.index - start_index;
		// Otherwise go another round.
	}
}