Zip-Ada is a free, open-source, independent programming library for dealing with the Zip compressed archive file format. It includes independent LZMA and BZip2 compressor/decompressor pairs, which can also be used outside of the Zip archive context.
Added compression for the BZip2 format for .bz2 and .zip files or streams.
Note that Zip-Ada .zip archive creation with the “Preselection_2” mode now tops (or rather, bottoms, in terms of compressed size) 7-Zip on both the Calgary (*) and Canterbury compression benchmarks. This holds against 7-Zip's .zip output and even against its .7z output.
Enjoy!
(*) File names need extensions: .txt, .c, .lsp, .pas
And just as a friendly suggestion… You may want to post these results and the release in a more general programming forum. It is not every day that a major tool gets beaten at something by an Ada program ^^ It could bring outside eyes and attention to your project and to the wider Ada community.
I fully agree… I kinda felt a bit bad recommending it. The subreddit is heavily driven by trends and by things going a bit “viral”. But if it works… it is free marketing. Though I fully understand not touching that forum. I do not even have a Reddit account, for these reasons…
I’ve got an M2 Air, and I can confirm that zipada works fine when built with Alire with the native aarch64 macOS toolchain (I’m using it to extract Alire in getada!)
Cool, thanks for the test!
Just saw the instructions in CONTRIBUTING.md.
Perhaps I can also give it a try this weekend.
So far I’ve dropped a line on the related reddit topic.
Oh, forgot to mention: the new Zip-Ada BZip2 encoder often does a better job than the BZip2 encoders found in 7-Zip and Info-Zip (the latter is possibly using the original BZip2).
Here are the archive sizes for the 5 e-books of the Calgary and Canterbury corpora, all compressed with the “best compression” option:
On my side, I'm not sure I’ll do it soon.
Technically, it is not too difficult: customize zipada.adb (keep the loop for getting the “*.pdf” files, remove the options, etc.).
But there are some unclear parts in that benchmark.
One is the speed/strength tradeoff.
For instance, the C++ version obviously uses a strong and slow compression. So, what is really meant by “default settings”?
Another open point is the compression format: Zip supports multiple compression formats, but some unzippers only know Deflate.
I have asked that on Reddit:
“Does your benchmark require the Deflate compression format or is it OK with other formats (BZip2, LZMA, ZSTD, …) supported by the Zip archive format?”
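For anyone following along, the format question can be made concrete with a small sketch. It uses Python's standard zipfile module rather than Zip-Ada, so nothing here is Zip-Ada's API, and the helper names make_zip and entry_methods are made up for the illustration. It writes the same data into a .zip archive with four of the compression methods that the Zip format defines, then reads back the method code stored for each entry, which is what an unzipper has to understand before it can extract anything.

```python
import io
import zipfile

# Compression methods of the Zip format that Python's zipfile can write.
# The numeric codes are the ones stored in the archive: 0, 8, 12, 14.
METHODS = {
    "store": zipfile.ZIP_STORED,
    "deflate": zipfile.ZIP_DEFLATED,
    "bzip2": zipfile.ZIP_BZIP2,
    "lzma": zipfile.ZIP_LZMA,
}


def make_zip(name: str, payload: bytes, method: int) -> bytes:
    """Build a one-entry .zip archive in memory (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=method) as archive:
        archive.writestr(name, payload)
    return buffer.getvalue()


def entry_methods(zip_bytes: bytes) -> dict:
    """Map each entry name to the compression method code stored for it."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {info.filename: info.compress_type for info in archive.infolist()}


if __name__ == "__main__":
    # Placeholder data; a real benchmark would read the corpus files instead.
    sample = b"The quick brown fox jumps over the lazy dog. " * 2000
    for label, method in METHODS.items():
        data = make_zip("sample.txt", sample, method)
        print(label, len(data), entry_methods(data))
```

An extractor that only implements Deflate can handle the first two archives (method codes 0 and 8) but not the BZip2 or LZMA ones, even though all four are valid .zip files. That is why the benchmark rules matter here.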