> the point is to save people / well-being, not reduce it
Oh, you haven't met _that_ part of the climate people. A surprising number of them do want to reduce the number of people and they see "degrowth" as the solution.
I can see how it appears that way, but ultimately that's nobody's goal, of course. It might be worth actually talking to someone you feel is in that group and realizing that they have the same morals and end goals as you and me; they just see a different path to get there.
Many would actually say we should reduce well-being, if you want to take it literally, but specifically that of the richest 10% of people or so, so that everyone can have an equal lifestyle that the earth can sustain. The argument is that it's not fair if 90% need to live far below that common standard so that the rich can be rich. That is something you can agree or disagree with (most of us here are in that top 10%; I certainly am), but I expect you'd find that 99% of "them" don't hold an unreasonable stance when you hear them out.
When you use a FOSS product more, the person that wrote the code doesn't end up spending more money. When you use a free service more, someone is paying for that usage and resources.
My Hacker News items table in ClickHouse has 47,428,860 items, and it's 5.82 GB compressed and 18.18 GB uncompressed. What makes Parquet compression worse here, when both formats are columnar?
Sorting, compression algorithm and level, and data types can all have an impact. I noted elsewhere that a Boolean is getting represented as an integer. That’s one bit vs. 1-4 bytes.
There is also flexibility in what you define as the dataset. Skinnier, more focused tables could save space compared with a wide table that covers everything, which will probably break up compressible runs of data.
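To make the sorting and data-type points concrete, here is a rough sketch using only Python's standard library. It is not Parquet or ClickHouse: `zlib` stands in for a real codec, the columns are synthetic random data, and the names (`pack_int32`, `values`, `flags`) are made up for illustration. Real columnar formats also apply dictionary and run-length encodings, so the actual numbers will differ.

```python
import random
import struct
import zlib

random.seed(0)  # fixed seed so repeated runs produce the same data
N = 100_000

def pack_int32(col):
    """Serialize a column as little-endian 4-byte integers."""
    return struct.pack(f"<{len(col)}i", *col)

# Hypothetical low-cardinality column (think of an item "type" id, 0..9).
values = [random.randrange(10) for _ in range(N)]

# Same data, same codec, same level: only the row order changes.
unsorted_size = len(zlib.compress(pack_int32(values), 6))
sorted_size = len(zlib.compress(pack_int32(sorted(values)), 6))

# A Boolean column stored as 4-byte integers vs. packed one bit per value.
flags = [random.random() < 0.5 for _ in range(N)]
as_int32 = pack_int32([int(f) for f in flags])
as_bits = bytearray((N + 7) // 8)
for i, flag in enumerate(flags):
    if flag:
        as_bits[i // 8] |= 1 << (i % 8)

print("int column, unsorted, compressed:", unsorted_size, "bytes")
print("int column, sorted, compressed:  ", sorted_size, "bytes")
print("bool as int32, raw:", len(as_int32), "bytes")
print("bool as bits, raw: ", len(as_bits), "bytes")
```

Sorting turns the column into ten long runs, which a general-purpose compressor handles far better than shuffled values. The Boolean comparison is shown on raw sizes; a compressor narrows that gap, but it still has to work harder on the wider representation.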
Plus, Parquet isn't the least wasteful format; native DuckDB, for instance, compacts better. That's not just down to the compression algorithm, for which, as you say, Parquet has three main options.
You could download the data and run that analysis yourself. I’d be interested to see it, especially your method of identifying “political shit-slop” and “AI” and the relationship to COVID. Sounds like an interesting project.