
I'm a little confused about how the object base is looked up in these systems, whether they're sparse or dense, whether they have any size or total-object-count limitations, and whether that ends up hitting the same limits on total count that forced page tables into the current multi-level approach.

Surely you could consider a page table as effectively implementing a fixed-size "object cache"? It is just a lookup for an offset into physical memory, after all, with the "object ID" just being the masked first part of the address. And if the objects are variable-sized, is it possible to end up with physical address fragmentation as objects of different sizes are allocated and freed?
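To make the "page table as object cache" framing concrete, here's a minimal sketch (the mapping values are made up) of a 4 KiB page lookup, where the "object ID" is nothing more than the masked upper bits of the virtual address:

```python
# Sketch: a page-table lookup viewed as a fixed-size "object cache"
# keyed by the masked upper bits of the address. Values are illustrative.
PAGE_SHIFT = 12                       # 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1

# Hypothetical mapping: virtual page number -> physical page base
page_table = {0x7F123: 0x40000 << PAGE_SHIFT}

def translate(vaddr):
    vpn = vaddr >> PAGE_SHIFT         # the "object ID": masked top bits
    offset = vaddr & PAGE_MASK        # offset within the "object"
    return page_table[vpn] | offset

print(hex(translate((0x7F123 << PAGE_SHIFT) | 0x42)))   # -> 0x40000042
```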

The claim of single-cycle lookups today would require an on-chip, fixed-size (and small!) fast SRAM, as there's a pretty hard limit on how much memory you can read in a single clock cycle, no matter how fancy or simple the logic deciding what to look up. If we call this area the "TLB", haven't we arrived back at page tables again?

And for the size of the SRAM holding the TLB/object-cache entries - increasing the amount of data stored per entry means you have fewer entries in total. A current x86_64 CPU supports 2^48 of physical address space, reduced to 36 bits if you know it's 4k-aligned, and 2^57 of virtual address space as the tag, again reduced to 45 bits if we know it's 4k-aligned. That means storing the tag and physical address needs 81 bits of SRAM in total. A 64-bit object ID plus a 64-bit physical address plus a 64-bit size is 192 bits, over 2x that, so you could pack 2x the number of conventional TLB entries into the same SRAM block. To match the capabilities of the example above: 57 bits of physical address (which cannot be reduced, as arbitrary sizes mean it's not aligned), plus a similarly reduced 48-bit object ID and 48-bit size, still adds up to 153 bits - only slightly less than 2x. I'm sure people could argue that reducing the capabilities here has merit; I don't know how many objects, or what maximum object size, such a system would need. And that's the "worst case" of 4k pages for the page-table system too.
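The entry-size arithmetic above can be checked in a few lines (same assumptions as the comment: x86_64-style 57-bit virtual / 48-bit physical addresses, 4 KiB-aligned pages):

```python
# Back-of-the-envelope SRAM cost per translation entry.
PAGE_BITS = 12                         # 4 KiB alignment saves 12 bits each side

pte_tag   = 57 - PAGE_BITS             # 45-bit virtual tag
pte_phys  = 48 - PAGE_BITS             # 36-bit physical frame number
pte_entry = pte_tag + pte_phys         # 81 bits per conventional TLB entry

# Object entry: unaligned 57-bit physical base + 48-bit object ID + 48-bit size
obj_entry = 57 + 48 + 48               # 153 bits

print(pte_entry, obj_entry, round(obj_entry / pte_entry, 2))
```

So an object entry costs just under 2x the SRAM of a conventional TLB entry even after trimming the fields.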

I can't see how this idea could be implemented without extreme limitations: look at the TLB sizes of modern processors, and that's the maximum number of objects you could have while meeting the claims of speed and simplicity. There may be some advantage in making the entries flexible in size rather than fixed-size, but then you run into the same fragmentation issues, and need to keep that size somewhere in the extremely tight TLB memory.



So I commented on this a bit elsewhere, but the whole object business is irrelevant to how the address translation hardware in this machine actually works. While the subfields of the address are exploited to optimize the hash function used, the hardware is otherwise agnostic to what the upper bits of the address mean. The TLB is just huge relative to the amount of memory the machine had, such that there's one entry for each physical page in the system, and it deals with collisions in the TLB by evicting pages to disk.
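A rough sketch of that scheme, as I understand it - one slot per physical frame, indexed by a hash of the virtual page number, with a collision meaning the resident page gets evicted (the hash here is a placeholder; the real machine exploited the address subfields):

```python
# Hashed/inverted translation sketch: one slot per physical frame.
NFRAMES = 8
table = [None] * NFRAMES              # slot index == frame number; holds VPN tag

def hash_vpn(vpn):
    return vpn % NFRAMES              # placeholder hash

def lookup(vpn):
    slot = hash_vpn(vpn)
    # A mismatch is a miss: in the machine described above, the page
    # currently in that slot would be evicted to disk to resolve it.
    return slot if table[slot] == vpn else None

table[hash_vpn(0x123)] = 0x123        # install a mapping
print(lookup(0x123), lookup(0x124))   # -> 3 None
```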


> Surely you could consider a page table as effectively implementing a fixed-size "object cache"? It is just a lookup for an offset into physical memory, after all, with the "object ID" just being the masked first part of the address. And if the objects are variable-sized, is it possible to end up with physical address fragmentation as objects of different sizes are allocated and freed?

Because that's only a base, not a limit. With the right pointer arithmetic you can spill over into any other object's memory.


> with the "object ID" just being the masked first part of the address?

Doesn't that imply the minimum-sized object requires 4K of physical RAM?

Is that a problem?


Maybe? If you just round each "object" up to 4k, then you can implement this using the current PTEs on x86_64, but that removes the (supposed) advantage of needing only a single PTE per object (or "object cache" lookup entry, or whatever you want to call it) in the cases where an object spans multiple pages' worth of data.

Arbitrarily sized objects would likely be possible in hardware - it's just an extra size field stored in the PTE, if you can mask the object ID out of the address (in the example in the original post it's a whole 64-bit object ID, allowing a full 64 bits of offset within each object, but totaling a HUGE 128-bit effective address).
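Something like this, purely illustrative (the IDs, bases, and the bounds-check behavior are my assumptions, not from the original post): the effective address is an (object ID, offset) pair, and the translation entry carries a size so the offset can be bounds-checked:

```python
# Hypothetical (object ID, offset) translation with a size field per entry.
objects = {0xCAFE: (0x1000_0000, 4096)}   # object ID -> (physical base, byte size)

def translate_obj(obj_id, offset):
    base, size = objects[obj_id]
    if offset >= size:
        # Assumed behavior: out-of-bounds offsets fault rather than
        # spilling into a neighboring object's memory.
        raise MemoryError("offset outside object bounds")
    return base + offset

print(hex(translate_obj(0xCAFE, 0x10)))   # -> 0x10000010
```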

But arbitrary sizes feel like they push the issues many current userspace allocators have to deal with today down into hardware/microcode - namely, packing to cope with fragmentation and the like (only instead of virtual address space, they'd have to deal with physical address space). The solutions to this today are certainly non-trivial and can still fail in many ways - far from solved, let alone solved in a way simple enough to implement that close to the hardware.





