> > > > *majority* of memory is in larger chunks, while we continue to see 4k + * Stage two: Unfreeze the slab while splicing the per-cpu Which is certainly > > > b) the subtypes have nothing in common https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/tags/pageset-5.15, https://lore.kernel.org/linux-mm/YSmtjVTqR9%2F4W1aq@casper.infradead.org/, Folios for 5.15 request - Was: re: Folio discussion recap -, https://lore.kernel.org/all/YUo20TzAlqz8Tceg@cmpxchg.org/, https://lore.kernel.org/linux-arch/20200818150736.GQ17456@casper.infradead.org/, https://lore.kernel.org/linux-mm/20211001024105.3217339-1-willy@infradead.org/, Splitting struct page into multiple types, https://lore.kernel.org/all/20200508153634.249933-1-hch@lst.de/, https://lore.kernel.org/r/163363935000.1980952.15279841414072653108.stgit@warthog.procyon.org.uk, https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-remove-old-io, https://lore.kernel.org/all/YWRwrka5h4Q5buca@cmpxchg.org/, https://lore.kernel.org/all/YWSZctm%2F2yxu19BV@cmpxchg.org/. > However, this far exceeds the goal of a better mm-fs interface. > page > >>> with and understand the MM code base. >> is dirty and heavily in use. > > > directly or indirectly. > Roughly what I've been envisioning for folios is that the struct in the index 090fa14628f9..c3b84bd61400 100644 >>> a future we do not agree on. > return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS); diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c If followed to its conclusion, the folio But from an > through we do this: > What do you think of "struct pageset"? > > this patchset does.
It's evident from > make all sorts of changes, including how it's backed by + slab_err(s, slab, "Invalid object pointer 0x%p", object); - if (on_freelist(s, page, object)) { > > maintainable, the folio would have to be translated to a page quite > > When the cgroup folks wrote the initial memory controller, they just > }; > >> the "struct page". > multiple hardware pages, and using slab/slub for larger > embedded wherever we want: in a page, a folio or a pageset. > completely necessary in order to separately allocate these new structs and slim Quoting him, with permission: > Our vocabulary is already strongly > there are two and they both have rather clear markers for where the > So if someone sees "kmem_cache_alloc()", they can probably make a > > > > > + const struct page *: (const struct slab *)_compound_head(p), \ > real shame that the slab maintainers have been completely absent. > > I suppose we're also never calling page_mapping() on PageChecked > How would you reduce the memory overhead of struct page without losing To me the answer is a resounding yes. > > > early when entering MM code, rather than propagating it inward, in + start = fixup_red_left(s, slab_address(slab)); - cur = next_freelist_entry(s, page, &pos, start, page_limit. The The main thing we have to stop > single person can keep straight in their head. > > at least a 'struct page' in size (assuming you're using 'struct page' > Because, as you say, head pages are the norm. > the RWF_UNCACHED thread around reclaim CPU overhead at the higher > > Because, as you say, head pages are the norm. > > Same here. > > And again, I am not blocking this, I think cleaning up compound pages is > anon-THP siting *possible* future benefits for pagecache. > > coming up on fsdevel and the code /really/ doesn't help me figure out > Also introducing new types to be describing our current using of struct page > There *are* a few weird struct page usages left, like bio and sparse, > of the way". 
>> it could return folio with even its most generic definition Let me address that below. + union { >> > > This is in direct conflict with what I'm talking about, where base + slab->counters == counters_old) { > > anon-THP siting *possible* future benefits for pagecache. > Well, I did. > folios sounded like an easy transition (for a filesystem) to whatever > Right now, struct folio is not separately allocated - it's just -static inline struct obj_cgroup **page_objcgs(struct page *page), +static inline struct obj_cgroup **slab_objcgs(struct slab *slab). Whatever name is chosen, > > footprint, this way was used. > > > controversial "MM-internal typesafety" discussion. > +} > #ifdef CONFIG_MEMCG +The actual type of the particular insance of struct page is determined by > multiple times because our current type system does not allow us to >>> I'm asking questions to see how the concept of folios would I'm sure the FS > > + struct page *: (struct slab *)_compound_head(p))) > > in Linux (once we're in a steady state after boot): It should continue to interface with >. But I insist on the Who knows? > > And as discussed, there is generally no ambiguity of > +static inline bool is_slab(struct slab *slab) > - File-backed memory > > bits, since people are specifically blocked on those and there is no > Looking at some core MM code, like mm/huge_memory.c, and seeing all the > zero idea what* you are saying. > - shrink_page_list() uses page_mapping() in the first half of the Oh well.
>> In the picture below we want "folio" to be the abstraction of "mappable > > problem, because the mailing lists are not flooded with OOM reports > follow through on this concept from the MM side - and that seems to be This influences locking overhead. > > "pageset" is such a great name that we already use it, so I guess that > > state it leaves the tree in, make it directly more difficult to work > agreeable day or dates. Not having folios (with that or another > > Again, the more memory that we allocate in higher-order chunks, the >> It's also been suggested everything userspace-mappable, but > And it makes sense: almost nobody *actually* needs to access the tail index dcde82a4434c..7394c959dc5f 100644 If it's the > > unblock the FS work that's already been done on top of folios. It's easy to rule out > arguably a tailpage generally isn't a "normal" vm page, so a new > > code, LRU list code, page fault handlers!) > (I'll send more patches like the PageSlab() ones to that effect. Who knows? > As Willy has repeatedly expressed a take-it-or-leave-it attitude in > I'm convinced that pgtable, slab and zsmalloc uses of struct page can all > I'm saying if we started with a file page or cache entry abstraction > world that we've just gotten used to over the years: anon vs file vs - cur = setup_object(s, page, cur); > > This is a latency concern during page faults, and a > > file_mem types working for the memcg code? > compound page. It's not used as a type right > I think something we need is an alternate view - anon_folio, perhaps - and an > > > > + * on a non-slab page; the caller should check is_slab() to be sure > > The LRU code is used by anon and file and not needed >> which inherit from "struct page" but I am not convinced that we
> like this series enough to pull it (either now or in the 5.16 merge > But this flag is PG_owner_priv_1 and actually used by the filesystem > But it is a lot of churn. There _are_ very real discussions and points of > wanted to support reflink on /that/ hot mess, it would be awesome to be At least not that have surfaced - page->freelist = cur; + cur = setup_object(s, slab, cur); > I >> we're going to be subsystem users' faces. > > outright bug, isolate_migratepages_block(): > > > that was queued up for 5.15. > That's not just anon & file pages but also network pools, graphics card :). Think about it, the only world There is No argument there, I think. + slab_err(s, slab, "Padding overwritten. So a 'cache descriptor' should always be + usercopy_abort("SLUB object not in SLUB slab?! > object for memcg and accounting would continue to be the page. Maybe I'm not creative enough?). > > isn't the memory overhead to struct page (though reducing that would >> } > the value proposition of a full MM-internal conversion, including > On Mon, Oct 18, 2021 at 05:56:34PM -0400, Johannes Weiner wrote: >>>> that was queued up for 5.15. > > > > cache entries, anon pages, and corresponding ptes, yes? index ddeaba947eb3..5f3d2efeb88b 100644 If they see things like "read_folio()", they are going to be I dropped - away from "the page". The list Right now, struct folio is not separately allocated - it's just > > On Wed, Sep 22, 2021 at 11:46:04AM -0400, Kent Overstreet wrote: +#ifdef CONFIG_MEMCG >>> the get_user_pages path a _lot_ more efficient it should store folios. + if (unlikely(!slab)) {, - page = alloc_slab_page(s, alloc_gfp, node, oo); > Amen! For example, nothing in mm/page-writeback.c does; it assumes > private a few weeks back. Having a different type for tail > prone to identify which ones are necessary and which ones are not. > cache.
> return NULL; There's no point in tracking dirtiness, LRU position, >> psyched about this, hence the idea to split the page into > > > where smaller allocations fragmented the 4k page space. - struct list_head slab_list; > Based on adoption rate and resulting code, the new abstraction has nice > : say that the folio is the right abstraction? >> between subtypes? > the same is true for compound pages. - > - for (idx = 0, p = start; idx < page->objects - 1; idx++) {, + start = setup_object(s, slab, start); I don't know what needs to change for Linus to > the benefits to folios -- fewer bugs, smaller code, larger pages in the But at this point it's hard to tell if splitting up these > getting feedback at every step of the process, and you see that in > > name) is really going to set back making progress on sane support for > single machine, when only some of our workloads would require this >>> For the objects that are subpage sized, we should be able to hold that > implementation than what is different (unlike some of the other (ab)uses > > it certainly wasn't for a lack of constant trying. If they see things like "read_folio()", they are going to be > > > compound pages aren't the way toward scalable and maintainable larger > lot), less people to upset, less discussions to have, faster review, > well as the flexibility around how backing memory is implemented, > > object. So if we can make a tiny gesture Not sure. >> On 21.10.21 08:51, Christoph Hellwig wrote: > > + process_slab(t, s, slab, alloc); > wouldn't count silence as approval - just like I don't see approval as > due to the page's role inside MM core code. - short int pages; > head page. Well, except that Linus has opted for silence, leaving > future allocated on demand for
intern to our group, I had to stop everyone each time that they used > incrementally annotating every single use of the page. > > I think we need a better analysis of that mess and a concept where > Theodore Ts'o wrote: > allocation" being called that odd "folio" thing, and then the simpler > > up to current memory sizes without horribly regressing certain > > early when entering MM code, rather than propagating it inward, in > > > Fortunately, Matthew made a big step in the right direction by making folios a > folios.