Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- Documentation/filesystems/caching/netfs-api.rst | 7
-rw-r--r-- Documentation/filesystems/cifs/ksmbd.rst | 4
-rw-r--r-- Documentation/filesystems/dax.rst | 6
-rw-r--r-- Documentation/filesystems/erofs.rst | 2
-rw-r--r-- Documentation/filesystems/ext4/blocks.rst | 2
-rw-r--r-- Documentation/filesystems/fscrypt.rst | 25
-rw-r--r-- Documentation/filesystems/fsverity.rst | 6
-rw-r--r-- Documentation/filesystems/locking.rst | 54
-rw-r--r-- Documentation/filesystems/netfs_library.rst | 140
-rw-r--r-- Documentation/filesystems/porting.rst | 6
-rw-r--r-- Documentation/filesystems/vfs.rst | 73
11 files changed, 192 insertions, 133 deletions
diff --git a/Documentation/filesystems/caching/netfs-api.rst b/Documentation/filesystems/caching/netfs-api.rst
index f84e9ffdf0b4..5066113acad5 100644
--- a/Documentation/filesystems/caching/netfs-api.rst
+++ b/Documentation/filesystems/caching/netfs-api.rst
@@ -345,8 +345,9 @@ The following facilities are provided to manage this:
To support this, the following functions are provided::
- int fscache_set_page_dirty(struct page *page,
- struct fscache_cookie *cookie);
+ bool fscache_dirty_folio(struct address_space *mapping,
+ struct folio *folio,
+ struct fscache_cookie *cookie);
void fscache_unpin_writeback(struct writeback_control *wbc,
struct fscache_cookie *cookie);
void fscache_clear_inode_writeback(struct fscache_cookie *cookie,
@@ -354,7 +355,7 @@ To support this, the following functions are provided::
const void *aux);
The *set* function is intended to be called from the filesystem's
-``set_page_dirty`` address space operation. If ``I_PINNING_FSCACHE_WB`` is not
+``dirty_folio`` address space operation. If ``I_PINNING_FSCACHE_WB`` is not
set, it sets that flag and increments the use count on the cookie (the caller
must already have called ``fscache_use_cookie()``).
diff --git a/Documentation/filesystems/cifs/ksmbd.rst b/Documentation/filesystems/cifs/ksmbd.rst
index b0d354fd8066..1af600db2e70 100644
--- a/Documentation/filesystems/cifs/ksmbd.rst
+++ b/Documentation/filesystems/cifs/ksmbd.rst
@@ -82,10 +82,10 @@ Signing Update Supported.
Pre-authentication integrity Supported.
SMB3 encryption(CCM, GCM) Supported. (CCM and GCM128 supported, GCM256 in
progress)
-SMB direct(RDMA) Partially Supported. SMB3 Multi-channel is
- required to connect to Windows client.
+SMB direct(RDMA) Supported.
SMB3 Multi-channel Partially Supported. Planned to implement
replay/retry mechanisms for future.
+Receive Side Scaling mode Supported.
SMB3.1.1 POSIX extension Supported.
ACLs Partially Supported. only DACLs available, SACLs
(auditing) is planned for the future. For
diff --git a/Documentation/filesystems/dax.rst b/Documentation/filesystems/dax.rst
index e3b30429d703..c04609d8ee24 100644
--- a/Documentation/filesystems/dax.rst
+++ b/Documentation/filesystems/dax.rst
@@ -23,11 +23,11 @@ on it as usual. The `DAX` code currently only supports files with a block
size equal to your kernel's `PAGE_SIZE`, so you may need to specify a block
size when creating the filesystem.
-Currently 4 filesystems support `DAX`: ext2, ext4, xfs and virtiofs.
+Currently 5 filesystems support `DAX`: ext2, ext4, xfs, virtiofs and erofs.
Enabling `DAX` on them is different.
-Enabling DAX on ext2
---------------------
+Enabling DAX on ext2 and erofs
+------------------------------
When mounting the filesystem, use the ``-o dax`` option on the command line or
add 'dax' to the options in ``/etc/fstab``. This works to enable `DAX` on all files
diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
index 7119aa213be7..bef6d3040ce4 100644
--- a/Documentation/filesystems/erofs.rst
+++ b/Documentation/filesystems/erofs.rst
@@ -40,7 +40,7 @@ Here is the main features of EROFS:
Inode metadata size 32 bytes 64 bytes
Max file size 4 GB 16 EB (also limited by max. vol size)
Max uids/gids 65536 4294967296
- File change time no yes (64 + 32-bit timestamp)
+ Per-inode timestamp no yes (64 + 32-bit timestamp)
Max hardlinks 65536 4294967296
Metadata reserved 4 bytes 14 bytes
===================== ============ =====================================
diff --git a/Documentation/filesystems/ext4/blocks.rst b/Documentation/filesystems/ext4/blocks.rst
index bd722ecd92d6..b0f80ea87c90 100644
--- a/Documentation/filesystems/ext4/blocks.rst
+++ b/Documentation/filesystems/ext4/blocks.rst
@@ -39,7 +39,7 @@ For 32-bit filesystems, limits are as follows:
- 4TiB
- 8TiB
- 16TiB
- - 256PiB
+ - 256TiB
* - Blocks Per Block Group
- 8,192
- 16,384
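Note on the ext4/blocks.rst hunk above: the corrected value can be checked by hand. A 32-bit filesystem addresses at most 2^32 blocks, so its maximum size is 2^32 times the block size. The sketch below works through that arithmetic. It assumes the table columns correspond to block sizes of 1KiB, 2KiB, 4KiB and 64KiB (the column headers are not visible in this hunk), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest filesystem size, in TiB, when block
 * numbers are 32 bits wide and each block is block_size bytes.
 */
static uint64_t max_fs_size_tib(uint64_t block_size)
{
	const uint64_t max_blocks = UINT64_C(1) << 32;	/* 2^32 blocks */
	const uint64_t tib = UINT64_C(1) << 40;		/* bytes per TiB */

	return (max_blocks * block_size) / tib;
}

/* Re-derive the row shown in the hunk (block sizes are an assumption). */
static void check_size_row(void)
{
	assert(max_fs_size_tib(1024) == 4);	/* 1KiB blocks:  4TiB */
	assert(max_fs_size_tib(2048) == 8);	/* 2KiB blocks:  8TiB */
	assert(max_fs_size_tib(4096) == 16);	/* 4KiB blocks:  16TiB */
	assert(max_fs_size_tib(65536) == 256);	/* 64KiB blocks: 256TiB */
}
```

With 64KiB blocks the product is 2^32 * 2^16 = 2^48 bytes, which is 256TiB, consistent with the value on the added line.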
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 4d5d50dca65c..6ccd5efb25b7 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -1047,8 +1047,8 @@ astute users may notice some differences in behavior:
may be used to overwrite the source files but isn't guaranteed to be
effective on all filesystems and storage devices.
-- Direct I/O is not supported on encrypted files. Attempts to use
- direct I/O on such files will fall back to buffered I/O.
+- Direct I/O is supported on encrypted files only under some
+ circumstances. For details, see `Direct I/O support`_.
- The fallocate operations FALLOC_FL_COLLAPSE_RANGE and
FALLOC_FL_INSERT_RANGE are not supported on encrypted files and will
@@ -1179,6 +1179,27 @@ Inline encryption doesn't affect the ciphertext or other aspects of
the on-disk format, so users may freely switch back and forth between
using "inlinecrypt" and not using "inlinecrypt".
+Direct I/O support
+==================
+
+For direct I/O on an encrypted file to work, the following conditions
+must be met (in addition to the conditions for direct I/O on an
+unencrypted file):
+
+* The file must be using inline encryption. Usually this means that
+ the filesystem must be mounted with ``-o inlinecrypt`` and inline
+ encryption hardware must be present. However, a software fallback
+ is also available. For details, see `Inline encryption support`_.
+
+* The I/O request must be fully aligned to the filesystem block size.
+ This means that the file position the I/O is targeting, the lengths
+ of all I/O segments, and the memory addresses of all I/O buffers
+ must be multiples of this value. Note that the filesystem block
+ size may be greater than the logical block size of the block device.
+
+If either of the above conditions is not met, then direct I/O on the
+encrypted file will fall back to buffered I/O.
+
Implementation details
======================
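Note on the fscrypt.rst hunk above: the alignment condition in the new "Direct I/O support" section can be illustrated from user space. The following is a minimal sketch, not a kernel API. The function name is hypothetical, and the caller is assumed to already know the filesystem block size. It checks that the file position, every segment length and every buffer address are multiples of that block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical user-space check: returns true if a vectored request
 * satisfies the alignment condition described in the text.
 */
static bool dio_request_is_aligned(off_t pos, const struct iovec *iov,
				   int iovcnt, size_t fs_block_size)
{
	if (pos < 0 || fs_block_size == 0)
		return false;
	if ((uint64_t)pos % fs_block_size != 0)
		return false;			/* file position misaligned */

	for (int i = 0; i < iovcnt; i++) {
		if ((uintptr_t)iov[i].iov_base % fs_block_size != 0)
			return false;		/* buffer address misaligned */
		if (iov[i].iov_len % fs_block_size != 0)
			return false;		/* segment length misaligned */
	}
	return true;
}

/* Small self-check using made-up addresses that are never dereferenced. */
static void check_alignment_examples(void)
{
	struct iovec good = { .iov_base = (void *)(uintptr_t)8192, .iov_len = 4096 };
	struct iovec short_len = { .iov_base = (void *)(uintptr_t)8192, .iov_len = 512 };

	assert(dio_request_is_aligned(0, &good, 1, 4096));
	assert(!dio_request_is_aligned(512, &good, 1, 4096));
	assert(!dio_request_is_aligned(0, &short_len, 1, 4096));
}
```

A request that is aligned only to a 512-byte logical block size fails this check when the filesystem block size is 4096. Per the text, that is the case in which direct I/O on an encrypted file falls back to buffered I/O.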
diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index 1d831e3cbcb3..8cc536d08f51 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -549,7 +549,7 @@ Pagecache
~~~~~~~~~
For filesystems using Linux's pagecache, the ``->readpage()`` and
-``->readpages()`` methods must be modified to verify pages before they
+``->readahead()`` methods must be modified to verify pages before they
are marked Uptodate. Merely hooking ``->read_iter()`` would be
insufficient, since ``->read_iter()`` is not used for memory maps.
@@ -611,7 +611,7 @@ workqueue, and then the workqueue work does the decryption or
verification. Finally, pages where no decryption or verity error
occurred are marked Uptodate, and the pages are unlocked.
-Files on ext4 and f2fs may contain holes. Normally, ``->readpages()``
+Files on ext4 and f2fs may contain holes. Normally, ``->readahead()``
simply zeroes holes and sets the corresponding pages Uptodate; no bios
are issued. To prevent this case from bypassing fs-verity, these
filesystems use fsverity_verify_page() to verify hole pages.
@@ -778,7 +778,7 @@ weren't already directly answered in other parts of this document.
- To prevent bypassing verification, pages must not be marked
Uptodate until they've been verified. Currently, each
filesystem is responsible for marking pages Uptodate via
- ``->readpages()``. Therefore, currently it's not possible for
+ ``->readahead()``. Therefore, currently it's not possible for
the VFS to do the verification on its own. Changing this would
require significant changes to the VFS and all filesystems.
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 3f9b1497ebb8..c26d854275a0 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -239,10 +239,8 @@ prototypes::
int (*writepage)(struct page *page, struct writeback_control *wbc);
int (*readpage)(struct file *, struct page *);
int (*writepages)(struct address_space *, struct writeback_control *);
- int (*set_page_dirty)(struct page *page);
+ bool (*dirty_folio)(struct address_space *, struct folio *folio);
void (*readahead)(struct readahead_control *);
- int (*readpages)(struct file *filp, struct address_space *mapping,
- struct list_head *pages, unsigned nr_pages);
int (*write_begin)(struct file *, struct address_space *mapping,
loff_t pos, unsigned len, unsigned flags,
struct page **pagep, void **fsdata);
@@ -250,21 +248,21 @@ prototypes::
loff_t pos, unsigned len, unsigned copied,
struct page *page, void *fsdata);
sector_t (*bmap)(struct address_space *, sector_t);
- void (*invalidatepage) (struct page *, unsigned int, unsigned int);
+ void (*invalidate_folio) (struct folio *, size_t start, size_t len);
int (*releasepage) (struct page *, int);
void (*freepage)(struct page *);
int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
bool (*isolate_page) (struct page *, isolate_mode_t);
int (*migratepage)(struct address_space *, struct page *, struct page *);
void (*putback_page) (struct page *);
- int (*launder_page)(struct page *);
- int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
+ int (*launder_folio)(struct folio *);
+ bool (*is_partially_uptodate)(struct folio *, size_t from, size_t count);
int (*error_remove_page)(struct address_space *, struct page *);
int (*swap_activate)(struct file *);
int (*swap_deactivate)(struct file *);
locking rules:
- All except set_page_dirty and freepage may block
+ All except dirty_folio and freepage may block
====================== ======================== ========= ===============
ops PageLocked(page) i_rwsem invalidate_lock
@@ -272,20 +270,19 @@ ops PageLocked(page) i_rwsem invalidate_lock
writepage: yes, unlocks (see below)
readpage: yes, unlocks shared
writepages:
-set_page_dirty no
+dirty_folio maybe
readahead: yes, unlocks shared
-readpages: no shared
write_begin: locks the page exclusive
write_end: yes, unlocks exclusive
bmap:
-invalidatepage: yes exclusive
+invalidate_folio: yes exclusive
releasepage: yes
freepage: yes
direct_IO:
isolate_page: yes
migratepage: yes (both)
putback_page: yes
-launder_page: yes
+launder_folio: yes
is_partially_uptodate: yes
error_remove_page: yes
swap_activate: no
@@ -300,9 +297,6 @@ completion.
->readahead() unlocks the pages that I/O is attempted on like ->readpage().
-->readpages() populates the pagecache with the passed pages and starts
-I/O against them. They come unlocked upon I/O completion.
-
->writepage() is used for two purposes: for "memory cleansing" and for
"sync". These are quite different operations and the behaviour may differ
depending upon the mode.
@@ -361,22 +355,22 @@ If nr_to_write is NULL, all dirty pages must be written.
writepages should _only_ write pages which are present on
mapping->io_pages.
-->set_page_dirty() is called from various places in the kernel
-when the target page is marked as needing writeback. It may be called
-under spinlock (it cannot block) and is sometimes called with the page
-not locked.
+->dirty_folio() is called from various places in the kernel when
+the target folio is marked as needing writeback. The folio cannot be
+truncated because either the caller holds the folio lock, or the caller
+has found the folio while holding the page table lock which will block
+truncation.
->bmap() is currently used by legacy ioctl() (FIBMAP) provided by some
filesystems and by the swapper. The latter will eventually go away. Please,
keep it that way and don't breed new callers.
-->invalidatepage() is called when the filesystem must attempt to drop
+->invalidate_folio() is called when the filesystem must attempt to drop
some or all of the buffers from the page when it is being truncated. It
-returns zero on success. If ->invalidatepage is zero, the kernel uses
-block_invalidatepage() instead. The filesystem must exclusively acquire
-invalidate_lock before invalidating page cache in truncate / hole punch path
-(and thus calling into ->invalidatepage) to block races between page cache
-invalidation and page cache filling functions (fault, read, ...).
+returns zero on success. The filesystem must exclusively acquire
+invalidate_lock before invalidating page cache in truncate / hole punch
+path (and thus calling into ->invalidate_folio) to block races between page
+cache invalidation and page cache filling functions (fault, read, ...).
->releasepage() is called when the kernel is about to try to drop the
buffers from the page in preparation for freeing it. It returns zero to
@@ -386,9 +380,9 @@ the kernel assumes that the fs has no private interest in the buffers.
->freepage() is called when the kernel is done dropping the page
from the page cache.
-->launder_page() may be called prior to releasing a page if
-it is still found to be dirty. It returns zero if the page was successfully
-cleaned, or an error value if not. Note that in order to prevent the page
+->launder_folio() may be called prior to releasing a folio if
+it is still found to be dirty. It returns zero if the folio was successfully
+cleaned, or an error value if not. Note that in order to prevent the folio
getting mapped back in and redirtied, it needs to be kept locked
across the entire operation.
@@ -438,13 +432,13 @@ prototypes::
locking rules:
====================== ============= ================= =========
-ops inode->i_lock blocked_lock_lock may block
+ops flc_lock blocked_lock_lock may block
====================== ============= ================= =========
-lm_notify: yes yes no
+lm_notify: no yes no
lm_grant: no no no
lm_break: yes no no
lm_change yes no no
-lm_breaker_owns_lease: no no no
+lm_breaker_owns_lease: yes no no
====================== ============= ================= =========
buffer_head
diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
index 4f373a8ec47b..69f00179fdfe 100644
--- a/Documentation/filesystems/netfs_library.rst
+++ b/Documentation/filesystems/netfs_library.rst
@@ -7,6 +7,8 @@ Network Filesystem Helper Library
.. Contents:
- Overview.
+ - Per-inode context.
+ - Inode context helper functions.
- Buffered read helpers.
- Read helper functions.
- Read helper structures.
@@ -28,6 +30,69 @@ Note that the library module doesn't link against local caching directly, so
access must be provided by the netfs.
+Per-Inode Context
+=================
+
+The network filesystem helper library needs a place to store a bit of state for
+its use on each netfs inode it is helping to manage. To this end, a context
+structure is defined::
+
+ struct netfs_i_context {
+ const struct netfs_request_ops *ops;
+ struct fscache_cookie *cache;
+ };
+
+A network filesystem that wants to use netfs lib must place one of these
+directly after the VFS ``struct inode`` it allocates, usually as part of its
+own struct. This can be done in a way similar to the following::
+
+ struct my_inode {
+ struct {
+ /* These must be contiguous */
+ struct inode vfs_inode;
+ struct netfs_i_context netfs_ctx;
+ };
+ ...
+ };
+
+This allows netfslib to find its state by simple offset from the inode pointer,
+thereby allowing the netfslib helper functions to be pointed to directly by the
+VFS/VM operation tables.
+
+The structure contains the following fields:
+
+ * ``ops``
+
+ The set of operations provided by the network filesystem to netfslib.
+
+ * ``cache``
+
+ Local caching cookie, or NULL if no caching is enabled. This field does not
+ exist if fscache is disabled.
+
+
+Inode Context Helper Functions
+------------------------------
+
+To help deal with the per-inode context, a number of helper functions are
+provided. Firstly, a function to perform basic initialisation on a context and
+set the operations table pointer::
+
+ void netfs_i_context_init(struct inode *inode,
+ const struct netfs_request_ops *ops);
+
+then two functions to cast between the VFS inode structure and the netfs
+context::
+
+ struct netfs_i_context *netfs_i_context(struct inode *inode);
+ struct inode *netfs_inode(struct netfs_i_context *ctx);
+
+and finally, a function to get the cache cookie pointer from the context
+attached to an inode (or NULL if fscache is disabled)::
+
+ struct fscache_cookie *netfs_i_cookie(struct inode *inode);
+
+
Buffered Read Helpers
=====================
@@ -70,38 +135,22 @@ Read Helper Functions
Three read helpers are provided::
- void netfs_readahead(struct readahead_control *ractl,
- const struct netfs_read_request_ops *ops,
- void *netfs_priv);
+ void netfs_readahead(struct readahead_control *ractl);
int netfs_readpage(struct file *file,
- struct folio *folio,
- const struct netfs_read_request_ops *ops,
- void *netfs_priv);
+ struct page *page);
int netfs_write_begin(struct file *file,
struct address_space *mapping,
loff_t pos,
unsigned int len,
unsigned int flags,
struct folio **_folio,
- void **_fsdata,
- const struct netfs_read_request_ops *ops,
- void *netfs_priv);
-
-Each corresponds to a VM operation, with the addition of a couple of parameters
-for the use of the read helpers:
+ void **_fsdata);
- * ``ops``
-
- A table of operations through which the helpers can talk to the filesystem.
-
- * ``netfs_priv``
+Each corresponds to a VM address space operation. These operations use the
+state in the per-inode context.
- Filesystem private data (can be NULL).
-
-Both of these values will be stored into the read request structure.
-
-For ->readahead() and ->readpage(), the network filesystem should just jump
-into the corresponding read helper; whereas for ->write_begin(), it may be a
+For ->readahead() and ->readpage(), the network filesystem should just point directly
+at the corresponding read helper; whereas for ->write_begin(), it may be a
little more complicated as the network filesystem might want to flush
conflicting writes or track dirty data and needs to put the acquired folio if
an error occurs after calling the helper.
@@ -116,7 +165,7 @@ occurs, the request will get partially completed if sufficient data is read.
Additionally, there is::
- * void netfs_subreq_terminated(struct netfs_read_subrequest *subreq,
+ * void netfs_subreq_terminated(struct netfs_io_subrequest *subreq,
ssize_t transferred_or_error,
bool was_async);
@@ -132,7 +181,7 @@ Read Helper Structures
The read helpers make use of a couple of structures to maintain the state of
the read. The first is a structure that manages a read request as a whole::
- struct netfs_read_request {
+ struct netfs_io_request {
struct inode *inode;
struct address_space *mapping;
struct netfs_cache_resources cache_resources;
@@ -140,7 +189,7 @@ the read. The first is a structure that manages a read request as a whole::
loff_t start;
size_t len;
loff_t i_size;
- const struct netfs_read_request_ops *netfs_ops;
+ const struct netfs_request_ops *netfs_ops;
unsigned int debug_id;
...
};
@@ -187,8 +236,8 @@ The above fields are the ones the netfs can use. They are:
The second structure is used to manage individual slices of the overall read
request::
- struct netfs_read_subrequest {
- struct netfs_read_request *rreq;
+ struct netfs_io_subrequest {
+ struct netfs_io_request *rreq;
loff_t start;
size_t len;
size_t transferred;
@@ -244,32 +293,26 @@ Read Helper Operations
The network filesystem must provide the read helpers with a table of operations
through which it can issue requests and negotiate::
- struct netfs_read_request_ops {
- void (*init_rreq)(struct netfs_read_request *rreq, struct file *file);
- bool (*is_cache_enabled)(struct inode *inode);
- int (*begin_cache_operation)(struct netfs_read_request *rreq);
- void (*expand_readahead)(struct netfs_read_request *rreq);
- bool (*clamp_length)(struct netfs_read_subrequest *subreq);
- void (*issue_op)(struct netfs_read_subrequest *subreq);
- bool (*is_still_valid)(struct netfs_read_request *rreq);
+ struct netfs_request_ops {
+ void (*init_request)(struct netfs_io_request *rreq, struct file *file);
+ int (*begin_cache_operation)(struct netfs_io_request *rreq);
+ void (*expand_readahead)(struct netfs_io_request *rreq);
+ bool (*clamp_length)(struct netfs_io_subrequest *subreq);
+ void (*issue_read)(struct netfs_io_subrequest *subreq);
+ bool (*is_still_valid)(struct netfs_io_request *rreq);
int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
struct folio *folio, void **_fsdata);
- void (*done)(struct netfs_read_request *rreq);
+ void (*done)(struct netfs_io_request *rreq);
void (*cleanup)(struct address_space *mapping, void *netfs_priv);
};
The operations are as follows:
- * ``init_rreq()``
+ * ``init_request()``
[Optional] This is called to initialise the request structure. It is given
the file for reference and can modify the ->netfs_priv value.
- * ``is_cache_enabled()``
-
- [Required] This is called by netfs_write_begin() to ask if the file is being
- cached. It should return true if it is being cached and false otherwise.
-
* ``begin_cache_operation()``
[Optional] This is called to ask the network filesystem to call into the
@@ -305,7 +348,7 @@ The operations are as follows:
This should return 0 on success and an error code on error.
- * ``issue_op()``
+ * ``issue_read()``
[Required] The helpers use this to dispatch a subrequest to the server for
reading. In the subrequest, ->start, ->len and ->transferred indicate what
@@ -420,12 +463,12 @@ The network filesystem's ->begin_cache_operation() method is called to set up a
cache and this must call into the cache to do the work. If using fscache, for
example, the cache would call::
- int fscache_begin_read_operation(struct netfs_read_request *rreq,
+ int fscache_begin_read_operation(struct netfs_io_request *rreq,
struct fscache_cookie *cookie);
passing in the request pointer and the cookie corresponding to the file.
-The netfs_read_request object contains a place for the cache to hang its
+The netfs_io_request object contains a place for the cache to hang its
state::
struct netfs_cache_resources {
@@ -443,7 +486,7 @@ operation table looks like the following::
void (*expand_readahead)(struct netfs_cache_resources *cres,
loff_t *_start, size_t *_len, loff_t i_size);
- enum netfs_read_source (*prepare_read)(struct netfs_read_subrequest *subreq,
+ enum netfs_io_source (*prepare_read)(struct netfs_io_subrequest *subreq,
loff_t i_size);
int (*read)(struct netfs_cache_resources *cres,
@@ -562,4 +605,5 @@ API Function Reference
======================
.. kernel-doc:: include/linux/netfs.h
-.. kernel-doc:: fs/netfs/read_helper.c
+.. kernel-doc:: fs/netfs/buffered_read.c
+.. kernel-doc:: fs/netfs/io.c
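Note on the netfs_library.rst hunk above: the "simple offset from the inode pointer" idea can be sketched outside the kernel. The types below are stand-ins with placeholder fields, and the `sketch_` functions are hypothetical. They mirror the behaviour the text describes for netfs_i_context() and netfs_inode(), and are not the kernel's source.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types; the fields are placeholders. */
struct inode { unsigned long i_ino; };
struct netfs_request_ops;
struct netfs_i_context { const struct netfs_request_ops *ops; };

struct my_inode {
	struct {
		/* These must be contiguous */
		struct inode vfs_inode;
		struct netfs_i_context netfs_ctx;
	};
	int my_private_state;
};

/* The offset trick only works if no padding separates the two members. */
_Static_assert(offsetof(struct my_inode, netfs_ctx) == sizeof(struct inode),
	       "netfs_ctx must sit directly after vfs_inode");

/* Find the context by stepping over the inode it follows. */
static struct netfs_i_context *sketch_netfs_i_context(struct inode *inode)
{
	return (struct netfs_i_context *)((char *)inode + sizeof(*inode));
}

/* Inverse: step back from the context to the inode in front of it. */
static struct inode *sketch_netfs_inode(struct netfs_i_context *ctx)
{
	return (struct inode *)((char *)ctx - sizeof(struct inode));
}

static void sketch_self_check(void)
{
	struct my_inode mi;

	mi.vfs_inode.i_ino = 42;
	mi.netfs_ctx.ops = NULL;
	mi.my_private_state = 0;

	assert(sketch_netfs_i_context(&mi.vfs_inode) == &mi.netfs_ctx);
	assert(sketch_netfs_inode(&mi.netfs_ctx) == &mi.vfs_inode);
}
```

Because the context is found purely by offset, a helper that receives only a `struct inode *` needs no filesystem-specific container type. This is what lets the VFS/VM operation tables point at the netfslib helpers directly.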
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index bf19fd6b86e7..7c1583dbeb59 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -45,6 +45,12 @@ typically between calling iget_locked() and unlocking the inode.
At some point that will become mandatory.
+**mandatory**
+
+The foo_inode_info should always be allocated through alloc_inode_sb() rather
+than kmem_cache_alloc() or kmalloc() and related functions, so that the inode
+reclaim context is set up correctly.
+
---
**mandatory**
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index bf5c48066fac..794bd1a66bfb 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -658,7 +658,7 @@ pages, however the address_space has finer control of write sizes.
The read process essentially only requires 'readpage'. The write
process is more complicated and uses write_begin/write_end or
-set_page_dirty to write data into the address_space, and writepage and
+dirty_folio to write data into the address_space, and writepage and
writepages to writeback data to storage.
Adding and removing pages to/from an address_space is protected by the
@@ -724,10 +724,8 @@ cache in your filesystem. The following members are defined:
int (*writepage)(struct page *page, struct writeback_control *wbc);
int (*readpage)(struct file *, struct page *);
int (*writepages)(struct address_space *, struct writeback_control *);
- int (*set_page_dirty)(struct page *page);
+ bool (*dirty_folio)(struct address_space *, struct folio *);
void (*readahead)(struct readahead_control *);
- int (*readpages)(struct file *filp, struct address_space *mapping,
- struct list_head *pages, unsigned nr_pages);
int (*write_begin)(struct file *, struct address_space *mapping,
loff_t pos, unsigned len, unsigned flags,
struct page **pagep, void **fsdata);
@@ -735,7 +733,7 @@ cache in your filesystem. The following members are defined:
loff_t pos, unsigned len, unsigned copied,
struct page *page, void *fsdata);
sector_t (*bmap)(struct address_space *, sector_t);
- void (*invalidatepage) (struct page *, unsigned int, unsigned int);
+ void (*invalidate_folio) (struct folio *, size_t start, size_t len);
int (*releasepage) (struct page *, int);
void (*freepage)(struct page *);
ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
@@ -745,10 +743,10 @@ cache in your filesystem. The following members are defined:
int (*migratepage) (struct page *, struct page *);
/* put migration-failed page back to right list */
void (*putback_page) (struct page *);
- int (*launder_page) (struct page *);
+ int (*launder_folio) (struct folio *);
- int (*is_partially_uptodate) (struct page *, unsigned long,
- unsigned long);
+ bool (*is_partially_uptodate) (struct folio *, size_t from,
+ size_t count);
void (*is_dirty_writeback) (struct page *, bool *, bool *);
int (*error_remove_page) (struct mapping *mapping, struct page *page);
int (*swap_activate)(struct file *);
@@ -793,34 +791,29 @@ cache in your filesystem. The following members are defined:
This will choose pages from the address space that are tagged as
DIRTY and will pass them to ->writepage.
-``set_page_dirty``
- called by the VM to set a page dirty. This is particularly
- needed if an address space attaches private data to a page, and
- that data needs to be updated when a page is dirtied. This is
+``dirty_folio``
+ called by the VM to mark a folio as dirty. This is particularly
+ needed if an address space attaches private data to a folio, and
+ that data needs to be updated when a folio is dirtied. This is
called, for example, when a memory mapped page gets modified.
- If defined, it should set the PageDirty flag, and the
- PAGECACHE_TAG_DIRTY tag in the radix tree.
+ If defined, it should set the folio dirty flag, and the
+ PAGECACHE_TAG_DIRTY search mark in i_pages.
``readahead``
Called by the VM to read pages associated with the address_space
object. The pages are consecutive in the page cache and are
locked. The implementation should decrement the page refcount
after starting I/O on each page. Usually the page will be
- unlocked by the I/O completion handler. If the filesystem decides
- to stop attempting I/O before reaching the end of the readahead
- window, it can simply return. The caller will decrement the page
- refcount and unlock the remaining pages for you. Set PageUptodate
- if the I/O completes successfully. Setting PageError on any page
- will be ignored; simply unlock the page if an I/O error occurs.
-
-``readpages``
- called by the VM to read pages associated with the address_space
- object. This is essentially just a vector version of readpage.
- Instead of just one page, several pages are requested.
- readpages is only used for read-ahead, so read errors are
- ignored. If anything goes wrong, feel free to give up.
- This interface is deprecated and will be removed by the end of
- 2020; implement readahead instead.
+ unlocked by the I/O completion handler. The set of pages is
+ divided into some sync pages followed by some async pages;
+ rac->ra->async_size gives the number of async pages. The
+ filesystem should attempt to read all sync pages but may decide
+ to stop once it reaches the async pages. If it does decide to
+ stop attempting I/O, it can simply return. The caller will
+ remove the remaining pages from the address space, unlock them
+ and decrement the page refcount. Set PageUptodate if the I/O
+ completes successfully. Setting PageError on any page will be
+ ignored; simply unlock the page if an I/O error occurs.
``write_begin``
Called by the generic buffered write code to ask the filesystem
@@ -868,15 +861,15 @@ cache in your filesystem. The following members are defined:
to find out where the blocks in the file are and uses those
addresses directly.
-``invalidatepage``
- If a page has PagePrivate set, then invalidatepage will be
- called when part or all of the page is to be removed from the
+``invalidate_folio``
+ If a folio has private data, then invalidate_folio will be
+ called when part or all of the folio is to be removed from the
address space. This generally corresponds to either a
truncation, punch hole or a complete invalidation of the address
space (in the latter case 'offset' will always be 0 and 'length'
- will be PAGE_SIZE). Any private data associated with the page
+ will be folio_size()). Any private data associated with the page
should be updated to reflect this truncation. If offset is 0
- and length is PAGE_SIZE, then the private data should be
+ and length is folio_size(), then the private data should be
released, because the page must be able to be completely
discarded. This may be done by calling the ->releasepage
function, but in this case the release MUST succeed.
@@ -930,16 +923,16 @@ cache in your filesystem. The following members are defined:
``putback_page``
Called by the VM when isolated page's migration fails.
-``launder_page``
- Called before freeing a page - it writes back the dirty page.
- To prevent redirtying the page, it is kept locked during the
+``launder_folio``
+ Called before freeing a folio - it writes back the dirty folio.
+ To prevent redirtying the folio, it is kept locked during the
whole operation.
``is_partially_uptodate``
Called by the VM when reading a file through the pagecache when
- the underlying blocksize != pagesize. If the required block is
- up to date then the read can complete without needing the IO to
- bring the whole page up to date.
+ the underlying blocksize is smaller than the size of the folio.
+ If the required block is up to date then the read can complete
+ without needing I/O to bring the whole page up to date.
``is_dirty_writeback``
Called by the VM when attempting to reclaim a page. The VM uses