[PATCH 3/4] xfs: allow collapse to handle delalloc extents

Discussion:

Brian Foster

2014-08-29 20:29:37 UTC

The xfs_bmap_shift_extents() mechanism currently requires any post-EOF
delalloc blocks to be explicitly trimmed before it can proceed to avoid
hitting delalloc extents during the extent shift sequence. This is
because the implementation doesn't account for delalloc extents and
expects consistency between the in-core extent list and what is on disk
for bmapbt mappings. E.g., it will attempt to look up delalloc extents
on disk and fail.

Since delayed allocation extents have not been written to disk, no
on-disk changes are required to shift such extents. Update
xfs_bmap_shift_extents() to check for delalloc extents. If found, simply
update the start offset of the record and move on. This should only
occur for delalloc blocks that do not have data (e.g., post-eof
preallocation). Such blocks are currently trimmed off by the truncate
that occurs after the extent shift, so we make no attempt to merge
delalloc extents.

This eliminates the need for an eofblocks trim prior to collapse.

Signed-off-by: Brian Foster <***@redhat.com>
---
fs/xfs/libxfs/xfs_bmap.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 449a016..32598cb 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5417,6 +5417,11 @@ xfs_bmap_shift_extents_can_merge(

startoff = got->br_startoff - shift;

+ /* do not merge if either extent is delalloc */
+ if (isnullstartblock(left->br_startblock) ||
+ isnullstartblock(got->br_startblock))
+ return false;
+
/*
* The extent, once shifted, must be adjacent in-file and on-disk with
* the preceding extent.
@@ -5554,6 +5559,7 @@ xfs_bmap_shift_extents(
int whichfork = XFS_DATA_FORK;
int logflags = 0;
int total_extents;
+ bool delalloc;

if (unlikely(XFS_TEST_ERROR(
(XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_EXTENTS &&
@@ -5618,6 +5624,7 @@ xfs_bmap_shift_extents(
total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
while (nexts++ < num_exts && current_ext < total_extents) {
startoff = got.br_startoff - offset_shift_fsb;
+ delalloc = isnullstartblock(got.br_startblock);

/* grab the left extent and check for a potential merge */
if (current_ext > 0) {
@@ -5649,9 +5656,14 @@ xfs_bmap_shift_extents(

/*
* We didn't merge the extent so do the shift. Update the start
- * offset in the in-core extent and btree, if necessary.
+ * offset in the in-core extent. If the extent is delalloc,
+ * there's nothing else to do. Otherwise, update the extent
+ * on-disk.
*/
xfs_bmbt_set_startoff(gotp, startoff);
+ if (delalloc)
+ goto next;
+
logflags |= XFS_ILOG_CORE;
if (cur) {
error = xfs_bmbt_lookup_eq(cur, got.br_startoff,

--
1.8.3.1

Brian Foster

2014-08-29 20:29:36 UTC

Permalink

The extent shift mechanism in xfs_bmap_shift_extents() is complicated
and handles several different, non-deterministic scenarios. These
include extent shifts, extent merges and potential btree updates in
either of the former scenarios.

Refactor the code to be more linear and readable. The loop logic in
xfs_bmap_shift_extents() and some initial error checking is adjusted
slightly. The associated btree lookup and update/delete operations are
condensed into single blocks of code. This reduces the number of
btree-specific blocks and facilitates the separation of the merge
operation into a new xfs_bmap_shift_extents_merge() helper. The merge
check is also separated into an inline.

This is a code refactor only. The behavior of extent shift and collapse
range is not modified.

Signed-off-by: Brian Foster <***@redhat.com>
---
fs/xfs/libxfs/xfs_bmap.c | 243 ++++++++++++++++++++++++++++++++---------------
1 file changed, 168 insertions(+), 75 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 4b3f1b9..449a016 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5404,6 +5404,120 @@ error0:
}

/*
+ * Determine whether an extent shift can be accomplished by a merge with the
+ * extent that precedes the target hole of the shift.
+ */
+static inline bool
+xfs_bmap_shift_extents_can_merge(
+ struct xfs_bmbt_irec *left, /* preceding extent */
+ struct xfs_bmbt_irec *got, /* current extent to shift */
+ xfs_fileoff_t shift) /* shift fsb */
+{
+ xfs_fileoff_t startoff;
+
+ startoff = got->br_startoff - shift;
+
+ /*
+ * The extent, once shifted, must be adjacent in-file and on-disk with
+ * the preceding extent.
+ */
+ if ((left->br_startoff + left->br_blockcount != startoff) ||
+ (left->br_startblock + left->br_blockcount != got->br_startblock) ||
+ (left->br_state != got->br_state) ||
+ (left->br_blockcount + got->br_blockcount > MAXEXTLEN))
+ return false;
+
+ return true;
+}
+
+/*
+ * An extent shift adjusts the file offset of an extent to fill a preceding hole
+ * in the file. If an extent shift would result in the extent being fully
+ * adjacent to the extent that currently precedes the hole, we can merge with
+ * the preceding extent rather than do the shift.
+ *
+ * This function assumes the caller has verified a shift-by-merge is possible
+ * with the provided extents via xfs_bmap_shift_extents_can_merge().
+ */
+static int
+xfs_bmap_shift_extents_merge(
+ struct xfs_inode *ip,
+ int whichfork,
+ xfs_fileoff_t shift, /* shift fsb */
+ int current_ext, /* idx of gotp */
+ struct xfs_bmbt_rec_host *gotp, /* extent to shift */
+ struct xfs_bmbt_rec_host *leftp, /* preceding extent */
+ struct xfs_btree_cur *cur,
+ int *logflags) /* output */
+{
+ struct xfs_ifork *ifp;
+ struct xfs_bmbt_irec got;
+ struct xfs_bmbt_irec left;
+ xfs_filblks_t blockcount;
+ int error, i;
+
+ ifp = XFS_IFORK_PTR(ip, whichfork);
+ xfs_bmbt_get_all(gotp, &got);
+ xfs_bmbt_get_all(leftp, &left);
+ blockcount = left.br_blockcount + got.br_blockcount;
+
+ ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
+ ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
+ ASSERT(xfs_bmap_shift_extents_can_merge(&left, &got, shift));
+
+ /*
+ * Merge the in-core extents. Note that the host record pointers and
+ * current_ext index are invalid once the extent has been removed via
+ * xfs_iext_remove().
+ */
+ xfs_bmbt_set_blockcount(leftp, blockcount);
+ xfs_iext_remove(ip, current_ext, 1, 0);
+
+ /*
+ * Update the on-disk extent count, the btree if necessary and log the
+ * inode.
+ */
+ XFS_IFORK_NEXT_SET(ip, whichfork,
+ XFS_IFORK_NEXTENTS(ip, whichfork) - 1);
+ *logflags |= XFS_ILOG_CORE;
+ if (cur) {
+ /* lookup and remove the extent to merge */
+ error = xfs_bmbt_lookup_eq(cur, got.br_startoff,
+ got.br_startblock, got.br_blockcount, &i);
+ if (error)
+ goto error;
+ XFS_WANT_CORRUPTED_GOTO(i == 1, error);
+
+ error = xfs_btree_delete(cur, &i);
+ if (error)
+ goto error;
+ XFS_WANT_CORRUPTED_GOTO(i == 1, error);
+
+ /* lookup and update size of the previous extent */
+ error = xfs_bmbt_lookup_eq(cur, left.br_startoff,
+ left.br_startblock, left.br_blockcount, &i);
+ if (error)
+ goto error;
+ XFS_WANT_CORRUPTED_GOTO(i == 1, error);
+
+ left.br_blockcount = blockcount;
+
+ error = xfs_bmbt_update(cur, left.br_startoff,
+ left.br_startblock, left.br_blockcount,
+ left.br_state);
+ if (error)
+ goto error;
+ } else {
+ *logflags |= XFS_ILOG_DEXT;
+ }
+
+ return 0;
+
+error:
+ return error;
+}
+
+/*
* Shift extent records to the left to cover a hole.
*
* The maximum number of extents to be shifted in a single operation is
@@ -5427,6 +5541,7 @@ xfs_bmap_shift_extents(
{
struct xfs_btree_cur *cur = NULL;
struct xfs_bmbt_rec_host *gotp;
+ struct xfs_bmbt_rec_host *leftp;
struct xfs_bmbt_irec got;
struct xfs_bmbt_irec left;
struct xfs_mount *mp = ip->i_mount;
@@ -5438,7 +5553,6 @@ xfs_bmap_shift_extents(
int i;
int whichfork = XFS_DATA_FORK;
int logflags = 0;
- xfs_filblks_t blockcount = 0;
int total_extents;

if (unlikely(XFS_TEST_ERROR(
@@ -5464,6 +5578,13 @@ xfs_bmap_shift_extents(
return error;
}

+ if (ifp->if_flags & XFS_IFBROOT) {
+ cur = xfs_bmbt_init_cursor(mp, tp, ip, whichfork);
+ cur->bc_private.b.firstblock = *firstblock;
+ cur->bc_private.b.flist = flist;
+ cur->bc_private.b.flags = 0;
+ }
+
/*
* Look up the extent index for the fsb where we start shifting. We can
* henceforth iterate with current_ext as extent list changes are locked
@@ -5476,14 +5597,17 @@ xfs_bmap_shift_extents(
gotp = xfs_iext_bno_to_ext(ifp, start_fsb, &current_ext);
if (!gotp) {
*done = 1;
- return 0;
+ goto del_cursor;
}
+ xfs_bmbt_get_all(gotp, &got);

- if (ifp->if_flags & XFS_IFBROOT) {
- cur = xfs_bmbt_init_cursor(mp, tp, ip, whichfork);
- cur->bc_private.b.firstblock = *firstblock;
- cur->bc_private.b.flist = flist;
- cur->bc_private.b.flags = 0;
+ /*
+ * If the first extent is shifted, offset_shift_fsb cannot be larger
+ * than the starting offset of the first extent.
+ */
+ if (current_ext == 0 && got.br_startoff < offset_shift_fsb) {
+ error = -EINVAL;
+ goto del_cursor;
}

/*
@@ -5493,30 +5617,42 @@ xfs_bmap_shift_extents(
*/
total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
while (nexts++ < num_exts && current_ext < total_extents) {
-
- gotp = xfs_iext_get_ext(ifp, current_ext);
- xfs_bmbt_get_all(gotp, &got);
startoff = got.br_startoff - offset_shift_fsb;

- /*
- * Before shifting extent into hole, make sure that the hole is
- * large enough to accommodate the shift.
- */
+ /* grab the left extent and check for a potential merge */
if (current_ext > 0) {
- xfs_bmbt_get_all(xfs_iext_get_ext(ifp, current_ext - 1),
- &left);
- if (startoff < left.br_startoff + left.br_blockcount)
+ leftp = xfs_iext_get_ext(ifp, current_ext - 1);
+ xfs_bmbt_get_all(leftp, &left);
+
+ /* make sure hole is large enough for shift */
+ if (startoff < left.br_startoff + left.br_blockcount) {
error = -EINVAL;
- } else if (offset_shift_fsb > got.br_startoff) {
- /*
- * When first extent is shifted, offset_shift_fsb should
- * be less than the stating offset of the first extent.
- */
- error = -EINVAL;
+ goto del_cursor;
+ }
+
+ if (xfs_bmap_shift_extents_can_merge(&left, &got,
+ offset_shift_fsb)) {
+ error = xfs_bmap_shift_extents_merge(ip, whichfork,
+ offset_shift_fsb, current_ext, gotp,
+ leftp, cur, &logflags);
+ if (error)
+ goto del_cursor;
+
+ /*
+ * The extent was merged so adjust the extent
+ * index and move onto the next.
+ */
+ current_ext--;
+ goto next;
+ }
}
- if (error)
- goto del_cursor;

+ /*
+ * We didn't merge the extent so do the shift. Update the start
+ * offset in the in-core extent and btree, if necessary.
+ */
+ xfs_bmbt_set_startoff(gotp, startoff);
+ logflags |= XFS_ILOG_CORE;
if (cur) {
error = xfs_bmbt_lookup_eq(cur, got.br_startoff,
got.br_startblock,
@@ -5525,53 +5661,8 @@ xfs_bmap_shift_extents(
if (error)
goto del_cursor;
XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
- }
-
- /* Check if we can merge 2 adjacent extents */
- if (current_ext &&
- left.br_startoff + left.br_blockcount == startoff &&
- left.br_startblock + left.br_blockcount ==
- got.br_startblock &&
- left.br_state == got.br_state &&
- left.br_blockcount + got.br_blockcount <= MAXEXTLEN) {
- blockcount = left.br_blockcount +
- got.br_blockcount;
- xfs_iext_remove(ip, current_ext, 1, 0);
- logflags |= XFS_ILOG_CORE;
- if (cur) {
- error = xfs_btree_delete(cur, &i);
- if (error)
- goto del_cursor;
- XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
- } else {
- logflags |= XFS_ILOG_DEXT;
- }
- XFS_IFORK_NEXT_SET(ip, whichfork,
- XFS_IFORK_NEXTENTS(ip, whichfork) - 1);
- gotp = xfs_iext_get_ext(ifp, --current_ext);
- xfs_bmbt_get_all(gotp, &got);
-
- /* Make cursor point to the extent we will update */
- if (cur) {
- error = xfs_bmbt_lookup_eq(cur, got.br_startoff,
- got.br_startblock,
- got.br_blockcount,
- &i);
- if (error)
- goto del_cursor;
- XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
- }

- xfs_bmbt_set_blockcount(gotp, blockcount);
- got.br_blockcount = blockcount;
- } else {
- /* We have to update the startoff */
- xfs_bmbt_set_startoff(gotp, startoff);
got.br_startoff = startoff;
- }
-
- logflags |= XFS_ILOG_CORE;
- if (cur) {
error = xfs_bmbt_update(cur, got.br_startoff,
got.br_startblock,
got.br_blockcount,
@@ -5582,18 +5673,20 @@ xfs_bmap_shift_extents(
logflags |= XFS_ILOG_DEXT;
}

- current_ext++;
+next:
+ /* update total extent count and grab the next record */
total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
+ if (++current_ext >= total_extents)
+ break;
+ gotp = xfs_iext_get_ext(ifp, current_ext);
+ xfs_bmbt_get_all(gotp, &got);
}

/* Check if we are done */
if (current_ext == total_extents)
*done = 1;
- else if (next_fsb) {
- gotp = xfs_iext_get_ext(ifp, current_ext);
- xfs_bmbt_get_all(gotp, &got);
+ else if (next_fsb)
*next_fsb = got.br_startoff;
- }

del_cursor:
if (cur)

--
1.8.3.1

Brian Foster

2014-08-29 20:29:38 UTC

Permalink

The full writeback and eofblocks trim operations executed prior to
collapse range are temporary hacks to avoid problems in the extent shift
implementation. The writeback prevents the in-core extent count from
changing due to extent conversions in parts of the file prior to the
collapse target. The eofblocks trim prevents collapse failure due to
attempts to look up delalloc extents on disk for bmapbt mappings.

The full writeback is no longer necessary because the extent shift is
now file offset based. Each time the ilock is dropped and reacquired, we
look up the current extent index based on the file offset where the
previous shift left off. The eofblocks trim is no longer required
because the extent shift mechanism knows how to shift delalloc extents.

Remove the writeback and eofblocks trim from xfs_collapse_file_space().

Signed-off-by: Brian Foster <***@redhat.com>
---
fs/xfs/xfs_bmap_util.c | 20 --------------------
1 file changed, 20 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 1e96d77..c9d16bd 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1470,26 +1470,6 @@ xfs_collapse_file_space(
next_fsb = XFS_B_TO_FSB(mp, offset + len);
shift_fsb = XFS_B_TO_FSB(mp, len);

- /*
- * Writeback the entire file and force remove any post-eof blocks. The
- * writeback prevents changes to the extent list via concurrent
- * writeback and the eofblocks trim prevents the extent shift algorithm
- * from running into a post-eof delalloc extent.
- *
- * XXX: This is a temporary fix until the extent shift loop below is
- * converted to use offsets and lookups within the ILOCK rather than
- * carrying around the index into the extent list for the next
- * iteration.
- */
- error = filemap_write_and_wait(VFS_I(ip)->i_mapping);
- if (error)
- return error;
- if (xfs_can_free_eofblocks(ip, true)) {
- error = xfs_free_eofblocks(mp, ip, false);
- if (error)
- return error;
- }
-
error = xfs_free_file_space(ip, offset, len);
if (error)
return error;

--
1.8.3.1

Brian Foster

2014-08-29 20:29:35 UTC

Permalink

The collapse range implementation uses a transaction per extent shift.
The progress of the overall operation is tracked via the current extent
index of the in-core extent list. This is racy because the ilock must be
dropped and reacquired for each transaction according to locking and log
reservation rules. Therefore, writeback to prior regions of the file is
possible and can change the extent count. This changes the extent to
which the current index refers and causes the collapse to fail mid
operation. To avoid this problem, the entire file is currently written
back before the collapse operation starts.

To eliminate the need to flush the entire file, use the file offset
(fsb) to track the progress of the overall extent shift operation rather
than the extent index. Modify xfs_bmap_shift_extents() to
unconditionally convert the start_fsb parameter to an extent index and
return the file offset of the extent where the shift left off, if
further extents exist. The bulk of ths function can remain based on
extent index as ilock is held by the caller. xfs_collapse_file_space()
now uses the fsb output as the starting point for the subsequent shift.

Signed-off-by: Brian Foster <***@redhat.com>
---
fs/xfs/libxfs/xfs_bmap.c | 85 +++++++++++++++++++++++++-----------------------
fs/xfs/libxfs/xfs_bmap.h | 7 ++--
fs/xfs/xfs_bmap_util.c | 12 +++----
3 files changed, 53 insertions(+), 51 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 86df952..4b3f1b9 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5406,20 +5406,21 @@ error0:
/*
* Shift extent records to the left to cover a hole.
*
- * The maximum number of extents to be shifted in a single operation
- * is @num_exts, and @current_ext keeps track of the current extent
- * index we have shifted. @offset_shift_fsb is the length by which each
- * extent is shifted. If there is no hole to shift the extents
- * into, this will be considered invalid operation and we abort immediately.
+ * The maximum number of extents to be shifted in a single operation is
+ * @num_exts. @start_fsb specifies the file offset to start the shift and the
+ * file offset where we've left off is returned in @next_fsb. @offset_shift_fsb
+ * is the length by which each extent is shifted. If there is no hole to shift
+ * the extents into, this will be considered invalid operation and we abort
+ * immediately.
*/
int
xfs_bmap_shift_extents(
struct xfs_trans *tp,
struct xfs_inode *ip,
- int *done,
xfs_fileoff_t start_fsb,
xfs_fileoff_t offset_shift_fsb,
- xfs_extnum_t *current_ext,
+ int *done,
+ xfs_fileoff_t *next_fsb,
xfs_fsblock_t *firstblock,
struct xfs_bmap_free *flist,
int num_exts)
@@ -5431,6 +5432,7 @@ xfs_bmap_shift_extents(
struct xfs_mount *mp = ip->i_mount;
struct xfs_ifork *ifp;
xfs_extnum_t nexts = 0;
+ xfs_extnum_t current_ext;
xfs_fileoff_t startoff;
int error = 0;
int i;
@@ -5451,7 +5453,8 @@ xfs_bmap_shift_extents(
if (XFS_FORCED_SHUTDOWN(mp))
return -EIO;

- ASSERT(current_ext != NULL);
+ ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
+ ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));

ifp = XFS_IFORK_PTR(ip, whichfork);
if (!(ifp->if_flags & XFS_IFEXTENTS)) {
@@ -5462,20 +5465,18 @@ xfs_bmap_shift_extents(
}

/*
- * If *current_ext is 0, we would need to lookup the extent
- * from where we would start shifting and store it in gotp.
+ * Look up the extent index for the fsb where we start shifting. We can
+ * henceforth iterate with current_ext as extent list changes are locked
+ * out via ilock.
+ *
+ * gotp can be null in 2 cases: 1) if there are no extents or 2)
+ * start_fsb lies in a hole beyond which there are no extents. Either
+ * way, we are done.
*/
- if (!*current_ext) {
- gotp = xfs_iext_bno_to_ext(ifp, start_fsb, current_ext);
- /*
- * gotp can be null in 2 cases: 1) if there are no extents
- * or 2) start_fsb lies in a hole beyond which there are
- * no extents. Either way, we are done.
- */
- if (!gotp) {
- *done = 1;
- return 0;
- }
+ gotp = xfs_iext_bno_to_ext(ifp, start_fsb, &current_ext);
+ if (!gotp) {
+ *done = 1;
+ return 0;
}

if (ifp->if_flags & XFS_IFBROOT) {
@@ -5487,36 +5488,32 @@ xfs_bmap_shift_extents(

/*
* There may be delalloc extents in the data fork before the range we
- * are collapsing out, so we cannot
- * use the count of real extents here. Instead we have to calculate it
- * from the incore fork.
+ * are collapsing out, so we cannot use the count of real extents here.
+ * Instead we have to calculate it from the incore fork.
*/
total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
- while (nexts++ < num_exts && *current_ext < total_extents) {
+ while (nexts++ < num_exts && current_ext < total_extents) {

- gotp = xfs_iext_get_ext(ifp, *current_ext);
+ gotp = xfs_iext_get_ext(ifp, current_ext);
xfs_bmbt_get_all(gotp, &got);
startoff = got.br_startoff - offset_shift_fsb;

/*
- * Before shifting extent into hole, make sure that the hole
- * is large enough to accomodate the shift.
+ * Before shifting extent into hole, make sure that the hole is
+ * large enough to accommodate the shift.
*/
- if (*current_ext) {
- xfs_bmbt_get_all(xfs_iext_get_ext(ifp,
- *current_ext - 1), &left);
-
+ if (current_ext > 0) {
+ xfs_bmbt_get_all(xfs_iext_get_ext(ifp, current_ext - 1),
+ &left);
if (startoff < left.br_startoff + left.br_blockcount)
error = -EINVAL;
} else if (offset_shift_fsb > got.br_startoff) {
/*
- * When first extent is shifted, offset_shift_fsb
- * should be less than the stating offset of
- * the first extent.
+ * When first extent is shifted, offset_shift_fsb should
+ * be less than the stating offset of the first extent.
*/
error = -EINVAL;
}
-
if (error)
goto del_cursor;

@@ -5531,7 +5528,7 @@ xfs_bmap_shift_extents(
}

/* Check if we can merge 2 adjacent extents */
- if (*current_ext &&
+ if (current_ext &&
left.br_startoff + left.br_blockcount == startoff &&
left.br_startblock + left.br_blockcount ==
got.br_startblock &&
@@ -5539,7 +5536,7 @@ xfs_bmap_shift_extents(
left.br_blockcount + got.br_blockcount <= MAXEXTLEN) {
blockcount = left.br_blockcount +
got.br_blockcount;
- xfs_iext_remove(ip, *current_ext, 1, 0);
+ xfs_iext_remove(ip, current_ext, 1, 0);
logflags |= XFS_ILOG_CORE;
if (cur) {
error = xfs_btree_delete(cur, &i);
@@ -5551,7 +5548,7 @@ xfs_bmap_shift_extents(
}
XFS_IFORK_NEXT_SET(ip, whichfork,
XFS_IFORK_NEXTENTS(ip, whichfork) - 1);
- gotp = xfs_iext_get_ext(ifp, --*current_ext);
+ gotp = xfs_iext_get_ext(ifp, --current_ext);
xfs_bmbt_get_all(gotp, &got);

/* Make cursor point to the extent we will update */
@@ -5585,13 +5582,18 @@ xfs_bmap_shift_extents(
logflags |= XFS_ILOG_DEXT;
}

- (*current_ext)++;
+ current_ext++;
total_extents = ifp->if_bytes / sizeof(xfs_bmbt_rec_t);
}

/* Check if we are done */
- if (*current_ext == total_extents)
+ if (current_ext == total_extents)
*done = 1;
+ else if (next_fsb) {
+ gotp = xfs_iext_get_ext(ifp, current_ext);
+ xfs_bmbt_get_all(gotp, &got);
+ *next_fsb = got.br_startoff;
+ }

del_cursor:
if (cur)
@@ -5600,5 +5602,6 @@ del_cursor:

if (logflags)
xfs_trans_log_inode(tp, ip, logflags);
+
return error;
}
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index b879ca5..44db6db 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -178,9 +178,8 @@ int xfs_check_nostate_extents(struct xfs_ifork *ifp, xfs_extnum_t idx,
xfs_extnum_t num);
uint xfs_default_attroffset(struct xfs_inode *ip);
int xfs_bmap_shift_extents(struct xfs_trans *tp, struct xfs_inode *ip,
- int *done, xfs_fileoff_t start_fsb,
- xfs_fileoff_t offset_shift_fsb, xfs_extnum_t *current_ext,
- xfs_fsblock_t *firstblock, struct xfs_bmap_free *flist,
- int num_exts);
+ xfs_fileoff_t start_fsb, xfs_fileoff_t offset_shift_fsb,
+ int *done, xfs_fileoff_t *next_fsb, xfs_fsblock_t *firstblock,
+ struct xfs_bmap_free *flist, int num_exts);

#endif /* __XFS_BMAP_H__ */
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 1707980..1e96d77 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1456,18 +1456,18 @@ xfs_collapse_file_space(
struct xfs_mount *mp = ip->i_mount;
struct xfs_trans *tp;
int error;
- xfs_extnum_t current_ext = 0;
struct xfs_bmap_free free_list;
xfs_fsblock_t first_block;
int committed;
xfs_fileoff_t start_fsb;
+ xfs_fileoff_t next_fsb;
xfs_fileoff_t shift_fsb;

ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));

trace_xfs_collapse_file_space(ip);

- start_fsb = XFS_B_TO_FSB(mp, offset + len);
+ next_fsb = XFS_B_TO_FSB(mp, offset + len);
shift_fsb = XFS_B_TO_FSB(mp, len);

/*
@@ -1525,10 +1525,10 @@ xfs_collapse_file_space(
* We are using the write transaction in which max 2 bmbt
* updates are allowed
*/
- error = xfs_bmap_shift_extents(tp, ip, &done, start_fsb,
- shift_fsb, &current_ext,
- &first_block, &free_list,
- XFS_BMAP_MAX_SHIFT_EXTENTS);
+ start_fsb = next_fsb;
+ error = xfs_bmap_shift_extents(tp, ip, start_fsb, shift_fsb,
+ &done, &next_fsb, &first_block, &free_list,
+ XFS_BMAP_MAX_SHIFT_EXTENTS);
if (error)
goto out;

--
1.8.3.1

Brian Foster

2014-09-02 19:16:20 UTC

Permalink

Here's a drop of what I'm testing over the weekend. It passes some quick
tests, but only lightly tested so far.

fsx has been running for over 3 days without a failure and I've kicked
off a parallel fsstress today that so far hasn't caused any problems.
Given that, I think the patches here help reduce the likelihood of
errors from collapse.

My biggest question at this point is whether there is any risk to the
shift of the post-eof extent if the subsequent truncate happens to fail.
If it's not worth dealing with that, we could just drop patch 3 and
leave the eofblocks trim permanently.

That said, a test to skip the post-collapse truncate confirms that the
behavior of patch 3 is potentially problematic in the event of failure.
E.g., there is no behavior analogous to the zeroing of prealloc space
for extending truncates.

I think the right thing to do here might be for the collapse to
writeback and invalidate the range following the space that was freed to
EOF and to retain the eofblocks trim.

Brian

In fact, it might even be a good idea to go back to the original
[start,-1] writeback in collapse range given that the free file space
helper could change at some point to only write and punch out the range
being freed... and then we're left shifting a bunch of extents after the
freed range that could have dirty data in pagecache, which sounds bad.
Thoughts?
Brian
xfs: track collapse via file offset rather than extent index
xfs: refactor xfs_bmap_shift_extents() into multiple functions
xfs: allow collapse to handle delalloc extents
xfs: remove file writeback and eofblocks trim from collapse range
fs/xfs/libxfs/xfs_bmap.c | 302 ++++++++++++++++++++++++++++++++---------------
fs/xfs/libxfs/xfs_bmap.h | 7 +-
fs/xfs/xfs_bmap_util.c | 32 +----
3 files changed, 214 insertions(+), 127 deletions(-)
--
1.8.3.1
_______________________________________________
xfs mailing list
http://oss.sgi.com/mailman/listinfo/xfs