    xfs: skip flushing log items during push · f3f7ae68
    Dave Chinner authored
    
    The AIL pushing code spends a huge amount of time skipping over
    items that are already marked as flushing. It is not uncommon to
    see hundreds of thousands of items skipped every second due to inode
    clustering marking all the inodes in a cluster as flushing when the
    first one is flushed.
    
    However, to discover an item is already flushing and should be
    skipped we have to call the iop_push() method for it to try to flush
    the item. For inodes (where this matters most), we first have to
    check that the inode is flushable.
    
    We can optimise this overhead away by tracking whether the log item
    is flushing internally. This allows xfsaild_push() to check the log
    item directly for flushing state and immediately skip the log item.
    Whilst this doesn't remove the CPU cache misses for loading the log
    item, it does avoid the overhead of an indirect function call
    and the cache misses involved in accessing inode and
    backing cluster buffer structures to determine flushing state. When
    trying to flush hundreds of thousands of inodes each second, this
    CPU overhead saving adds up quickly.
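
As a rough userspace illustration of the idea above (not the kernel code from this patch), the sketch below keeps a flushing bit in the log item itself, so the walk can skip an item without making the indirect push call. All names here (DEMO_LI_FLUSHING, struct demo_log_item, demo_ail_walk) are hypothetical stand-ins, and the push callback only loosely plays the role of iop_push().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; the name is illustrative, not taken from the patch. */
#define DEMO_LI_FLUSHING (1u << 0)

struct demo_log_item {
    unsigned int flags;                       /* state tracked in the item itself */
    int (*push)(struct demo_log_item *item);  /* stand-in for an iop_push() method */
};

struct demo_push_stats {
    size_t skipped;  /* items skipped by the flag test, no indirect call made */
    size_t pushed;   /* items handed to the push method */
};

/* Toy push method: starting a flush marks the item as flushing. */
static int demo_push(struct demo_log_item *item)
{
    item->flags |= DEMO_LI_FLUSHING;
    return 0;
}

/* Walk the items once, skipping any that are already marked as flushing. */
static struct demo_push_stats
demo_ail_walk(struct demo_log_item *items, size_t count)
{
    struct demo_push_stats stats = { 0, 0 };

    assert(items != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        /* Cheap check on the item we already have in hand. */
        if (items[i].flags & DEMO_LI_FLUSHING) {
            stats.skipped++;
            continue;
        }
        /* Only items that are not flushing pay for the indirect call. */
        items[i].push(&items[i]);
        stats.pushed++;
    }
    return stats;
}
```

Walking an array in which two of three items already carry the flag makes one indirect call and skips the other two; a second walk over the same array skips all three.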
    
    It's so noticeable that the biggest issue with pushing on the AIL on
    fast storage becomes the 10ms back-off wait when we hit enough
    pinned buffers to break out of the push loop but not enough for the
    AIL pushing to be considered stuck. This limits the xfsaild to about
    70% total CPU usage, and on fast storage this isn't enough to keep
    the storage 100% busy.
    
    The xfsaild will block on IO submission on slow storage and so is
    self throttling - it does not need a backoff in the case where we
    are really just breaking out of the walk to submit the IO we have
    gathered.
    
    Further, with no backoff we don't need to gather huge delwri lists to
    mitigate the impact of backoffs, so we can submit IO more frequently
    and reduce the time log items spend in flushing state by breaking
    out of the item push loop once we've gathered enough IO to batch
    submission effectively.
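
The batching behaviour described above can be modelled in a few lines of userspace C. This is only a sketch under assumptions: the batch target is a made-up parameter (the message does not state the value used), "submission" is reduced to a counter, and demo_push_in_batches is a hypothetical name. The point it shows is that the walk submits as soon as enough IO has been gathered and then carries on without sleeping.

```c
#include <assert.h>
#include <stddef.h>

struct demo_batch_stats {
    size_t submissions;    /* how many times gathered IO was submitted */
    size_t largest_batch;  /* largest number of items in one submission */
};

/*
 * Model a push walk over nr_items flushable items. Once batch_target items
 * have been gathered, the batch is submitted immediately and the walk
 * resumes with an empty list; there is no back-off between batches.
 */
static struct demo_batch_stats
demo_push_in_batches(size_t nr_items, size_t batch_target)
{
    struct demo_batch_stats stats = { 0, 0 };
    size_t gathered = 0;

    assert(batch_target > 0);

    for (size_t i = 0; i < nr_items; i++) {
        gathered++;  /* item added to the pending list */
        if (gathered >= batch_target) {
            /* Enough IO to batch effectively: submit now. */
            stats.submissions++;
            if (gathered > stats.largest_batch)
                stats.largest_batch = gathered;
            gathered = 0;
        }
    }

    /* Submit whatever is left at the end of the walk. */
    if (gathered > 0) {
        stats.submissions++;
        if (gathered > stats.largest_batch)
            stats.largest_batch = gathered;
    }
    return stats;
}
```

With ten items and a target of four, this model submits three times (4, 4, 2), so no item waits behind more than four others before its IO is issued.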
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>