Discussion:
Corrupted files
Leslie Rhorer
2014-09-09 15:21:37 UTC
Hello,

I have an issue with my primary RAID array. I have 13T of data on the
array, and I suffered a major array failure. I was able to rebuild the
array, but some data was lost. Of course I have backups, so after
running xfs_repair, I ran an rsync job to recover the lost data. Most
of it was recovered, but there are several files that cannot be read,
deleted, or overwritten. I have tried running xfs_repair several times,
but any attempt to access these files continuously reports "cannot stat
XXXXXXXX: Structure needs cleaning". I don't need to try to recover the
data directly, as it does reside on the backup, but I need to clear the
file structure so I can write the files back to the filesystem. How do
I proceed?
Sean Caron
2014-09-09 15:50:28 UTC
Hi Leslie,

If you have a full backup, I would STRONGLY recommend just wiping your old
filesystem and restoring your backups on top of a totally fresh XFS, rather
than repairing the original filesystem and then filling in the blanks with
backups using a file-diff tool like rsync.
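
Concretely, and only as a rough sketch (the device, mount point and backup path here are placeholders, not anything you've told us about your setup), that path looks something like:

umount /RAID
mkfs.xfs -f /dev/md0
mount /dev/md0 /RAID
rsync -aHAX --numeric-ids /Backup/ /RAID/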

You will probably hear various opinions here about xfs_repair; my personal
opinion is that xfs_repair is a program made available for the unwary to
further scramble their data and make a hash of the filesystem... In my
first-hand experience managing ~7 PB of XFS storage and growing, I have
NEVER found xfs_repair (yes, even the "newest version") to ever do anything
positive. It's basically a data scrambler.

At this point, you will never achieve anything near what I'd consider a
production-grade, trustworthy data repository. Any further runs of
xfs_repair will either do nothing, or make the situation worse. Fortunately
you followed best practice and kept backups so you don't really need
xfs_repair anyway, right?

Best,

Sean

P.S. No backups? Still don't even think about running xfs_repair.
ESPECIALLY don't think about running xfs_repair. Try mounting ro; if that
doesn't work, mount ro with norecovery and scavenge what you can. Write
off the rest. That's the cost of doing business without backups. Running
xfs_repair (especially as a first-line step) will only make it worse, and
especially on big filesystems, the run time can extend to weeks... Don't
keep your users down any longer than you need to, running a program that
won't really help you. Just scavenge it, reformat and turn it back around.
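
A bare-bones sketch of that scavenge sequence, with placeholder device and directory names:

mkdir -p /mnt/scavenge
mount -o ro /dev/md0 /mnt/scavenge
mount -o ro,norecovery /dev/md0 /mnt/scavenge   (only if the plain ro mount fails)
rsync -a /mnt/scavenge/ /some/spare/storage/
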
Post by Leslie Rhorer
Hello,
I have an issue with my primary RAID array. I have 13T of data on
the array, and I suffered a major array failure. I was able to rebuild the
array, but some data was lost. Of course I have backups, so after running
xfs_repair, I ran an rsync job to recover the lost data. Most of it was
recovered, but there are several files that cannot be read, deleted, or
overwritten. I have tried running xfs_repair several times, but any
attempt to access these files continuously reports "cannot stat
XXXXXXXX: Structure needs cleaning". I don't need to try to recover the data
directly, as it does reside on the backup, but I need to clear the file
structure so I can write the files back to the filesystem. How do I
proceed?
Sean Caron
2014-09-09 16:03:56 UTC
OK, let me retract just a tiny fraction of what I said originally; thinking
about it further, there was _one_ time I was able to use xfs_repair to
successfully recover a "lightly bruised" XFS and return it to service. But
in that case, the fault was very minor and I always check first with:

xfs_repair [-L] -n -v <filesystem>

and give the output a good looking over before proceeding further.
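
In practice that means capturing the trial output to a file so it can be reviewed at leisure and diffed against later runs, something like (the device name is a placeholder):

xfs_repair -n -v /dev/md0 2>&1 | tee /root/xfs_repair-dryrun.log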

If it won't run without zeroing the log, you can take that as a sign that
things are getting dire... I wouldn't bother to run xfs_repair "for real" if
the trial output looked even slightly non-trivial, in cases of underlying
array failure or massive filesystem corruption, and I'd never run it
without mounting and scavenging first (unless I had a very recent full
backup). Barring rare cases, xfs_repair is bad juju.

Best,

Sean
Post by Sean Caron
Hi Leslie,
If you have a full backup, I would STRONGLY recommend just wiping your old
filesystem and restoring your backups on top of a totally fresh XFS, rather
than repairing the original filesystem and then filling in the blanks with
backups using a file-diff tool like rsync.
You will probably hear various opinions here about xfs_repair; my personal
opinion is that xfs_repair is a program made available for the unwary to
further scramble their data and make a hash of the filesystem... In my
first-hand experience managing ~7 PB of XFS storage and growing, I have
NEVER found xfs_repair (yes, even the "newest version") to ever do anything
positive. It's basically a data scrambler.
At this point, you will never achieve anything near what I'd consider a
production-grade, trustworthy data repository. Any further runs of
xfs_repair will either do nothing, or make the situation worse. Fortunately
you followed best practice and kept backups so you don't really need
xfs_repair anyway, right?
Best,
Sean
P.S. No backups? Still don't even think about running xfs_repair.
ESPECIALLY don't think about running xfs_repair. Try mounting ro; if that
doesn't work, mount ro with norecovery and scavenge what you can. Write
off the rest. That's the cost of doing business without backups. Running
xfs_repair (especially as a first-line step) will only make it worse, and
especially on big filesystems, the run time can extend to weeks... Don't
keep your users down any longer than you need to, running a program that
won't really help you. Just scavenge it, reformat and turn it back around.
Post by Leslie Rhorer
Hello,
I have an issue with my primary RAID array. I have 13T of data
on the array, and I suffered a major array failure. I was able to rebuild
the array, but some data was lost. Of course I have backups, so after
running xfs_repair, I ran an rsync job to recover the lost data. Most of
it was recovered, but there are several files that cannot be read, deleted,
or overwritten. I have tried running xfs_repair several times, but any
attempt to access these files continuously reports "cannot stat
XXXXXXXX: Structure needs cleaning". I don't need to try to recover the data
directly, as it does reside on the backup, but I need to clear the file
structure so I can write the files back to the filesystem. How do I
proceed?
Eric Sandeen
2014-09-09 22:24:55 UTC
Post by Sean Caron
Barring rare cases, xfs_repair is bad juju.
No, it's not. It is the appropriate tool to use for filesystem repair.

But it is not the appropriate tool for recovery from mangled storage.

I've actually been running a filesystem fuzzer over xfs images, randomly
corrupting data and testing repair, 1000s of times over. It does
remarkably well.

If you scramble your raid, which means your block device is no longer
an xfs filesystem, but is instead a random tangle of bits and pieces of
other things, of course xfs_repair won't do well, but it's not the right
tool for the job at that stage.

-Eric
Sean Caron
2014-09-09 22:57:06 UTC
Hey, just sharing some hard-won (believe me) professional experience. I
have seen xfs_repair take a bad situation and make it worse many times. I
don't know that a filesystem fuzzer or any other simulation can ever
provide true simulation of users absolutely pounding the tar out of a
system. There seems to be a real disconnect between what developers are
able to test and observe directly, and what happens in the production
environment in a very high-throughput environment.

Best,

Sean
Post by Sean Caron
Barring rare cases, xfs_repair is bad juju.
No, it's not. It is the appropriate tool to use for filesystem repair.
But it is not the appropriate tool for recovery from mangled storage.
I've actually been running a filesystem fuzzer over xfs images, randomly
corrupting data and testing repair, 1000s of times over. It does
remarkably well.
If you scramble your raid, which means your block device is no longer
an xfs filesystem, but is instead a random tangle of bits and pieces of
other things, of course xfs_repair won't do well, but it's not the right
tool for the job at that stage.
-Eric
Roger Willcocks
2014-09-10 01:00:18 UTC
I normally watch quietly from the sidelines but I think it's important to get some balance here; our customers between them run many hundreds of multi-terabyte arrays and when something goes badly awry it generally falls to me to sort it out. In my experience xfs_repair does exactly what it says on the tin.

I can recall only a couple of instances where we elected to reformat and reload from backups and they were both due to human error: somebody deleted the wrong RAID unit when doing routine maintenance, and then tried to fix it up themselves.

In theory of course xfs_repair shouldn't be needed if the write barriers work properly (it's a journalled filesystem), but low-level corruption does creep in due to power failures / kernel crashes and it's this which xfs_repair is intended to address; not massive data corruption due to failed hardware or careless users.

--
Roger
Eric Sandeen
2014-09-10 05:09:08 UTC
Post by Sean Caron
Hey, just sharing some hard-won (believe me) professional experience.
I have seen xfs_repair take a bad situation and make it worse many
times. I don't know that a filesystem fuzzer or any other simulation
can ever provide true simulation of users absolutely pounding the tar
out of a system. There seems to be a real disconnect between what
developers are able to test and observe directly, and what happens in
the production environment in a very high-throughput environment.
Best,
Sean
Fair enough, but I don't want to let stand an assertion that you should
avoid xfs_repair at all (most) costs. It, like almost any software,
has some bugs, but they don't get fixed if they don't get well reported.
We do our best to improve it when we get useful reports from
users - usually including a metadata dump - and we beat on it as best
we can in the lab.

"pounding the tar out of a filesystem" should not, in general, require
an xfs_repair run. ;)

Yes, it's always good advice to do a dry run before committing to a
repair, in case something goes off the rails. But most times I've seen
things go very very badly was when the storage device under the filesystem
was no longer consistent, and the filesystem really had no pieces to
pick up.

-Eric
Leslie Rhorer
2014-09-10 00:48:41 UTC
Post by Eric Sandeen
Post by Sean Caron
Barring rare cases, xfs_repair is bad juju.
No, it's not. It is the appropriate tool to use for filesystem repair.
But it is not the appropriate tool for recovery from mangled storage.
It's not all that mangled. Out of over 52,000 files on the backup
server array, only 5758 were missing from the primary array, and most of
those were lost by the corruption of just a couple of directories, where
every file in the directory was lost with the directory itself. Several
directories and a scattering of individual files were deleted with
intent prior to the failure but not yet purged from the backup. Most
were small files - only 29 were larger than 1G. All of those 5758 were
easily recovered. The only ones remaining at issue are 3 files which
cannot be read, written or deleted. The rest have been read and
checksums successfully computed and compared. With only 50K files in
question, I am confident any checksum collisions are of insignificant
probability. Someone is going to have to do a lot of talking to
convince me rsync can read two copies of what should be the same data
and come up with the same checksum value for both, but other
applications would be able to successfully read one of the files and not
the other.
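
For anyone wanting to reproduce that kind of check, a checksum-only comparison of the two copies can be done with an rsync dry run, roughly (paths as used on this system):

rsync -rnc --itemize-changes /Backup/ /RAID/ > /tmp/checksum-mismatches.txt

Anything listed in the output either differs by checksum or is missing on the primary array.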

I really don't think Draconian measures are required. Even if it turns
out they are, the existence of the backup allows for a good deal of
fiddling with the main filesystem before one is compelled to give up and
start fresh. This is especially true since a small amount of the data on the
main array had not yet been backed up to the secondary array. These
e-mails, for example. The rsync job that backs up the main array runs
every morning at 04:00, so files created that day were not backed up,
and for safety I have changed the backup array file system to read-only,
so nothing created since is backed up.
Post by Eric Sandeen
I've actually been running a filesystem fuzzer over xfs images, randomly
corrupting data and testing repair, 1000s of times over. It does
remarkably well.
If you scramble your raid, which means your block device is no longer
an xfs filesystem, but is instead a random tangle of bits and pieces of
other things, of course xfs_repair won't do well, but it's not the right
tool for the job at that stage.
This is nowhere near that stage. A few sectors here and there were
lost because 3 drives were kicked from the array while write operations
were underway. I had to force re-assemble the array, which lost some
data. The vast majority of the data is clearly intact, including most
of the file system structures. Far less than 1% of the data was lost or
corrupted.
Roger Willcocks
2014-09-10 01:10:47 UTC
Post by Leslie Rhorer
The only ones remaining at issue are 3 files which cannot be read, written or deleted.
The most straightforward fix would be to note down the inode numbers of the three files and then use xfs_db to clear the inodes; then run xfs_repair again.

See:

http://xfs.org/index.php/XFS_FAQ#Q:_How_to_get_around_a_bad_inode_repair_is_unable_to_clean_up

but before that try running the latest (3.2.1 I think) xfs_repair.
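
As a from-memory sketch of clearing a bad inode with xfs_db (not a quote of the FAQ entry; read the FAQ itself and take a metadata dump before writing anything, and note the inode number here is just the one from the dmesg output quoted elsewhere in this thread):

umount /RAID
xfs_metadump /dev/md0 /root/md0.metadump
xfs_db -x -c 'inode 41006' -c 'print' /dev/md0
xfs_db -x -c 'inode 41006' -c 'write core.mode 0' /dev/md0
xfs_repair /dev/md0

The idea is that with the mode zeroed, repair should treat the inode as unused and clean up the dangling directory entry, after which the file can be restored from backup.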

--
Roger
Leslie Rhorer
2014-09-10 01:31:07 UTC
Post by Roger Willcocks
The only ones remaining at issue are 3 files which cannot be read, written or deleted.
The most straightforward fix would be to note down the inode numbers of the three files and then use xfs_db to clear the inodes; then run xfs_repair again.
http://xfs.org/index.php/XFS_FAQ#Q:_How_to_get_around_a_bad_inode_repair_is_unable_to_clean_up
That sounds reasonable. If no one has any more sound advice, I think I
will try that.
Post by Roger Willcocks
but before that try running the latest (3.2.1 I think) xfs_repair.
I am always reluctant to run anything outside the distro package. I've
had problems in the past with doing so. 3.1.7 is pretty close, so
unless there is a really solid reason to use 3.2.1 vs. 3.1.7, I think I
will stick with the distro version and try the above. Can you or anyone
else give a reason why 3.2.1 would work when 3.1.7 would not? More
importantly, is there some reason 3.1.7 would make things worse while
3.2.1 would not? If not, then I can always try 3.1.7 and then try 3.2.1
if that does not help.
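
If it helps, a newer xfs_repair can be tried without touching the distro package by building xfsprogs from a source tarball and running the binary straight out of the build tree; roughly (an untested sketch, paths and version are illustrative):

tar xzf xfsprogs-3.2.1.tar.gz
cd xfsprogs-3.2.1
make
./repair/xfs_repair -V
./repair/xfs_repair -n -v /dev/md0
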
Emmanuel Florac
2014-09-10 14:24:11 UTC
Le Tue, 09 Sep 2014 20:31:07 -0500
Post by Leslie Rhorer
More
importantly, is there some reason 3.1.7 would make things worse while
3.2.1 would not? If not, then I can always try 3.1.7 and then try
3.2.1 if that does not help.
I don't know about these particular versions, however in the past
I've confirmed that a later version of xfs_repair performed way better
(salvaged more files from lost+found, in particular).

At some point in the distant past, some versions of xfs_repair were
buggy and would happily throw away TB of perfectly sane data... I had
this very problem once on Christmas eve in 2005 IIRC :/
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <***@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Sean Caron
2014-09-10 14:49:32 UTC
I don't want to bloviate too much and drag this completely off topic,
especially since the OP's query is resolved, but please allow me just one anecdote :)

Earlier this year, I had one of our project file servers (450 TB) go down.
It didn't go down because the array spuriously just lost a bunch of disks;
it was simply your usual sort of Linux kernel panic... you go to the
console and it's just black screen and unresponsive, or maybe you can see
the tail end of a backtrace and it's unresponsive. So, OK, issue a quick
remote IPMI reboot of the machine, it comes up...

I'm in single user mode, bringing up each sub-RAID6 in our RAID60 by hand,
no problem. Bring up the top level RAID0. OK. Then I go to mount the XFS...
no go. Apparently the log somehow got corrupted in the crash?

So I try to mount ro, no dice, but I _can_ mount ro,norecovery and I see
good files here! Thank goodness. I start scavenging to a spare host...

A few weeks later, after the scavenge is done, I did a few xfs_repair runs
just for the sake of experimentation. Using both in dry run mode, I tried
the version that shipped with Ubuntu 12.04, as well as the latest
xfs_repair I could pull from the source tree. I redirected the output of
both runs to file and watched them with 'tail -f'.

Diffing the output when they were done, it didn't look like they were
behaving much differently. Both files had thousands or tens of thousands of
lines worth of output in them, bad this, bad that... (I always run in
verbose mode) Since the filesystem was hosed anyway and I was going to
rebuild it, I decided to let the new xfs_repair run "for real" just to see
what would happen, for kicks. And who knows? Maybe I could recover even
more than I already had ...? (I wasn't just totally wasting time)

I think it took maybe a week for it to run on a 450 TB volume? At least a
week. Maybe I was being a teensy bit hyperbolic in my previous descriptions
of runtime, LOL. After it was done?

... almost everything was obliterated. I had tens of millions of
zero-length files, and tens of millions of bits of anonymous scrambled junk
in lost+found.

So, I chuckled a bit (thankful for my hard-won previous experience) before
reformatting the array and then copied back the results of my scavenging.
Just by ro-mounting and copying what I could, I was able to save around 90%
of the data by volume on the array (it was a little more than half full
when it failed... ~290 TB? There was only ~30 TB that I couldn't salvage);
good clean files that passed validation from their respective users. I
think 80-90% recovery rates are very commonly achievable just mounting
ro,norecovery and getting what you can with cp -R or rsync, given that
there wasn't grievous failure of the underlying storage system.

If I had depended on xfs_repair, or blithely run it as a first line of
response as the documentation might intimate (hey, it's called xfs_repair,
right?) like you would casually think to do; run it like people run fsck or
CHKDSK... I would have been hosed, big time.

Best,

Sean
Post by Emmanuel Florac
Le Tue, 09 Sep 2014 20:31:07 -0500
Post by Leslie Rhorer
More
importantly, is there some reason 3.1.7 would make things worse while
3.2.1 would not? If not, then I can always try 3.1.7 and then try
3.2.1 if that does not help.
I don't know about these particular versions, however in the past
I've confirmed that a later version of xfs_repair performed way better
(salvaged more files from lost+found, in particular).
At some point in the distant past, some versions of xfs_repair were
buggy and would happily throw away TB of perfectly sane data... I had
this very problem once on Christmas eve in 2005 IIRC :/
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| +33 1 78 94 84 02
------------------------------------------------------------------------
Emmanuel Florac
2014-09-09 16:08:04 UTC
Le Tue, 09 Sep 2014 10:21:37 -0500
Post by Leslie Rhorer
I have tried running xfs_repair several times,
but any attempt to access these files continuously reports "cannot
stat XXXXXXXX: Structure needs cleaning".
I won't agree with Sean here(1). Most of the time xfs_repair ends with
the expected result; however many distros (particularly CentOS) provide
positively ancient versions. You'd better grab a recent version (3.1 or
better).

(1) in particular on the "run for weeks" part. I've never had
xfs_repair take more than a couple of hours, even on badly damaged
filesystems in the hundreds of terabytes range.
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <***@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Dave Chinner
2014-09-09 22:06:45 UTC
Post by Leslie Rhorer
Hello,
I have an issue with my primary RAID array. I have 13T of data on
the array, and I suffered a major array failure. I was able to
rebuild the array, but some data was lost. Of course I have
backups, so after running xfs_repair, I ran an rsync job to recover
the lost data. Most of it was recovered, but there are several
files that cannot be read, deleted, or overwritten. I have tried
running xfs_repair several times, but any attempt to access these
files continuously reports "cannot stat XXXXXXXX: Structure needs
cleaning". I don't need to try to recover the data directly, as it
does reside on the backup, but I need to clear the file structure so
I can write the files back to the filesystem. How do I proceed?
Firstly, more information is required, namely versions and actual
error messages:

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

dmesg, in particular, should tell us what the corruption being
encountered is when stat fails.

Cheers,

Dave.
--
Dave Chinner
***@fromorbit.com
Leslie Rhorer
2014-09-10 01:12:38 UTC
Post by Dave Chinner
Firstly, more information is required, namely versions and actual
Indubitably:

RAID-Server:/# xfs_repair -V
xfs_repair version 3.1.7
RAID-Server:/# uname -r
3.2.0-4-amd64

4.0 GHz FX-8350 eight core processor

RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
MemTotal: 8099916 kB
MemFree: 5786420 kB
Buffers: 112684 kB
Cached: 457020 kB
SwapCached: 0 kB
Active: 521800 kB
Inactive: 457268 kB
Active(anon): 276648 kB
Inactive(anon): 140180 kB
Active(file): 245152 kB
Inactive(file): 317088 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 12623740 kB
SwapFree: 12623740 kB
Dirty: 20 kB
Writeback: 0 kB
AnonPages: 409488 kB
Mapped: 47576 kB
Shmem: 7464 kB
Slab: 197100 kB
SReclaimable: 112644 kB
SUnreclaim: 84456 kB
KernelStack: 2560 kB
PageTables: 8468 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 16673696 kB
Committed_AS: 1010172 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 339140 kB
VmallocChunk: 34359395308 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 65532 kB
DirectMap2M: 5120000 kB
DirectMap1G: 3145728 kB
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=10240k,nr_inodes=1002653,mode=755 0 0
devpts /dev/pts devpts
rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=809992k,mode=755 0 0
/dev/disk/by-uuid/fa5c404a-bfcb-43de-87ed-e671fda1ba99 / ext4
rw,relatime,errors=remount-ro,user_xattr,barrier=1,data=ordered 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
tmpfs /run/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=4144720k 0 0
/dev/md1 /boot ext2 rw,relatime,errors=continue 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
Backup:/Backup /Backup nfs
rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.51,mountvers=3,mountport=39597,mountproto=tcp,local_lock=none,addr=192.168.1.51
0 0
Backup:/var/www /var/www/backup nfs
rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.51,mountvers=3,mountport=39597,mountproto=tcp,local_lock=none,addr=192.168.1.51
0 0
/dev/md0 /RAID xfs
rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
major minor #blocks name

8 0 125034840 sda
8 1 96256 sda1
8 2 112305152 sda2
8 3 12632064 sda3
8 16 125034840 sdb
8 17 96256 sdb1
8 18 112305152 sdb2
8 19 12632064 sdb3
8 48 3907018584 sdd
8 32 3907018584 sdc
8 64 1465138584 sde
8 80 1465138584 sdf
8 96 1465138584 sdg
8 112 3907018584 sdh
8 128 3907018584 sdi
8 144 3907018584 sdj
8 160 3907018584 sdk
9 1 96192 md1
9 2 112239488 md2
9 3 12623744 md3
9 0 23441319936 md0
9 10 4395021312 md10

RAID-Server:/# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1] [raid0]
md10 : active raid0 sdf[0] sde[2] sdg[1]
4395021312 blocks super 1.2 512k chunks

md0 : active raid6 md10[12] sdc[13] sdk[10] sdj[11] sdi[15] sdh[8] sdd[9]
23441319936 blocks super 1.2 level 6, 1024k chunk, algorithm 2
[8/7] [UUU_UUUU]
bitmap: 29/30 pages [116KB], 65536KB chunk

md3 : active (auto-read-only) raid1 sda3[0] sdb3[1]
12623744 blocks super 1.2 [3/2] [UU_]
bitmap: 1/1 pages [4KB], 65536KB chunk

md2 : active raid1 sda2[0] sdb2[1]
112239488 blocks super 1.2 [3/2] [UU_]
bitmap: 1/1 pages [4KB], 65536KB chunk

md1 : active raid1 sda1[0] sdb1[1]
96192 blocks [3/2] [UU_]
bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

Six of the drives are 4T spindles (a mixture of makes and models). The
three drives comprising MD10 are WD 1.5T green drives. These are in
place to take over the function of one of the kicked 4T drives. Md1, 2,
and 3 are not data drives and are not suffering any issue.

I'm not sure what is meant by "write cache status" in this context.
The machine has been rebooted more than once during recovery and the FS
has been umounted and xfs_repair run several times.

I don't know for what the acronym BBWC stands.

RAID-Server:/# xfs_info /dev/md0
meta-data=/dev/md0 isize=256 agcount=43,
agsize=137356288 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=5860329984, imaxpct=5
= sunit=256 swidth=1536 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=8 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0

The system performs just fine, other than the aforementioned, with
loads in excess of 3Gbps. That is internal only. The LAN link is only
1Gbps, so no external request exceeds about 950Mbps.
Post by Dave Chinner
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
dmesg, in particular, should tell us what the corruption being
encountered is when stat fails.
RAID-Server:/# ls "/RAID/DVD/Big Sleep, The (1945)/VIDEO_TS/VTS_01_1.VOB"
ls: cannot access /RAID/DVD/Big Sleep, The (1945)/VIDEO_TS/VTS_01_1.VOB:
Structure needs cleaning
RAID-Server:/# dmesg | tail -n 30
...
[192173.363981] XFS (md0): corrupt dinode 41006, extent total = 1,
nblocks = 0.
[192173.363988] ffff8802338b8e00: 49 4e 81 b6 02 02 00 00 00 00 03 e8 00
00 03 e8 IN..............
[192173.363996] XFS (md0): Internal error xfs_iformat(1) at line 319 of
file /build/linux-eKuxrT/linux-3.2.60/fs/xfs/xfs_inode.c. Caller
0xffffffffa0509318
[192173.363999]
[192173.364062] Pid: 10813, comm: ls Not tainted 3.2.0-4-amd64 #1 Debian
3.2.60-1+deb7u3
[192173.364065] Call Trace:
[192173.364097] [<ffffffffa04d3731>] ? xfs_corruption_error+0x54/0x6f [xfs]
[192173.364134] [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364170] [<ffffffffa0508efa>] ? xfs_iformat+0xe3/0x462 [xfs]
[192173.364204] [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364240] [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364268] [<ffffffffa04d6ebe>] ? xfs_iget+0x37c/0x56c [xfs]
[192173.364300] [<ffffffffa04e13b4>] ? xfs_lookup+0xa4/0xd3 [xfs]
[192173.364328] [<ffffffffa04d9e5a>] ? xfs_vn_lookup+0x3f/0x7e [xfs]
[192173.364344] [<ffffffff81102de9>] ? d_alloc_and_lookup+0x3a/0x60
[192173.364357] [<ffffffff8110388d>] ? walk_component+0x219/0x406
[192173.364370] [<ffffffff81104721>] ? path_lookupat+0x7c/0x2bd
[192173.364383] [<ffffffff81036628>] ? should_resched+0x5/0x23
[192173.364396] [<ffffffff8134f144>] ? _cond_resched+0x7/0x1c
[192173.364408] [<ffffffff8110497e>] ? do_path_lookup+0x1c/0x87
[192173.364420] [<ffffffff81106407>] ? user_path_at_empty+0x47/0x7b
[192173.364434] [<ffffffff813533d8>] ? do_page_fault+0x30a/0x345
[192173.364448] [<ffffffff810d6a04>] ? mmap_region+0x353/0x44a
[192173.364460] [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
[192173.364471] [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
[192173.364483] [<ffffffff813509f5>] ? page_fault+0x25/0x30
[192173.364495] [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
[192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair

That last line, by the way, is why I ran umount and xfs_repair.
Sean Caron
2014-09-10 01:25:37 UTC
Hi Leslie,

You really don't want to be running "green" anything in an array... that is
a ticking time bomb just waiting to go off... let me tell you... At my
installation, a predecessor had procured a large number of green drives
because they were very inexpensive and regrets were had by all. Lousy
performance, lots of spurious ejection/RAID gremlins and the failure rate
on the WDC Greens is just appalling...

BBWC stands for Battery Backed Write Cache; this is a feature of hardware
RAID cards; it is just like it says on the tin; a bit (usually half a gig,
or a gig, or two...) of nonvolatile cache that retains writes to the array
in case of power failure, etc. If you have BBWC enabled but your battery is
dead, bad things can happen. Not applicable for JBOD software RAID.

I hold firm to my beliefs on xfs_repair :) As I say, you'll see a variety
of opinions here.

Best,

Sean
Leslie Rhorer
2014-09-10 01:43:08 UTC
Post by Sean Caron
Hi Leslie,
You really don't want to be running "green" anything in an array... that
is a ticking time bomb just waiting to go off... let me tell you... At
my installation, a predecessor had procured a large number of green
drives because they were very inexpensive and regrets were had by all.
The alternative is nothing at all. I am not a company, just a guy with
a couple of arrays at his house. 'Not a rich guy, either.

I've had these arrays since 2001 with only one other mass drive
failure, and that was not unrecoverable, nor were they "green" drives.
(Four Seagate drives all suddenly decided they did not want to be part
of the array, so md kicked all four simultaneously. After that, they
would not stay up as part of the array long enough to be mounted. I was
able to read all four with dd_rescue, and get the array back online
without a single lost file.)

Note also these arrays are not usually under any sort of massive load.
The bulk of the data is video files which are written once at about
80MBps and then read one-by-one at about 4MBps.
Post by Sean Caron
Lousy performance, lots of spurious ejection/RAID gremlins and the
failure rate on the WDC Greens is just appalling...
None of the failed drives were WD green. All three and the previous
four were Seagate. I realize that is not a large statistical sample.
Post by Sean Caron
BBWC stands for Battery Backed Write Cache; this is a feature of
hardware RAID cards
Ah, yes. This array does not have a BBWC controller. The backup array
does, actually, but the battery backup is disabled.
Post by Sean Caron
it is just like it says on the tin; a bit (usually
half a gig, or a gig, or two...) of nonvolatile cache that retains
writes to the array in case of power failure, etc. If you have BBWC
enabled but your battery is dead, bad things can happen. Not applicable
for JBOD software RAID.
Exactly. All the arrays are JBOD / mdadm.
Emmanuel Florac
2014-09-10 14:31:24 UTC
Le Tue, 09 Sep 2014 20:43:08 -0500
Post by Leslie Rhorer
None of the failed drives were WD green. All three and the
previous four were Seagate. I realize that is not a large
statistical sample.
If you're interested in large statistical samples, on a grand total of
4000 1 TB Seagate Barracuda ES2, I had to replace 2100 of them over the
course of 3 years. I still have a couple of hundred of these
unfortunate pieces of crap in service, and they still represent the
vast majority of unexpected RAID malfunctions, urgent replacements,
late night calls and other "interesting side activities".

I wouldn't buy anything labeled Seagate nowadays. Their drives have
been the baddest train wreck since the dreaded 9 GB Micropolis back in
1994 (or was it 1995?).
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <***@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Grozdan
2014-09-10 14:52:27 UTC
Post by Emmanuel Florac
Le Tue, 09 Sep 2014 20:43:08 -0500
Post by Leslie Rhorer
None of the failed drives were WD green. All three and the
previous four were Seagate. I realize that is not a large
statistical sample.
If you're interested in large statistical samples, on a grand total of
4000 1 TB Seagate Barracuda ES2, I had to replace 2100 of them over the
course of 3 years. I still have a couple of hundred of these
unfortunate pieces of crap in service, and they still represent the
vast majority of unexpected RAID malfunctions, urgent replacements,
late night calls and other "interesting side activities".
I wouldn't buy anything labeled Seagate nowadays. Their drives have
been the baddest train wreck since the dreaded 9 GB Micropolis back in
1994 (or was it 1995?).
Funny, because our servers (105 of them) have all run on Seagate drives
for a few years now and I have yet to see one fail or cause other problems.
But then again, we use Constellation disks, not Barracudas. At home I
also use both Barracudas and Constellation ones and also have yet to
see a problem with them.
Post by Emmanuel Florac
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| +33 1 78 94 84 02
------------------------------------------------------------------------
--
Yours truly

Emmanuel Florac
2014-09-10 15:12:08 UTC
Le Wed, 10 Sep 2014 16:52:27 +0200
Post by Grozdan
Funny, because our servers (105 of them) have all run on Seagate drives
for a few years now and I have yet to see one fail or cause other problems.
But then again, we use Constellation disks, not Barracudas. At home I
also use both Barracudas and Constellation ones and also have yet to
see a problem with them.
Yes, we replaced most failed Barracudas with Constellations at a
later stage (because the "certified repaired" Barracudas aren't any
better... ) and these work fine so far. However, why would I give
Seagate my hard-earned money after they cost me so dearly for years? :)
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <***@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Grozdan
2014-09-10 15:32:17 UTC
Post by Emmanuel Florac
Le Wed, 10 Sep 2014 16:52:27 +0200
Post by Grozdan
Funny, because our servers (105 of them) have all run on Seagate drives
for a few years now and I have yet to see one fail or cause other problems.
But then again, we use Constellation disks, not Barracudas. At home I
also use both Barracudas and Constellation ones and also have yet to
see a problem with them.
Yes, we replaced most failed Barracudas with Constellations at a
later stage (because the "certified repaired" Barracudas aren't any
better... ) and these work fine so far. However, why would I give
Seagate my hard-earned money after they cost me so dearly for years? :)
Oh, you are correct about the money. If it had happened to us, I'd also
think twice about that too. The biggest problems we have had thus far were
with Samsung disks. I haven't seen such a high failure rate in all my
life. About 70% of the 100 disks we got failed within a year. Too bad
Seagate took them over. I can only hope that Seagate's manufacturing
and QA don't suffer because of that.
Post by Emmanuel Florac
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| +33 1 78 94 84 02
------------------------------------------------------------------------
--
Yours truly
Sean Caron
2014-09-10 14:54:33 UTC
I am probably overseeing a similar number (3-4000) of Hitachi A7K3000s,
A7K2000s and WDC RE4s and I probably see a few failures a month. When we
are building a new machine and we get a fresh shipment in, maybe 10%
failure rate right out of the box. Those that survive the burn-in usually
do pretty good. Man, you have my sympathy with that failure rate in excess
of 50%... even the WDC Greens weren't THAT bad (although it probably got
close, as we got closer and closer to EOLing them... and they had been
moved to third-tier "backup storage" status by that point). Thankfully they
are gone now, LOL.

You're right, esp. in large installations, it's critical to do your
homework on drives, pick a good candidate, validate it and then run with
them. Even with the good ones, you've gotta keep a watchful eye... "when
you buy them in bulk, they fail in bulk".

Best,

Sean
Post by Emmanuel Florac
Le Tue, 09 Sep 2014 20:43:08 -0500
Post by Leslie Rhorer
None of the failed drives were WD green. All three and the
previous four were Seagate. I realize that is not a large
statistical sample.
If you're interested in large statistical samples, on a grand total of
4000 1 TB Seagate Barracuda ES2, I had to replace 2100 of them over the
course of 3 years. I still have a couple of hundred of these
unfortunate pieces of crap in service, and they still represent the
vast majority of unexpected RAID malfunctions, urgent replacements,
late night calls and other "interesting side activities".
I wouldn't buy anything labeled Seagate nowadays. Their drives have
been the baddest train wreck since the dreaded 9 GB Micropolis back in
1994 (or was it 1995?).
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| +33 1 78 94 84 02
------------------------------------------------------------------------
Leslie Rhorer
2014-09-10 23:18:38 UTC
Post by Emmanuel Florac
Le Tue, 09 Sep 2014 20:43:08 -0500
Post by Leslie Rhorer
None of the failed drives were WD green. All three and the
previous four were Seagate. I realize that is not a large
statistical sample.
If you're interested in large statistical samples, on a grand total of
4000 1 TB Seagate Barracuda ES2, I had to replace 2100 of them over the
course of 3 years.
That's a good sized statistical sample. Oddly enough, perhaps, the
ones that failed on me were also 1T Barracuda drives, and my failure
rate was 40%.
Greg Freemyer
2014-09-11 13:24:04 UTC
On Wed, Sep 10, 2014 at 10:31 AM, Emmanuel Florac
Post by Emmanuel Florac
Le Tue, 09 Sep 2014 20:43:08 -0500
Post by Leslie Rhorer
None of the failed drives were WD green. All three and the
previous four were Seagate. I realize that is not a large
statistical sample.
If you're interested in large statistical samples, on a grand total of
4000 1 TB Seagate Barracuda ES2, I had to replace 2100 of them over the
course of 3 years. I still have a couple of hundred of these
unfortunate pieces of crap in service, and they still represent the
vast majority of unexpected RAID malfunctions, urgent replacements,
late night calls and other "interesting side activities".
I wouldn't buy anything labeled Seagate nowadays. Their drives have
been the baddest train wreck since the dreaded 9 GB Micropolis back in
1994 (or was it 1995?).
I buy about 100 drives a year, but I don't work them very hard. Just
lots of data to store and I need to keep my data sets segregated for
legal reasons. I don't use raid, just lots of individual disks and
most data maintained redundantly.

About 4 years ago (or maybe 5), Seagate had a catastrophic drive
situation. I can remember buying a batch of 10 drives and having 8 of
them fail in the first 2 months. The bad part was they mostly
survived a 10 hour burn-in, so they tended to fail with real data on
them. I had one case (at a minimum) that summer where I put the data
on 3 different Seagate drives and all 3 failed.

Fortunately, I was able to swap the disk controller card from one of
the working drives with one of the dead drives and recover the data.

Regardless, ignoring the summer of discontent, I find Seagate to be my
preferred drives.

FYI: In June I bought 30 or so WD Elements drives to try them out.
These are not the green drives, just bare-bones WD drives. None of
them were DOA, but 3 failed within 4 weeks, so a 10% failure rate in
the first month. Only one of them had unique data on it, so I had to
recreate that data. Fortunately the source of the data was still
available. All of those drives have been pulled out of routine
service.

Greg
Emmanuel Florac
2014-09-12 07:06:42 UTC
Post by Greg Freemyer
Regardless, ignoring the summer of discontent, I find Seagate to be my
preferred drives.
Nowadays I only buy HGST drives. The 3 TB aren't as reliable as the
1, 2, 4 and 6 TB, but generally speaking the failure rate is extremely
low (on the order of a few failures a year among several thousand units).
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <***@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Dave Chinner
2014-09-10 01:53:31 UTC
Post by Leslie Rhorer
Post by Dave Chinner
Firstly, more information is required, namely versions and actual
RAID-Server:/# xfs_repair -V
xfs_repair version 3.1.7
RAID-Server:/# uname -r
3.2.0-4-amd64
Ok, so a relatively old xfs_repair. That's important - read on....
Post by Leslie Rhorer
4.0 GHz FX-8350 eight core processor
RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
MemTotal: 8099916 kB
....
Post by Leslie Rhorer
/dev/md0 /RAID xfs
rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
FWIW, you don't need sunit=2048,swidth=12288 in the mount options -
they are stored on disk and the mount options are only necessary to
change the on-disk values.
Post by Leslie Rhorer
Personalities : [raid6] [raid5] [raid4] [raid1] [raid0]
md10 : active raid0 sdf[0] sde[2] sdg[1]
4395021312 blocks super 1.2 512k chunks
md0 : active raid6 md10[12] sdc[13] sdk[10] sdj[11] sdi[15] sdh[8] sdd[9]
23441319936 blocks super 1.2 level 6, 1024k chunk, algorithm 2
[8/7] [UUU_UUUU]
bitmap: 29/30 pages [116KB], 65536KB chunk
md3 : active (auto-read-only) raid1 sda3[0] sdb3[1]
12623744 blocks super 1.2 [3/2] [UU_]
bitmap: 1/1 pages [4KB], 65536KB chunk
md2 : active raid1 sda2[0] sdb2[1]
112239488 blocks super 1.2 [3/2] [UU_]
bitmap: 1/1 pages [4KB], 65536KB chunk
md1 : active raid1 sda1[0] sdb1[1]
96192 blocks [3/2] [UU_]
bitmap: 1/1 pages [4KB], 65536KB chunk
unused devices: <none>
Six of the drives are 4T spindles (a mixture of makes and models).
The three drives comprising MD10 are WD 1.5T green drives. These
are in place to take over the function of one of the kicked 4T
drives. Md1, 2, and 3 are not data drives and are not suffering any
issue.
Ok, that's creative. But when you need another drive in the array
and you don't have the right spares.... ;)
Post by Leslie Rhorer
I'm not sure what is meant by "write cache status" in this context.
The machine has been rebooted more than once during recovery and the
FS has been umounted and xfs_repair run several times.
Start here and read the next few entries:

http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F
Post by Leslie Rhorer
I don't know for what the acronym BBWC stands.
"battery backed write cache". If you're not using a hardware RAID
controller, it's unlikely you have one. The difference between a
drive write cache and a BBWC is that the BBWC is non-volatile - it
does not get lost when power drops.
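
For plain SATA members of an md array, the volatile on-drive write cache can at least be checked (and disabled, trading some write speed for safety) with hdparm; /dev/sdc here is just one of the md0 member drives listed above:

hdparm -W /dev/sdc    (query the current write-cache setting)
hdparm -W0 /dev/sdc   (turn the drive's volatile write cache off)
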
Post by Leslie Rhorer
RAID-Server:/# xfs_info /dev/md0
meta-data=/dev/md0 isize=256 agcount=43,
agsize=137356288 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=5860329984, imaxpct=5
= sunit=256 swidth=1536 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=8 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Ok, that all looks pretty good, and the sunit/swidth match the mount
options you set so you definitely don't need the mount options...
Post by Leslie Rhorer
The system performs just fine, other than the aforementioned, with
loads in excess of 3Gbps. That is internal only. The LAN link is
ony 1Gbps, so no external request exceeds about 950Mbps.
Post by Dave Chinner
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
dmesg, in particular, should tell us what the corruption being
encountered is when stat fails.
RAID-Server:/# ls "/RAID/DVD/Big Sleep, The (1945)/VIDEO_TS/VTS_01_1.VOB"
ls: cannot access /RAID/DVD/Big Sleep, The
(1945)/VIDEO_TS/VTS_01_1.VOB: Structure needs cleaning
RAID-Server:/# dmesg | tail -n 30
...
[192173.363981] XFS (md0): corrupt dinode 41006, extent total = 1,
nblocks = 0.
[192173.363988] ffff8802338b8e00: 49 4e 81 b6 02 02 00 00 00 00 03
e8 00 00 03 e8 IN..............
[192173.363996] XFS (md0): Internal error xfs_iformat(1) at line 319
of file /build/linux-eKuxrT/linux-3.2.60/fs/xfs/xfs_inode.c. Caller
0xffffffffa0509318
[192173.363999]
[192173.364062] Pid: 10813, comm: ls Not tainted 3.2.0-4-amd64 #1
Debian 3.2.60-1+deb7u3
[192173.364097] [<ffffffffa04d3731>] ? xfs_corruption_error+0x54/0x6f [xfs]
[192173.364134] [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364170] [<ffffffffa0508efa>] ? xfs_iformat+0xe3/0x462 [xfs]
[192173.364204] [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364240] [<ffffffffa0509318>] ? xfs_iread+0x9f/0x177 [xfs]
[192173.364268] [<ffffffffa04d6ebe>] ? xfs_iget+0x37c/0x56c [xfs]
[192173.364300] [<ffffffffa04e13b4>] ? xfs_lookup+0xa4/0xd3 [xfs]
[192173.364328] [<ffffffffa04d9e5a>] ? xfs_vn_lookup+0x3f/0x7e [xfs]
[192173.364344] [<ffffffff81102de9>] ? d_alloc_and_lookup+0x3a/0x60
[192173.364357] [<ffffffff8110388d>] ? walk_component+0x219/0x406
[192173.364370] [<ffffffff81104721>] ? path_lookupat+0x7c/0x2bd
[192173.364383] [<ffffffff81036628>] ? should_resched+0x5/0x23
[192173.364396] [<ffffffff8134f144>] ? _cond_resched+0x7/0x1c
[192173.364408] [<ffffffff8110497e>] ? do_path_lookup+0x1c/0x87
[192173.364420] [<ffffffff81106407>] ? user_path_at_empty+0x47/0x7b
[192173.364434] [<ffffffff813533d8>] ? do_page_fault+0x30a/0x345
[192173.364448] [<ffffffff810d6a04>] ? mmap_region+0x353/0x44a
[192173.364460] [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
[192173.364471] [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
[192173.364483] [<ffffffff813509f5>] ? page_fault+0x25/0x30
[192173.364495] [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
[192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair
That last line, by the way, is why I ran umount and xfs_repair.
Right, that's the correct thing to do, but sometimes there are
issues that repair doesn't handle properly. This *was* one of them,
and it was fixed by commit e1f43b4 ("repair: update extent count
after zapping duplicate blocks") which was added to xfs_repair
v3.1.8.

IOWs, upgrading xfsprogs to the latest release and re-running
xfs_repair should fix this error.
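
So, roughly, the sequence on the newer tools would be: confirm the version, do a dry run, then repair for real (same device as above):

xfs_repair -V               (should now report 3.1.8 or later)
umount /RAID
xfs_repair -n -v /dev/md0   (trial run, no modifications)
xfs_repair -v /dev/md0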

Cheers,

Dave.
--
Dave Chinner
***@fromorbit.com
Leslie Rhorer
2014-09-10 03:10:45 UTC
Post by Dave Chinner
Post by Leslie Rhorer
Post by Dave Chinner
Firstly, more information is required, namely versions and actual
RAID-Server:/# xfs_repair -V
xfs_repair version 3.1.7
RAID-Server:/# uname -r
3.2.0-4-amd64
Ok, so a relatively old xfs_repair. That's important - read on....
OK, a good reason is a good reason.
Post by Dave Chinner
Post by Leslie Rhorer
4.0 GHz FX-8350 eight core processor
RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
MemTotal: 8099916 kB
....
Post by Leslie Rhorer
/dev/md0 /RAID xfs
rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
FWIW, you don't need sunit=2048,swidth=12288 in the mount options -
they are stored on disk and the mount options are only necessary to
change the on-disk values.
They aren't. Those were created automatically, whether at creation
time or at mount time, I don't know, but the filesystem was created with

mkfs.xfs /dev/md0

and fstab contains:

/dev/md0 /RAID xfs rw 1 2
Post by Dave Chinner
Post by Leslie Rhorer
Six of the drives are 4T spindles (a mixture of makes and models).
The three drives comprising MD10 are WD 1.5T green drives. These
are in place to take over the function of one of the kicked 4T
drives. Md1, 2, and 3 are not data drives and are not suffering any
issue.
Ok, that's creative. But when you need another drive in the array
and you don't have the right spares.... ;)
Yes, but I wasn't really expecting to need 3 spares this soon or
suddenly. These are fairly new drives, and with 33% of the array devoted
to parity, a sudden need for three spares is not very likely.
That, plus I have quite a few 1.5 and 1.0T drives lying around in case
of sudden emergency. This isn't the first time I've replaced a single
drive temporarily with a RAID0. The performance is actually better, of
course, and for the 3 or 4 days it takes to get a new drive, it's really
not an issue. Since I have a full online backup system plus a regularly
updated off-site backup, the risk is quite minimal. This is an exercise
in mild inconvenience, not an emergency failure. If this were a
commercial system, it would be another matter, but I know for a fact
there are a very large number of home NAS solutions in place that are
less robust than this one. I personally know quite a few people who
never do backups, at all.
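For reference, a minimal sketch of that stand-in arrangement (the
/dev/sdx.. names are hypothetical; md10 and md0 are as described above):

mdadm --create /dev/md10 --level=0 --raid-devices=3 /dev/sdx /dev/sdy /dev/sdz
mdadm /dev/md0 --add /dev/md10    # array rebuilds onto the composite device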
Post by Dave Chinner
Post by Leslie Rhorer
I'm not sure what is meant by "write cache status" in this context.
The machine has been rebooted more than once during recovery and the
FS has been umounted and xfs_repair run several times.
http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F
I knew that, but I still don't see the relevance in this context.
There is no battery backup on the drive controller or the drives, and
the drives have all been powered down and back up several times.
Anything in any cache right now would be from some operation in the last
few minutes, not four days ago.
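For completeness, a quick way to report (or disable) a SATA drive's
volatile write cache, assuming a device name like /dev/sda:

hdparm -W /dev/sda     # report the current write-caching setting
hdparm -W0 /dev/sda    # disable the drive's volatile write cache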
Post by Dave Chinner
Post by Leslie Rhorer
I don't know for what the acronym BBWC stands.
"battery backed write cache". If you're not using a hardware RAID
controller, it's unlikely you have one.
See my previous. I do have one (a 3Ware 9650E, given to me by a friend
when his company switched to zfs for their server). It's not on this
system. This array is on a HighPoint RocketRAID 2722.
Post by Dave Chinner
The difference between a
drive write cache and a BBWC is that the BBWC is non-volatile - it
does not get lost when power drops.
Yeah, I'm aware, thanks. I just didn't cotton to the acronym.
Post by Dave Chinner
Post by Leslie Rhorer
RAID-Server:/# xfs_info /dev/md0
meta-data=/dev/md0               isize=256    agcount=43, agsize=137356288 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=5860329984, imaxpct=5
         =                       sunit=256    swidth=1536 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Ok, that all looks pretty good, and the sunit/swidth match the mount
options you set so you definitely don't need the mount options...
Yeah, I didn't set them. What did, I don't really know for certain.
See above.
Post by Dave Chinner
Post by Leslie Rhorer
[192173.364460] [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
[192173.364471] [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
[192173.364483] [<ffffffff813509f5>] ? page_fault+0x25/0x30
[192173.364495] [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
[192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair
That last line, by the way, is why I ran umount and xfs_repair.
Right, that's the correct thing to do, but sometimes there are
issues that repair doesn't handle properly. This *was* one of them,
and it was fixed by commit e1f43b4 ("repair: update extent count
after zapping duplicate blocks") which was added to xfs_repair
v3.1.8.
IOWs, upgrading xfsprogs to the latest release and re-running
xfs_repair should fix this error.
OK. I'll scarf the source and compile. All I need is to git clone
git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs, right?

I've never used git on a package maintained in my distro. Will I have
issues when I upgrade to Debian Jessie in a few months, since this is
not being managed by apt / dpkg? It looks like Jessie has 3.2.1 of
xfs-progs.
Dave Chinner
2014-09-10 03:33:31 UTC
Permalink
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
Post by Dave Chinner
Firstly, more information is required, namely versions and actual
RAID-Server:/# xfs_repair -V
xfs_repair version 3.1.7
RAID-Server:/# uname -r
3.2.0-4-amd64
Ok, so a relatively old xfs_repair. That's important - read on....
OK, a good reason is a good reason.
Post by Dave Chinner
Post by Leslie Rhorer
4.0 GHz FX-8350 eight core processor
RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
MemTotal: 8099916 kB
....
Post by Leslie Rhorer
/dev/md0 /RAID xfs
rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
FWIW, you don't need sunit=2048,swidth=12288 in the mount options -
they are stored on disk and the mount options are only necessary to
change the on-disk values.
They aren't. Those were created automatically, whether at creation
time or at mount time, I don't know, but the filesystem was created with
Ah, my mistake. Normally it's only mount options in that code - I
forgot that we report sunit/swidth unconditionally if it is set in
the superblock.
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
I'm not sure what is meant by "write cache status" in this context.
The machine has been rebooted more than once during recovery and the
FS has been umounted and xfs_repair run several times.
http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F
I knew that, but I still don't see the relevance in this context.
There is no battery backup on the drive controller or the drives,
and the drives have all been powered down and back up several times.
Anything in any cache right now would be from some operation in the
last few minutes, not four days ago.
There is no direct relevance to your situation, but for a lot of
other common problems it definitely is. That's why we ask people to
report it with all the other information about their system
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
I don't know for what the acronym BBWC stands.
"battery backed write cache". If you're not using a hardware RAID
controller, it's unlikely you have one.
See my previous. I do have one (a 3Ware 9650E, given to me by a
friend when his company switched to zfs for their server). It's not
on this system. This array is on a HighPoint RocketRAID 2722.
Ok. We have seen over time that those 3ware controllers can do
strange things in error conditions - we've had reports of entire
hardware luns dying and being completely unrecoverable after a
disk was kicked out due to an error. I can't comment on the
highpoint controller - either not many people use them or they just
don't report problems if they do. Either way, if you aren't running
the latest firmware, I'd suggest updating it, as these problems were
typically fixed by newer firmware releases.
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
[192173.364460] [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
[192173.364471] [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
[192173.364483] [<ffffffff813509f5>] ? page_fault+0x25/0x30
[192173.364495] [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
[192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair
That last line, by the way, is why I ran umount and xfs_repair.
Right, that's the correct thing to do, but sometimes there are
issues that repair doesn't handle properly. This *was* one of them,
and it was fixed by commit e1f43b4 ("repair: update extent count
after zapping duplicate blocks") which was added to xfs_repair
v3.1.8.
IOWs, upgrading xfsprogs to the latest release and re-running
xfs_repair should fix this error.
OK. I'll scarf the source and compile. All I need is to git clone
git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs, right?
Just clone git://oss.sgi.com/xfs/cmds/xfsprogs and check out the
v3.2.1 tag and build that.
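A minimal sketch of those steps, assuming the usual build dependencies
(gcc, make, autoconf, libtool, uuid development headers) are installed:

git clone git://oss.sgi.com/xfs/cmds/xfsprogs
cd xfsprogs
git checkout v3.2.1
make    # the top-level Makefile should run configure itself; if not, run ./configure first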
Post by Leslie Rhorer
I've never used git on a package maintained in my distro. Will I
have issues when I upgrade to Debian Jessie in a few months, since
this is not being managed by apt / dpkg? It looks like Jessie has
3.2.1 of xfs-progs.
If you're using debian you can build debian packages directly from
the git tree via "make deb" (I use it all the time for pushing
new builds to my test machines) and so when you upgrade to Jessie it
should just replace your custom built package correctly...
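A minimal sketch, assuming dpkg-dev and debhelper are installed (the
resulting .deb typically lands in the parent directory):

cd xfsprogs
make deb
dpkg -i ../xfsprogs_*.deb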

Cheers,

Dave.
--
Dave Chinner
***@fromorbit.com
Leslie Rhorer
2014-09-10 04:14:48 UTC
Permalink
Post by Dave Chinner
There is no direct relevance to your situation, but for a lot of
other common problems it definitely is. That's why we ask people to
report it with all the other information about their system
Yeah, understood.
Post by Dave Chinner
Ok. We have seen over time that those 3ware controllers can do
strange things in error conditions - we've had reports of entire
hardware luns dying and being completely unrecoverable after a
disk was kicked out due to an error.
Oof. That's not good. It's stable right now. I'm considering a
different controller at some point. I may accelerate that process.
Post by Dave Chinner
I can't comment on the
highpoint controller - either not many people use them or they just
don't report problems if they do. Either way, if you aren't running
the latest firmware, I'd suggest updating it, as these problems were
typically fixed by newer firmware releases.
As a matter of fact, I was going to do just that. I have to reboot the
system into DOS (of all things), since they don't provide a Linux loader.
I've got to arrange a convenient time.
Post by Dave Chinner
Post by Leslie Rhorer
OK. I'll scarf the source and compile. All I need is to git clone
git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs, right?
Just clone git://oss.sgi.com/xfs/cmds/xfsprogs and check out the
v3.2.1 tag and build that..
OK, I'm doing something wrong, I think. It's been over a decade since
I compiled a kernel. It makes me a little nervous.
Post by Dave Chinner
Post by Leslie Rhorer
I've never used git on a package maintained in my distro. Will I
have issues when I upgrade to Debian Jessie in a few months, since
this is not being managed by apt / dpkg? It looks like Jessie has
3.2.1 of xfs-progs.
If you're using debian you can build debian packages directly from
the git tree via "make deb" (I use it all the time for pushing
Um, is that make deb-pkg, perhaps? I'm not seeing a "deb" in the
package targets.
Post by Dave Chinner
new builds to my test machines) and so when you upgrade to Jessie it
should just replace your custom built package correctly...
`make deb` finds no install target. If I run `make menuconfig` it
complains about there being no ncurses. libncurses5 is installed, and I
don't know what else I should get. `make oldconfig` seems to work. Am
I headed in the right direction? There are quite a few configuration
targets, and I am not sure which one to choose. There are also a number
of questions asked by the oldconfig target (and presumably the same for
other config targets), and I'm unsure how to answer. I definitely don't
want to make an error and potentially wind up with an unbootable system.
Leslie Rhorer
2014-09-10 04:22:03 UTC
Permalink
Post by Dave Chinner
Post by Leslie Rhorer
OK. I'll scarf the source and compile. All I need is to git clone
git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs, right?
Just clone git://oss.sgi.com/xfs/cmds/xfsprogs and check out the
v3.2.1 tag and build that..
Oops! Hold on. I didn't read that closely enough. You were saying I
only need to compile xfs-progs. That's working.
Emmanuel Florac
2014-09-10 14:34:38 UTC
Permalink
On Tue, 09 Sep 2014 23:22:03 -0500
Post by Leslie Rhorer
Oops! Hold on. I didn't read that closely enough. You were
saying I only need to compile xfs-progs. That's working.
You don't need to install the resulting binaries either. xfs_repair
will happily run from the source directory, ./xfs_repair /dev/blah ...
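For example (a sketch; the freshly built binary lives under the repair/
subdirectory of the xfsprogs source tree, and /dev/md0 is the device
from earlier in the thread):

cd xfsprogs/repair
./xfs_repair -n /dev/md0    # no-modify check
./xfs_repair /dev/md0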
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <***@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Leslie Rhorer
2014-09-10 04:51:42 UTC
Permalink
Post by Dave Chinner
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
Post by Dave Chinner
Firstly, more information is required, namely versions and actual
RAID-Server:/# xfs_repair -V
xfs_repair version 3.1.7
RAID-Server:/# uname -r
3.2.0-4-amd64
Ok, so a relatively old xfs_repair. That's important - read on....
OK, a good reason is a good reason.
Post by Dave Chinner
Post by Leslie Rhorer
4.0 GHz FX-8350 eight core processor
RAID-Server:/# cat /proc/meminfo /proc/mounts /proc/partitions
MemTotal: 8099916 kB
....
Post by Leslie Rhorer
/dev/md0 /RAID xfs
rw,relatime,attr2,delaylog,sunit=2048,swidth=12288,noquota 0 0
FWIW, you don't need sunit=2048,swidth=12288 in the mount options -
they are stored on disk and the mount options are only necessary to
change the on-disk values.
They aren't. Those were created automatically, whether at creation
time or at mount time, I don't know, but the filesystem was created with
Ah, my mistake. Normally it's only mount options in that code - I
forgot that we report sunit/swidth unconditionally if it is set in
the superblock.
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
I'm not sure what is meant by "write cache status" in this context.
The machine has been rebooted more than once during recovery and the
FS has been umounted and xfs_repair run several times.
http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F
I knew that, but I still don't see the relevance in this context.
There is no battery backup on the drive controller or the drives,
and the drives have all been powered down and back up several times.
Anything in any cache right now would be from some operation in the
last few minutes, not four days ago.
There is no direct relevance to your situation, but for a lot of
other common problems it definitely is. That's why we ask people to
report it with all the other information about their system
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
I don't know for what the acronym BBWC stands.
"battery backed write cache". If you're not using a hardware RAID
controller, it's unlikely you have one.
See my previous. I do have one (a 3Ware 9650E, given to me by a
friend when his company switched to zfs for their server). It's not
on this system. This array is on a HighPoint RocketRAID 2722.
Ok. We have seen over time that those 3ware controllers can do
strange things in error conditions - we've had reports of entire
hardware luns dying and being completely unrecoverable after a
disk was kicked out due to an error. I can't comment on the
highpoint controller - either not many people use them or they just
don't report problems if they do. Either way, if you aren't running
the latest firmware, I'd suggest updating it, as these problems were
typically fixed by newer firmware releases.
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
[192173.364460] [<ffffffff810fe45a>] ? vfs_fstatat+0x32/0x60
[192173.364471] [<ffffffff810fe590>] ? sys_newstat+0x12/0x2b
[192173.364483] [<ffffffff813509f5>] ? page_fault+0x25/0x30
[192173.364495] [<ffffffff81355452>] ? system_call_fastpath+0x16/0x1b
[192173.364503] XFS (md0): Corruption detected. Unmount and run xfs_repair
That last line, by the way, is why I ran umount and xfs_repair.
Right, that's the correct thing to do, but sometimes there are
issues that repair doesn't handle properly. This *was* one of them,
and it was fixed by commit e1f43b4 ("repair: update extent count
after zapping duplicate blocks") which was added to xfs_repair
v3.1.8.
IOWs, upgrading xfsprogs to the latest release and re-running
xfs_repair should fix this error.
OK. I'll scarf the source and compile. All I need is to git clone
git://oss.sgi.com/xfs/xfs and git://oss.sgi.com/xfs/cmds/xfsprogs, right?
Just clone git://oss.sgi.com/xfs/cmds/xfsprogs and check out the
v3.2.1 tag and build that..
Post by Leslie Rhorer
I've never used git on a package maintained in my distro. Will I
have issues when I upgrade to Debian Jessie in a few months, since
this is not being managed by apt / dpkg? It looks like Jessie has
3.2.1 of xfs-progs.
If you're using debian you can build debian packages directly from
the git tree via "make deb" (I use it all the time for pushing
new builds to my test machines) and so when you upgrade to Jessie it
should just replace your custom built package correctly...
Cheers,
Dave.
Thanks a ton, Dave (and everyone else who helped). That seems to have
worked just fine. The three grunged entries are gone and the system is
happily copying over the backups. Now I'll run another rsync with
checksum to make sure everything is good before putting the backup into
production. I'm also going to upgrade the controller BIOS just in case.
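A minimal sketch of such a checksum pass (paths are hypothetical; -n keeps
it a dry run so nothing is copied, -c forces checksum comparison):

rsync -rvcn /backup/RAID/ /RAID/    # lists files that differ or are missing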
Dave Chinner
2014-09-10 05:23:03 UTC
Permalink
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
I've never used git on a package maintained in my distro. Will I
have issues when I upgrade to Debian Jessie in a few months, since
this is not being managed by apt / dpkg? It looks like Jessie has
3.2.1 of xfs-progs.
If you're using debian you can build debian packages directly from
the git tree via "make deb" (I use it all the time for pushing
new builds to my test machines) and so when you upgrade to Jessie it
should just replace your custom built package correctly...
Thanks a ton, Dave (and everyone else who helped). That seems to
have worked just fine. The three grunged entries are gone and the
system is happily copying over the backups. Now I'll run another
rsync with checksum to make sure everything is good before putting
the backup into production. I'm also going to upgrade the
controller BIOS just in case.
Good to hear. Hopefully everything will check out. Just yell if you
need more help. ;)

Cheers,

Dave.
--
Dave Chinner
***@fromorbit.com
Leslie Rhorer
2014-09-11 05:47:31 UTC
Permalink
Post by Dave Chinner
Post by Leslie Rhorer
Post by Dave Chinner
Post by Leslie Rhorer
I've never used git on a package maintained in my distro. Will I
have issues when I upgrade to Debian Jessie in a few months, since
this is not being managed by apt / dpkg? It looks like Jessie has
3.2.1 of xfs-progs.
If you're using debian you can build debian packages directly from
the git tree via "make deb" (I use it all the time for pushing
new builds to my test machines) and so when you upgrade to Jessie it
should just replace your custom built package correctly...
Thanks a ton, Dave (and everyone else who helped). That seems to
have worked just fine. The three grunged entries are gone and the
system is happily copying over the backups. Now I'll run another
rsync with checksum to make sure everything is good before putting
the backup into production. I'm also going to upgrade the
controller BIOS just in case.
Good to hear. Hopefully everything will check out. Just yell if you
need more help. ;)
Thanks. The rsync compare just finished on the non-volatile areas of
the file system without a single mismatch and with no missing files.
That's good enough for me.