homegrown NAS rebuild

my NAS OS drive spit the bit and i rebuilt the OS on a new disk, and now i need to get all the data disks online. the data disks are a software RAID 5 (mdadm) with LVM on top, and separate mountpoints for each of the LVs. i have the array online as /dev/md0, and the PV and VG are visible. i now need to find the LVs and get them recognized and mounted. it has been a while since i did all of this and i am a bit rusty. also, my notes on how to do all of this are on one of the LVs. does anyone have some quick tips on how to discover the LVs within the VG and get them mounted? i am pretty close, but don't have these last few pieces put together.
 
using the short-form commands, for brevity. the PV and VG are there, but the LVs are not showing up. scanning, and deactivating/exporting/importing/reactivating the VG, all did nothing.
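for reference, this is roughly the sequence i tried (standard LVM commands, nothing exotic):
Code:
# rescan block devices for LVM signatures and metadata
pvscan --cache
vgscan
lvscan

# deactivate, export, re-import and re-activate the export VG
vgchange -an vg_nas_export
vgexport vg_nas_export
vgimport vg_nas_export
vgchange -ay vg_nas_export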

pvs:
Code:
  PV         VG            Fmt  Attr PSize    PFree   
  /dev/md0   vg_nas_export lvm2 a--    <8.19t   <8.19t
  /dev/sde4  vg_nas        lvm2 a--  <463.76g <436.76g
vgs:
Code:
  VG            #PV #LV #SN Attr   VSize    VFree   
  vg_nas          1   6   0 wz--n- <463.76g <436.76g
  vg_nas_export   1   0   0 wz--n-   <8.19t   <8.19t
lvs:
Code:
  LV               VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv_root          vg_nas -wi-ao---- 15.00g                                                    
  lv_swap          vg_nas -wi-ao----  8.00g                                                    
  lv_var_lib_iscsi vg_nas -wi-ao----  1.00g                                                    
  lv_var_lib_nfs   vg_nas -wi-ao----  1.00g                                                    
  lv_var_lib_samba vg_nas -wi-ao----  1.00g                                                    
  lv_var_log       vg_nas -wi-ao----  1.00g
i tried mounting /dev/md0 on /mnt, and it tells me it's an unknown filesystem type, which stands to reason, but it makes me think i am missing something about how the RAID array relates to LVM, and that i am missing a step to get things recognized...
Code:
[root@nas ~]# mount /dev/md0 /mnt
mount: /mnt: unknown filesystem type 'LVM2_member'.
       dmesg(1) may have more information after failed mount system call.
of course, there is nothing in dmesg.
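as far as i understand it, /dev/md0 is just the PV, so nothing mounts there directly; once an LV is visible and active, the mount would be against the LV device node, something like this (the LV name here is only a placeholder):
Code:
# activate the VG, then mount one of its LVs (name is an example only)
vgchange -ay vg_nas_export
mount /dev/vg_nas_export/lv_movies /mnt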
 
for giggles, i tried to create one of the "missing" LVs, testing things first...
Code:
[root@nas ~]# lvcreate -t -L 1T vg_nas_export -n lv_movies
  TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
  Logical volume "lv_movies" created.
then, i tried for real...
Code:
[root@nas ~]# lvcreate -L 1T vg_nas_export -n lv_movies
WARNING: xfs signature detected on /dev/vg_nas_export/lv_movies at offset 0. Wipe it? [y/n]: n
  Aborted wiping of xfs.
  1 existing signature left on the device.
  Failed to wipe signatures on logical volume vg_nas_export/lv_movies.
  Aborting. Failed to wipe start of new LV.
the filesystem(s) are there, and it seems the data is too, so why doesn't the system identify them with lvscan/lvs/etc?
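one thing i have not tried is reading the LVM metadata area on the PV directly, to see whether the LV definitions are still on disk at all. something like this should be safe and read-only (pvck --dump needs a reasonably recent lvm2; the dd/strings line is just a crude fallback):
Code:
# print the metadata text currently stored in the PV's first metadata area
pvck --dump metadata /dev/md0

# crude fallback: the metadata area lives near the start of the PV, before pe_start
dd if=/dev/md0 bs=1M count=2 | strings | less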
 
mdstat:
Code:
[root@nas ~]# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md0 : active raid5 sda1[0] sdc1[2] sdd1[4] sdb1[1]
      8790398976 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>
pvdisplay:
Code:
[root@nas ~]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sde4
  VG Name               vg_nas
  PV Size               <463.76 GiB / not usable 2.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              118722
  Free PE               111810
  Allocated PE          6912
  PV UUID               F47qcD-I3Hs-RRXh-3bER-r9e7-Xo96-342d9d
   
  --- Physical volume ---
  PV Name               /dev/md0
  VG Name               vg_nas_export
  PV Size               <8.19 TiB / not usable 2.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              2146093
  Free PE               2146093
  Allocated PE          0
  PV UUID               stCm4i-7Xe6-xKfA-udTp-avxw-KCj9-IAAPCd
vgdisplay:
Code:
[root@nas ~]# vgdisplay
  --- Volume group ---
  VG Name               vg_nas
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  7
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                6
  Open LV               6
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <463.76 GiB
  PE Size               4.00 MiB
  Total PE              118722
  Alloc PE / Size       6912 / 27.00 GiB
  Free  PE / Size       111810 / <436.76 GiB
  VG UUID               UqILMh-8YQ3-DYyE-RKRs-5Jsy-4i5a-yFIcJR
   
  --- Volume group ---
  VG Name               vg_nas_export
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  11
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <8.19 TiB
  PE Size               4.00 MiB
  Total PE              2146093
  Alloc PE / Size       0 / 0   
  Free  PE / Size       2146093 / <8.19 TiB
  VG UUID               qJLa1u-JmnG-t1yk-qLif-Fmy0-2NJV-n4kn2T
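given that vgdisplay shows Cur LV 0 for vg_nas_export but its metadata sequence number is 11, it might be worth listing whatever archived metadata lvm still has for that VG (these files live under /etc/lvm/archive and /etc/lvm/backup):
Code:
vgcfgrestore --list vg_nas_export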
 
weirdly, the files in /etc/lvm/archive have contents like this:
Code:
[root@nas archive]# cat vg_nas_export_00010-1535023889.vg 
# Generated by LVM2 version 2.03.23(2) (2023-11-21): Fri May  3 16:13:24 2024

contents = "Text Format Volume Group"
version = 1

description = "Created *before* executing 'lvcreate -L 1T vg_nas_export -n lv_movies'"

creation_host = "nas"    # Linux nas 6.8.7-300.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 17 19:21:08 UTC 2024 x86_64
creation_time = 1714767204    # Fri May  3 16:13:24 2024

vg_nas_export {
    id = "qJLa1u-JmnG-t1yk-qLif-Fmy0-2NJV-n4kn2T"
    seqno = 10
    format = "lvm2"            # informational
    status = ["RESIZEABLE", "READ", "WRITE"]
    flags = []
    extent_size = 8192        # 4 Megabytes
    max_lv = 0
    max_pv = 0
    metadata_copies = 0

    physical_volumes {

        pv0 {
            id = "stCm4i-7Xe6-xKfA-udTp-avxw-KCj9-IAAPCd"
            device = "/dev/md0"    # Hint only

            device_id_type = "md_uuid"
            device_id = "a009f443-eec0-57b7-1cfc-96d560d83f40"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 17580797952    # 8.1867 Terabytes
            pe_start = 3072
            pe_count = 2146093    # 8.1867 Terabytes
        }
    }

    logical_volumes {

        lv_movies {
            id = "8NrM2U-pOiD-hMsM-HHfl-Rred-YZT4-RNCXgU"
            status = ["READ", "WRITE", "VISIBLE"]
            flags = []
            creation_time = 1714767196    # 2024-05-03 16:13:16 -0400
            creation_host = "nas"
            segment_count = 1

            segment1 {
                start_extent = 0
                extent_count = 262144    # 1024 Gigabytes

                type = "striped"
                stripe_count = 1    # linear

                stripes = [
                    "pv0", 0
                ]
            }
        }
    }

}
i did not run the command that is listed... this seems like some kind of enumeration attempt that did not succeed in creating the LVs
 
description = "Created before executing 'lvcreate -L 1T vg_nas_export -n lv_movies'"


i did not run the command that is listed... this seems like some kind of enumeration attempt that did not succeed in creating the LVs
But you did run it on Friday, in this post, no? This is super, super weird. Must admit I'm at a loss, but I will have a poke around the LVM stuff I have here and have a think
 
i've been away for the week, in training, and have not been near my gear. picking back up on this...

you are correct, i did run that and i must have mixed up the dates or times. i jumped 4 releases from fedora 36 to fedora 40 when i did the rebuild with the new SSD, so i am going to try rebuilding on f36 and see if that helps. maybe there are some backwards-compatibility issues between the versions, and the original OS version may get me to my data. it will be a bit before i have everything ready for the f36 rebuild, but i will report back with any progress.
 
ugh... the rig is built with fedora 36, but i can't get console because of some video issue. a previous thread led me to think nomodeset could help, but alas, it seems there is some other issue with 36. 40 actually works out of the box. i need to rebuild on 40 for console, and configure networking etc., then i can get on it and futz around. testdisk seems like it might help recover the partitions. the sad thing is that testdisk does not work with XFS; it says "Support for this filesystem hasn't been implemented". i am hoping that i can recover the partitions and the filesystems will be intact. otherwise i need to find other recovery options.
 
it's been a while since i really did anything with the NAS, as i have grown disenchanted with the whole lot of it. it's the second time my OS disk went belly up and took my data with it. the data being on separate spinning rust does not seem sufficient to isolate things when an SSD goes FUBAR. that has me thinking about why.

there is another thread about inodes and i am wondering if there is a correlation between my OS SSD going bad and my data being clobbered. i installed the testdisk package and it can see the filesystems. photorec can see the data and i need to get a separate disk to put the recovered files onto, as a recovery effort. but how do i avoid having the OS disk destroy the records of where the data is located on other filesystems?

it seems there should be a way to properly isolate things so that one bad disk does not affect the ability to keep track of data on another disk/array/filesystem.

of course, if i had proper backups this would be moot, but here i am... :(
 
out of interest, what's the content of /dev/mapper and the listing of /dev/dm*?
Code:
[root@nas ~]# ll /dev/mapper/
total 0
crw------- 1 root root 10, 236 May 31 14:10 control
lrwxrwxrwx 1 root root       7 May 31 13:34 vg_nas-lv_root -> ../dm-0
lrwxrwxrwx 1 root root       7 May 31 13:34 vg_nas-lv_swap -> ../dm-1
lrwxrwxrwx 1 root root       7 May 31 13:34 vg_nas-lv_var_lib_iscsi -> ../dm-5
lrwxrwxrwx 1 root root       7 May 31 13:34 vg_nas-lv_var_lib_nfs -> ../dm-4
lrwxrwxrwx 1 root root       7 May 31 13:34 vg_nas-lv_var_lib_samba -> ../dm-3
lrwxrwxrwx 1 root root       7 May 31 13:34 vg_nas-lv_var_log -> ../dm-2
and
Code:
[root@nas ~]# ll /dev/dm*
brw-rw---- 1 root disk 253, 0 May 31 13:34 /dev/dm-0
brw-rw---- 1 root disk 253, 1 May 31 13:34 /dev/dm-1
brw-rw---- 1 root disk 253, 2 May 31 13:34 /dev/dm-2
brw-rw---- 1 root disk 253, 3 May 31 13:34 /dev/dm-3
brw-rw---- 1 root disk 253, 4 May 31 13:34 /dev/dm-4
brw-rw---- 1 root disk 253, 5 May 31 13:34 /dev/dm-5
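for completeness, dmsetup should show the same picture from the device-mapper side, along with the mapping tables, if that helps:
Code:
# list active device-mapper devices and dump their tables
dmsetup ls
dmsetup table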
 
having a re-read of the thread to refresh myself. I think the /etc/lvm/archive stuff might be a result of you doing the dry-run of the create, and/or of you answering no to the wipe prompt. The thing that I find odd on the second reading is

Code:
  --- Volume group ---
  VG Name               vg_nas_export
...
  Free  PE / Size       2146093 / <8.19 TiB
...
which, to be fair to you, is also in the vgs output.

What does pvs -v --segments say? Also lvs -o vg_all?
 
Code:
[root@nas ~]# pvs -v --segments
  PV         VG            Fmt  Attr PSize    PFree    Start SSize   LV               Start Type   PE Ranges         
  /dev/md0   vg_nas_export lvm2 a--    <8.19t   <8.19t     0 2146093                      0 free                     
  /dev/sde5  vg_nas        lvm2 a--  <463.76g <436.76g     0    2048 lv_swap              0 linear /dev/sde5:0-2047   
  /dev/sde5  vg_nas        lvm2 a--  <463.76g <436.76g  2048     256 lv_var_log           0 linear /dev/sde5:2048-2303
  /dev/sde5  vg_nas        lvm2 a--  <463.76g <436.76g  2304     256 lv_var_lib_samba     0 linear /dev/sde5:2304-2559
  /dev/sde5  vg_nas        lvm2 a--  <463.76g <436.76g  2560     256 lv_var_lib_nfs       0 linear /dev/sde5:2560-2815
  /dev/sde5  vg_nas        lvm2 a--  <463.76g <436.76g  2816     256 lv_var_lib_iscsi     0 linear /dev/sde5:2816-3071
  /dev/sde5  vg_nas        lvm2 a--  <463.76g <436.76g  3072    3840 lv_root              0 linear /dev/sde5:3072-6911
  /dev/sde5  vg_nas        lvm2 a--  <463.76g <436.76g  6912  111810                      0 free
and
Code:
[root@nas ~]# lvs -o vg_all
  Fmt  VG UUID                                VG     Attr   VPerms     Extendable Exported   Partial    AllocPol   Clustered  Shared  VSize    VFree    SYS ID System ID LockType VLockArgs Ext   #Ext   Free   MaxLV MaxPV #PV #PV Missing #LV #SN Seq VG Tags VProfile #VMda #VMdaUse VMdaFree  VMdaSize  #VMdaCps
  lvm2 rnzkM9-vYI1-Nim1-C914-WWM8-oe2n-g3iduD vg_nas wz--n- writeable  extendable                       normal                        <463.76g <436.76g                                     4.00m 118722 111810     0     0   1           0   6   0   7                      1        1   506.50k  1020.00k unmanaged
  lvm2 rnzkM9-vYI1-Nim1-C914-WWM8-oe2n-g3iduD vg_nas wz--n- writeable  extendable                       normal                        <463.76g <436.76g                                     4.00m 118722 111810     0     0   1           0   6   0   7                      1        1   506.50k  1020.00k unmanaged
  lvm2 rnzkM9-vYI1-Nim1-C914-WWM8-oe2n-g3iduD vg_nas wz--n- writeable  extendable                       normal                        <463.76g <436.76g                                     4.00m 118722 111810     0     0   1           0   6   0   7                      1        1   506.50k  1020.00k unmanaged
  lvm2 rnzkM9-vYI1-Nim1-C914-WWM8-oe2n-g3iduD vg_nas wz--n- writeable  extendable                       normal                        <463.76g <436.76g                                     4.00m 118722 111810     0     0   1           0   6   0   7                      1        1   506.50k  1020.00k unmanaged
  lvm2 rnzkM9-vYI1-Nim1-C914-WWM8-oe2n-g3iduD vg_nas wz--n- writeable  extendable                       normal                        <463.76g <436.76g                                     4.00m 118722 111810     0     0   1           0   6   0   7                      1        1   506.50k  1020.00k unmanaged
  lvm2 rnzkM9-vYI1-Nim1-C914-WWM8-oe2n-g3iduD vg_nas wz--n- writeable  extendable                       normal                        <463.76g <436.76g                                     4.00m 118722 111810     0     0   1           0   6   0   7                      1        1   506.50k  1020.00k unmanaged
 
testdisk might just see the XFS filesystems, and assuming the original LVs weren't messed with much (i.e. alloc'd but never resized) they might all just be linear allocations. In which case you might be able to mount them with some block offset shenanigans and loopback devices.

But yes, the LVs being gone is a bit weird. If an extra PV was involved in the VG and LV metadata landed on that somehow, I'd have thought LVM would complain bitterly about it. The only other theory I can come up with is that grub (or something) wiped/overwrote the LV metadata blocks when the OS was installed, but left enough of the PV/VG blocks in place for those to still be detectable.

it seems there should be a way to properly isolate things so that one bad disk does not affect the ability to keep track of data on another disk/array/filesystem.
Properly configured LVM should behave this way. Otherwise, getting rid of abstractions is the only way to isolate yourself from them breaking. In this case, just using the raw MD device without LVM.
 
In which case you might be able to mount them with some block offset shenanigans and loopback devices
yeah, this is where my thoughts were leading. Looking at a random box here with some lvm stuff on it, it looks like the contents of /etc/lvm/backup were created at the last boot. I wonder if you have them; if they look sane(ish) then maybe vgcfgrestore is your friend... but once you're going down this road you want to image that metadevice first...
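something along these lines, after imaging, and test mode first (which file to feed it is a judgement call, presumably the newest backup/archive that still lists the LVs):
Code:
# dry run: check the backup file parses and would restore cleanly
vgcfgrestore --test -f /etc/lvm/backup/vg_nas_export vg_nas_export

# then for real, and re-activate; this rewrites metadata only, it does not touch data blocks
vgcfgrestore -f /etc/lvm/backup/vg_nas_export vg_nas_export
vgchange -ay vg_nas_export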
 
testdisk can see the LVM partitions but can't do anything with them; some limitation in dealing with XFS, i think. photorec can read the files. my thinking is to recover everything i can with photorec onto an external drive and pave over things. imaging the array first may also be something i do before wiping.

when i built the NAS, i physically removed the data disks because the fedora installer would not allow me to install to /dev/sde, since the SSD is connected to the "cdrom" SATA header. it's enumerated last in the order, and the 4 HDDs were the only option to install to. there should be no anaconda/installer impact on the data disks.

i guess i don't know what "properly configured LVM" is, since a buggered SSD has twice fouled up my data disks. it seems that the inode tables are corrupted because of the failing SSD.
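re the imaging step: something like ddrescue onto a drive at least as big as the array is probably the way i will go (the destination path here is just wherever the external drive ends up mounted):
Code:
# read-only clone of the whole array, keeping a map of any unreadable areas
ddrescue -d -r3 /dev/md0 /mnt/external/md0.img /mnt/external/md0.map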
 
Unless I've missed a memo, testdisk doesn't understand LVM. Looks like I did miss a memo and it has some LVM support.

Are the partitions it finds marked as LVM or XFS? If XFS, then you might be able to use the start/end positions with losetup (read-only mode, or work on a clone) to try mounting the partitions as loopback devices.
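roughly what I have in mind, assuming testdisk reports a start offset in sectors for one of the XFS partitions (OFFSET_SECTORS below is whatever it reports; if the first LV really did start at extent 0, that would be pe_start = 3072 sectors into /dev/md0 going by the archive metadata):
Code:
# start offset of the filesystem, in 512-byte sectors, taken from testdisk
OFFSET_SECTORS=3072

# map that region read-only as a loop device
LOOPDEV=$(losetup --read-only --offset $((OFFSET_SECTORS * 512)) --find --show /dev/md0)

# try mounting it read-only without replaying the XFS log
mount -o ro,norecovery "$LOOPDEV" /mnt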
 