AlmaLinux / CentOS / Red Hat Enterprise – RAID1 GRUB recovery on EFI systems

So I tried to recover RAID1 and GRUB on an EFI-enabled Linux (AlmaLinux) system, and I ran into a few unexpected issues. Even ChatGPT and Google couldn't answer this, and I was surprised I couldn't find anything online, since I am sure many of us run RAID1 on Linux on EFI-enabled systems.

I have two 4TB Samsung NVMe SSDs. They are mapped as /dev/nvme0n1 and /dev/nvme1n1 on the system and are in a RAID1 configuration using Linux software "md" RAID. When the first SSD died I was left with a working system on the second SSD, which usually works without issues; you are just left without RAID redundancy. I rebooted to confirm it all still worked: the system booted in EFI mode and the "UEFI OS" entries were already visible in the BIOS. I waited a few days for the replacement SSD to arrive, then installed the new SSD and went with my standard RAID1 recovery procedure, which I had previously used on SATA SSD / NVMe SSD and HDD non-EFI systems.

First I copy the existing partition table from the working drive (/dev/nvme1n1):

sgdisk -R /dev/nvme0n1 /dev/nvme1n1

Where nvme1n1 is the source and nvme0n1 is the destination.
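
As an optional sanity check before touching the arrays, you can print both partition tables and compare them (sgdisk -p only prints, it changes nothing):

sgdisk -p /dev/nvme1n1
sgdisk -p /dev/nvme0n1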

After this I re-add the new partitions to their RAID arrays, for example /dev/md124, which is my /boot/efi array:

mdadm /dev/md124 -a /dev/nvme0n1p4
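
I then do the same for the remaining arrays; the array-to-partition mapping below is taken from the lsblk output further down in this post (yours will differ), and the resync progress can be watched in /proc/mdstat:

mdadm /dev/md127 -a /dev/nvme0n1p1
mdadm /dev/md125 -a /dev/nvme0n1p2
mdadm /dev/md126 -a /dev/nvme0n1p3
mdadm /dev/md123 -a /dev/nvme0n1p5
cat /proc/mdstat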

After the RAID rebuild (resync) finished I got:


mdadm --detail /dev/md124


/dev/md124:
           Version : 1.0
     Creation Time : Sat May 18 21:12:23 2024
        Raid Level : raid1
        Array Size : 205760 (200.94 MiB 210.70 MB)
     Used Dev Size : 205760 (200.94 MiB 210.70 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sat Dec 14 23:21:25 2024
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : bitmap

              Name : srv-new:boot_efi  (local to host srv-new)
              UUID : 0cf06dac:1ab929b9:76e17bb3:63233042
            Events : 88

    Number   Major   Minor   RaidDevice State
       2     259        5        0      active sync   /dev/nvme0n1p4
       1     259       10        1      active sync   /dev/nvme1n1p4

So I went on to install the EFI GRUB boot entry via efibootmgr:

efibootmgr --create --disk /dev/nvme0n1 --part 4 --label "UEFI OS" --loader '\EFI\BOOT\BOOTX64.EFI'

but whatever I did, I always got only ONE entry:

efibootmgr -v


BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,0004
Boot0000* UEFI OS HD(4,GPT,bac8b516-1107-4558-a567-a60ad674c7f8,0x14231000,0x64800)/File(\EFI\BOOT\BOOTX64.EFI)
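
(Side note: if you end up with stale or duplicate boot entries while experimenting, an entry can be deleted by its number with efibootmgr. The number 0004 here is just an example, check efibootmgr -v first:)

efibootmgr -b 0004 -B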

After a lot of googling, and with ChatGPT having no clue, I confirmed in the BIOS that the system would only see the EFI partition on DISK2 and wouldn't boot off DISK1.

Then I realized that the PARTUUIDs had been copied from the source disk to /dev/nvme0n1 along with the partition table, and a PARTUUID is supposed to be UNIQUE:

lsblk -o NAME,PARTUUID

NAME PARTUUID
nvme0n1
├─nvme0n1p1 00056612-3823-4c89-8333-16b800929e95
│ └─md127
├─nvme0n1p2 e485e4b7-5110-4928-a285-205779b9f2df
│ └─md125
├─nvme0n1p3 aa9817cd-6421-4ea6-8ecb-6d422eb71515
│ └─md126
├─nvme0n1p4 e6b33e11-a0e9-4681-8ec8-653e30a65ffa
│ └─md124
└─nvme0n1p5 74fc09a2-6672-4c28-8866-f2a5e1ce3104
└─md123
nvme1n1
├─nvme1n1p1 00056612-3823-4c89-8333-16b800929e95
│ └─md127
├─nvme1n1p2 e485e4b7-5110-4928-a285-205779b9f2df
│ └─md125
├─nvme1n1p3 aa9817cd-6421-4ea6-8ecb-6d422eb71515
│ └─md126
├─nvme1n1p4 e6b33e11-a0e9-4681-8ec8-653e30a65ffa
│ └─md124
└─nvme1n1p5 74fc09a2-6672-4c28-8866-f2a5e1ce3104
└─md123
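
The same duplication can also be confirmed with blkid, for example for the ESP pair from my layout:

blkid -s PARTUUID /dev/nvme0n1p4 /dev/nvme1n1p4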

So I needed to create new GUIDs/PARTUUIDs for the partitions on /dev/nvme0n1.

For that I used gdisk:

gdisk /dev/nvme0n1

Enter ‘x’ for the expert menu, then press ‘c’ (change partition unique GUID), pick the partition number and enter ‘R’ to randomize a new GUID for that partition.

Repeat this for every partition on /dev/nvme0n1.

Type ‘w’ to write the changes to disk.

After exiting gdisk, issue the command

partprobe /dev/nvme0n1

to re-read the partition table.
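
As a non-interactive alternative to the gdisk expert-menu steps, sgdisk should be able to randomize the disk GUID and all partition unique GUIDs in one go. I used the interactive gdisk route above, so if you try this instead, verify the result with lsblk afterwards:

sgdisk -G /dev/nvme0n1
partprobe /dev/nvme0n1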

After that I confirmed with

lsblk -o NAME,PARTUUID

that the PARTUUIDs are now unique on both disks:

NAME PARTUUID
nvme0n1
├─nvme0n1p1 f391024d-b6f8-4036-a004-2ed55fd0d7f7
│ └─md127
├─nvme0n1p2 a97744a6-aeb5-4d12-9c50-6da437e10dce
│ └─md125
├─nvme0n1p3 6aba1d11-17ce-4860-8d8f-f4a6f4b97c89
│ └─md126
├─nvme0n1p4 bac8b516-1107-4558-a567-a60ad674c7f8
│ └─md124
└─nvme0n1p5 fd4e76aa-968c-448d-bc58-8371aef44a5f
└─md123
nvme1n1
├─nvme1n1p1 00056612-3823-4c89-8333-16b800929e95
│ └─md127
├─nvme1n1p2 e485e4b7-5110-4928-a285-205779b9f2df
│ └─md125
├─nvme1n1p3 aa9817cd-6421-4ea6-8ecb-6d422eb71515
│ └─md126
├─nvme1n1p4 e6b33e11-a0e9-4681-8ec8-653e30a65ffa
│ └─md124
└─nvme1n1p5 74fc09a2-6672-4c28-8866-f2a5e1ce3104
└─md123

Now, after I installed the GRUB EFI entry with


efibootmgr --create --disk /dev/nvme0n1 --part 4 --label "UEFI OS" --loader '\EFI\BOOT\BOOTX64.EFI'

I finally got two entries: one for the GRUB loader on /dev/nvme0n1 (the first SSD) and one for /dev/nvme1n1 (the second disk).

BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,0004
Boot0000* UEFI OS HD(4,GPT,bac8b516-1107-4558-a567-a60ad674c7f8,0x14231000,0x64800)/File(\EFI\BOOT\BOOTX64.EFI)
Boot0004* UEFI OS HD(4,GPT,e6b33e11-a0e9-4681-8ec8-653e30a65ffa,0x14231000,0x64800)/File(\EFI\BOOT\BOOTX64.EFI)..BO
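
If you want to control which disk the firmware tries first, the boot order can also be set with efibootmgr. The entry numbers 0000 and 0004 are from my system; use whatever efibootmgr -v shows on yours:

efibootmgr -o 0000,0004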

I verified that the BIOS now sees both entries and tried booting from each disk; both work now.

So keep in mind that you cannot simply copy the partition table from the existing working RAID drive to the replacement and leave it at that. Either create all partitions manually to match the working drive, in which case the PARTUUIDs/GUIDs will already be unique, or use the sgdisk partition-table copy method, but then don't forget to generate new unique GUIDs/PARTUUIDs for the copied partitions. Otherwise efibootmgr cannot address the two disks separately and create an entry for each one, which is exactly what you want on RAID1 systems.
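
To recap, with the same device names as above (and substituting the non-interactive sgdisk -G from the note earlier for the interactive gdisk steps I actually used), the short version of the procedure looks like this; adjust array names, partition numbers and the loader path to your own layout, and repeat the mdadm step for each array/partition pair:

sgdisk -R /dev/nvme0n1 /dev/nvme1n1
sgdisk -G /dev/nvme0n1
partprobe /dev/nvme0n1
mdadm /dev/md124 -a /dev/nvme0n1p4
efibootmgr --create --disk /dev/nvme0n1 --part 4 --label "UEFI OS" --loader '\EFI\BOOT\BOOTX64.EFI'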

Hope this helps someone
