Most of your question is covered in the Magisk documentation. I will quote one of my previous answers to a different question, with some unnecessary details :)
PREREQUISITES:
To have a comprehensive understanding of how Magisk works, one must have a basic understanding of:
- Discretionary Access Control (DAC)
- User identifiers ([ESR]UID), set-user-ID
- Linux Capabilities (process and file) which provide a fine-grained control over superuser permissions
- Mandatory Access Control (MAC)
- SELinux on Android
- Mount namespaces, Android's usage of namespaces for Storage Permissions
- Bind mount
- Android boot process, partitions and filesystems
- Android init (the very first process started by the kernel) and its services
- *.rc files
- Structure of boot partition (kernel + DTB + ramdisk), Device Tree Blobs, DM-Verity (Android Verified Boot), Full Disk Encryption / File Based Encryption (FDE/FBE) etc.
WHAT IS ROOT?
Gaining root privileges means running a process (usually a shell) with UID zero (0) and all of the Linux capabilities, so that the privileged process can bypass all kernel permission checks.
Superuser privileges are usually gained by executing a binary which has either:
- the set-user-ID-root (SUID) bit set on it. This is how su and sudo work on Linux in traditional UNIX DAC. Non-privileged users execute these binaries to get root rights.
- or file capabilities set on it. This is the less common method.
In both cases the calling process must have all capabilities in its Bounding Set (one of the 5 capabilities categories a process can have) to have real root privileges.
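To make the five capability sets a bit more concrete, here is a minimal Python sketch that decodes the hexadecimal masks the kernel prints in /proc/<pid>/status (CapInh, CapPrm, CapEff, CapBnd, CapAmb). The function names are my own, and the name table is deliberately partial (only a handful of capability numbers from linux/capability.h), so treat it as an illustration rather than a complete decoder.

```python
# Partial table of capability numbers (from linux/capability.h).
# Bits that are not listed are reported as "cap_<number>".
CAP_NAMES = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    2: "CAP_DAC_READ_SEARCH",
    3: "CAP_FOWNER",
    4: "CAP_FSETID",
    5: "CAP_KILL",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    8: "CAP_SETPCAP",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(mask_hex):
    """Turn a hex mask such as '00000000000000c0' into a set of names."""
    mask = int(mask_hex, 16)
    names = set()
    bit = 0
    while mask:
        if mask & 1:
            names.add(CAP_NAMES.get(bit, "cap_%d" % bit))
        mask >>= 1
        bit += 1
    return names


def read_cap_sets(status_text):
    """Extract the five capability sets from the text of /proc/<pid>/status."""
    sets = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb"):
            sets[key] = decode_caps(value.strip())
    return sets
```

On a Linux or Android shell with Python available, something like read_cap_sets(open("/proc/self/status").read()) should show which sets are populated for the current process. An empty CapBnd means that even UID 0 cannot regain those privileges.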
HOW ANDROID RESTRICTS ROOT ACCESS?
Before Android 4.3, an app could simply execute a set-user-ID-root su binary to elevate its permissions to the root user. However, a number of Security Enhancements in Android 4.3 broke this behavior:
- Android switched to file capabilities instead of relying on set-user-ID binaries, which are a common source of security vulnerabilities. A more secure mechanism, ambient capabilities, was also introduced in Android Oreo.
- System daemons and services can use file capabilities to gain process capabilities (see under Transformation of capabilities during execve), but apps can't do that either, because application code is executed by zygote with the process control attribute NO_NEW_PRIVS, which ignores set-user-ID as well as file capabilities. SUID is also ignored because /system and /data are mounted with the nosuid option for all apps.
- UID can be switched only if the calling process has the SETUID/SETGID capability in its Bounding set. But Android apps are made to run with all capabilities already dropped in all sets using the process control attribute CAPBSET_DROP.
- Starting with Oreo, apps' ability to change UID/GID has been further suppressed by blocking certain syscalls using seccomp filters.
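The restrictions listed above are visible from userspace, so a rough way to check them is to parse /proc/<pid>/status and /proc/<pid>/mounts. The sketch below is only an illustration: the function names and the sample inputs in the usage note are made up, while the NoNewPrivs, Seccomp and CapBnd fields are what recent Linux kernels print in the status file (Seccomp value 2 means filter mode).

```python
def process_restrictions(status_text):
    """Summarise privilege-related fields from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {
        # 1 means execve() can no longer grant privileges (SUID, file caps).
        "no_new_privs": fields.get("NoNewPrivs") == "1",
        # 0 = disabled, 1 = strict mode, 2 = filter mode.
        "seccomp_filter": fields.get("Seccomp") == "2",
        # True only if the field exists and the bounding set mask is zero.
        "empty_bounding_set": "CapBnd" in fields
        and int(fields["CapBnd"], 16) == 0,
    }


def nosuid_mounts(mounts_text):
    """Return mount points whose options include nosuid (/proc/<pid>/mounts)."""
    result = []
    for line in mounts_text.splitlines():
        parts = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(parts) >= 4 and "nosuid" in parts[3].split(","):
            result.append(parts[1])
    return result
```

For an ordinary app process one would expect all three flags to be true, and /system and /data to show up in the nosuid list when read from the app's own mount namespace.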
Since standalone su binaries stopped working with the release of Jelly Bean, a transition was made to su daemon mode. This daemon is launched during boot and handles all superuser requests made by applications when they execute the special su binary (1). install-recovery.sh (located under /system/bin/ or /system/etc/), which is executed by the pre-installed init service flash_recovery (useless for adventurers; it updates recovery after an OTA installation), was used to launch this su daemon on boot.
The next major challenge came when SELinux was set strictly enforcing with the release of Android 5.0. The flash_recovery service was added to a restricted SELinux context, u:r:install_recovery:s0, which stopped unrestricted access to the system. Even UID 0 was limited to a very small set of tasks on the device. So the only viable option was to start a new service with an unrestricted SUPER CONTEXT by patching the SELinux policy. That's what was done (temporarily for Lollipop (2, 3) and then permanently for Marshmallow) and that's what Magisk does.
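Since SELinux contexts such as u:r:install_recovery:s0 come up repeatedly here, this small Python sketch shows how such a label breaks down into user, role, type (domain) and level. The helper names are my own; the only format-specific assumption is that the level part may itself contain colons (MLS categories), which is why the split is limited to three separators.

```python
from collections import namedtuple

SecurityContext = namedtuple("SecurityContext", "user role type level")


def parse_context(label):
    """Split an SELinux label of the form user:role:type:level."""
    # maxsplit=3 keeps any further colons inside the level field.
    parts = label.strip().split(":", 3)
    if len(parts) != 4:
        raise ValueError("not a full SELinux context: %r" % label)
    return SecurityContext(*parts)


def runs_in_domain(label, domain):
    """True if the label's type (domain) equals the given domain name."""
    return parse_context(label).type == domain
```

On a device, a process can read its own label from /proc/self/attr/current, so runs_in_domain(open("/proc/self/attr/current").read().rstrip("\x00\n"), "install_recovery") would tell whether the caller is confined to that restricted domain.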
HOW MAGISK WORKS?
Flashing Magisk usually requires a device with an unlocked bootloader so that boot.img can be dynamically modified from a custom recovery (4), or a pre-modified boot.img (5) can be flashed/booted, e.g. from fastboot.
As a side note, it's possible to start Magisk on a running ROM if you somehow get root privileges through an exploit in the OS (6). However, most such security vulnerabilities have been fixed over time (7).
Also, due to some vulnerabilities at the SoC level (such as Qualcomm's EDL mode), a locked bootloader can be hacked to load a modified boot / recovery image, breaking the Chain of Trust. However, these are only exceptions.
Once the device boots from the patched boot.img, a fully privileged Magisk daemon (with UID 0, full capabilities and an unrestricted SELinux context) runs from the very start of the boot process. When an app needs root access, it executes Magisk's (/sbin/)su binary (world-accessible under DAC and MAC), which doesn't change UID/GID on its own but just connects to the daemon through a UNIX socket (8) and asks it to provide the requesting app with a root shell with all capabilities. To interact with the user and grant/deny su requests from apps, the daemon is hooked up with the Magisk Manager app, which can display user interface prompts. The daemon builds a database (/data/adb/magisk.db) of granted/denied permissions for future use.
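The client/daemon split described above can be sketched as a toy in Python. This is not Magisk's actual protocol: the JSON message format, the function names and the in-memory policy dictionary (standing in for the permissions database) are all invented for illustration. It only shows the shape of the idea, namely that the unprivileged su client changes nothing about itself and merely asks an already privileged daemon over a UNIX socket.

```python
import json
import socket


def send_request(conn, uid, command):
    """Client side: the su stub only forwards who is asking and for what."""
    conn.sendall(json.dumps({"uid": uid, "command": command}).encode())


def handle_request(conn, policy_db):
    """Daemon side: look up the caller and answer allow / deny / prompt."""
    # A single recv() is enough for this tiny message; a real protocol
    # would need proper message framing.
    request = json.loads(conn.recv(4096).decode())
    # Unknown callers would trigger a user prompt in the manager app.
    decision = policy_db.get(request["uid"], "prompt")
    conn.sendall(json.dumps({"decision": decision}).encode())


def read_decision(conn):
    """Client side: read the daemon's verdict."""
    return json.loads(conn.recv(4096).decode())["decision"]


def demo(uid, command, policy_db):
    """Run one request/response round trip over a connected UNIX socket pair."""
    client, daemon = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with client, daemon:
        send_request(client, uid, command)
        handle_request(daemon, policy_db)
        return read_decision(client)
```

For example, demo(10123, "id", {10123: "allow"}) returns "allow", while a UID that is not in the dictionary gets "prompt". Note that this toy trusts the UID written in the message; a real daemon would more plausibly take the caller's identity from the kernel (for example via SO_PEERCRED on Linux) instead of trusting the client.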
Booting Process:
The Android kernel starts init with SELinux in permissive mode on boot (with a few exceptions). init loads /sepolicy (or split policy) before starting any services/daemons/processes, sets it enforcing and then switches to its own context. From then on, even init isn't allowed by policy to revert to permissive mode (9, 10). Nor can the policy be modified, even by the root user (11). Therefore Magisk replaces the /init file with a custom init which patches the SELinux policy rules with the SUPER CONTEXT (u:r:magisk:s0) and defines the service to launch the Magisk daemon with this context. Then the original init is executed to continue the boot process (12).
Systemless Working:
Since the init file is built into boot.img, modifying boot.img is unavoidable, and modifying /system becomes unnecessary. That's where the term systemless was coined (13, 14). The main concern was to make OTAs easier: re-flashing the boot image (and recovery) is less hassle than re-flashing system. A block-based OTA on a modified /system partition will fail because it enables the use of dm-verity to cryptographically sign the system partition.
System-as-root:
On newer devices using system-as-root, the kernel doesn't load the ramdisk from boot but from system. So [system.img]/init needs to be replaced with Magisk's init. Magisk also modifies /init.rc and places its own files in /root and /sbin. That would mean modifying system.img, but Magisk's approach is not to touch the system partition.
On A/B devices, during normal boot the skip_initramfs option is passed by the bootloader in the kernel cmdline, because boot.img contains the ramdisk for recovery. So Magisk patches the kernel binary to always ignore skip_initramfs, i.e. to always boot into recovery, and places the Magisk init binary in the recovery ramdisk inside boot.img. On boot, when the kernel boots to recovery, if there's no skip_initramfs, i.e. the user intentionally booted to recovery, Magisk init simply executes the recovery init. Otherwise Magisk init mounts system.img at /system_root, the contents of the ramdisk are copied to / (cleaning out everything that previously existed there), files are added/modified in rootfs /, /system_root/system is bind-mounted to /system, and finally [/system]/init is executed (15, 16).
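The A/B decision described above boils down to one check on the kernel command line. The Python sketch below only models that branching as data. The step strings are paraphrased from the description rather than taken from Magisk's source, and the function names and sample command lines are hypothetical.

```python
def has_skip_initramfs(cmdline):
    """True if the bootloader passed skip_initramfs as its own token."""
    return "skip_initramfs" in cmdline.split()


def magisk_init_plan(cmdline):
    """Return the ordered steps Magisk's init would take for this boot."""
    if not has_skip_initramfs(cmdline):
        # No skip_initramfs: the user really wants recovery.
        return ["exec recovery init"]
    # Normal boot: rebuild a rootfs and hand over to the system's init.
    return [
        "mount system.img at /system_root",
        "copy ramdisk contents to /",
        "add/modify files in rootfs /",
        "bind-mount /system_root/system to /system",
        "exec /system/init",
    ]
```

On a real device the input would be the contents of /proc/cmdline. Matching on whole tokens (instead of a substring search) avoids being confused by unrelated options that merely contain the same text.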
However, things have changed again with Q: now /system is mounted at /, but the files to be added/modified, like /init, /init.rc and /sbin, are overlaid with bind mounts (17).
On non-A/B system-as-root devices, Magisk needs to be installed to the recovery ramdisk in order to retain the systemless approach, because boot.img contains no ramdisk (18).
Modules:
An additional benefit of the systemless approach is the use of Magisk Modules. If you want to place some binaries under /system/*bin/ or modify some configuration files (like hosts or dnsmasq.conf) or some libraries / framework files (such as those required by mods like XPOSED) in /system or /vendor, you can do that without actually touching the partition by making use of Magic Mount (based on bind mounts). Magisk supports adding as well as removing files by overlaying them.
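The bind-mount idea behind modules can be illustrated with a small planner: given the files a module ships, work out which file would be bind-mounted over which path. This is a simplified sketch and not Magic Mount itself. The module directory layout (a tree mirroring /system and /vendor) and the example path in the usage note are assumptions, and the real implementation also has to deal with directories, file removal and newly added files.

```python
import posixpath


def plan_bind_mounts(module_dir, relative_files):
    """Map module files to (source, target) pairs for bind mounting.

    Only files under system/ or vendor/ are considered; anything else
    in the module (docs, scripts) is ignored.
    """
    plan = []
    for rel in sorted(f.lstrip("/") for f in relative_files):
        top = rel.split("/", 1)[0]
        if top not in ("system", "vendor"):
            continue
        source = posixpath.join(module_dir, rel)  # file shipped by the module
        target = "/" + rel                        # path the app will see
        plan.append((source, target))
    return plan


def as_commands(plan):
    """Render the plan as mount(8)-style commands, for reading only."""
    return ["mount --bind %s %s" % pair for pair in plan]
```

With a hypothetical module directory /data/adb/modules/demo shipping system/etc/hosts, the plan contains a single pair that overlays /system/etc/hosts with the module's copy, while the partition itself stays untouched.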
MagiskHide: (19)
Another challenge was to hide the presence of Magisk so that apps can't tell that the device is rooted. Many apps don't like rooted devices and may stop working. Google was among those most affected, so they introduced SafetyNet as a part of Play Protect, which runs as a GMS (Play Services) process and tells apps (including their own Google Pay), and hence their developers, that the device is currently in a non-tampered state (20).
Rooting is one of the many possible tampered states, others being un-Verified Boot, an unlocked bootloader, CTS non-certification, a custom ROM, a debuggable build, permissive SELinux, ADB turned on, some bad properties, the presence of Lucky Patcher, Xposed etc. Magisk uses some tricks to make sure that most of these tests always pass, though apps can make use of other Android APIs or read some files directly. Some modules provide additional obfuscation.
Other than hiding its presence from Google's SafetyNet, Magisk also lets users hide root (the su binary and any other Magisk-related files) from any app, again making use of bind mounts and mount namespaces. For this, zygote has to be continuously watched for newly forked apps' VMs.
However, it's a tough task to really hide a rooted device from apps, as new techniques evolve to detect Magisk's presence, mainly from /proc or other filesystems. So a number of quirks are applied to properly hide modifications from detection. Magisk tries to remove all traces of its presence during the boot process (21).
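To see why /proc is such a common detection vector, consider what an app can learn from its own mount table. The Python sketch below scans the text of /proc/self/mounts for entries whose source or target contains a telltale keyword. The function name, the default keyword and the sample lines in the usage note are all made up for illustration; real detectors and real hiding are considerably more involved.

```python
def suspicious_mounts(mounts_text, keywords=("magisk",)):
    """Return mount points whose source or target mentions a keyword."""
    hits = []
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        source, target = parts[0].lower(), parts[1].lower()
        if any(k in source or k in target for k in keywords):
            hits.append(parts[1])
    return hits
```

If a bind mount from a module directory is still visible in the app's mount namespace, a line such as "/data/adb/modules/demo/system/etc/hosts /system/etc/hosts ext4 ro 0 0" (hypothetical) would be flagged with keywords=("/data/adb",). Hiding therefore means giving the target app a mount namespace in which those entries have been unmounted.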
Magisk also supports:
That's a brief description of Magisk's currently offered features (AFAIK).
FURTHER READING: