Current status explanation replicated from Discord:
Overview:
Zygote vulnerability CVE-2024-31317 (https://github.com/agg23/cve-2024-31317/) is the core component of PenumbraOS, and the tool that grants us privileged access. We can't do much without it.
https://github.com/PenumbraOS/pinitd/ is designed around the vulnerability in order to provide persistence (running something, ideally all of our stuff, at boot) and repeatable privileged access. The intended flow is as follows:
1. pinitd Activity receives BOOT_COMPLETED. This is a normal Android feature we have access to
2. Using BOOT_COMPLETED as the trigger, we start the actual pinitd process as shell using the vulnerability (the same as if you were to connect via adb shell) as this is what we're most used to. This contains restart counting logic to prevent infinite boot looping if something breaks
   - This has not been wired up due to not having the boot loop protection
3. pinitd starts the controller process, which, as the name suggests, is the central orchestrator for everything.
4. controller would normally request the start of the worker pinitd instance under system (uid 1000) using the vulnerability, but I've found that we might not ever use uid 1000 and thus there may be many workers, so this needs to be rearchitected. This is disabled for now as it just breaks Zygote more (see below)
5. controller starts any autostart services, dispatching events to the workers as necessary
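The restart counting mentioned in the flow above could look something like the sketch below. It is not pinitd's implementation: the state file, the MAX_STARTS threshold, and the WINDOW_SECONDS window are all hypothetical, and Python is used only for brevity. The idea is to record each start attempt and refuse to launch once too many attempts land inside a short window.

```python
import json
import time
from pathlib import Path
from typing import Optional

MAX_STARTS = 3          # hypothetical: starts allowed inside one window
WINDOW_SECONDS = 300.0  # hypothetical: length of the sliding window


def should_start(state_file: Path, now: Optional[float] = None) -> bool:
    """Record a start attempt and report whether it is safe to proceed."""
    now = time.time() if now is None else now
    try:
        attempts = json.loads(state_file.read_text())
    except (FileNotFoundError, ValueError):
        attempts = []  # no history yet, or an unreadable file: start fresh
    # Keep only the attempts inside the sliding window, then add this one.
    attempts = [t for t in attempts if now - t < WINDOW_SECONDS]
    attempts.append(now)
    state_file.write_text(json.dumps(attempts))
    return len(attempts) <= MAX_STARTS
```

With these values, a fourth start within five minutes returns False, so the boot trigger would skip launching pinitd until the window has passed.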
Exploitation of the vulnerability requires inserting a custom Zygote command payload via setting the global setting hidden_api_blacklist_exemptions (read more in the vulnerability link above). The execution path is:
1. Write hidden_api_blacklist_exemptions
2. This is processed and immediately passed over a socket to Zygote. We try to slow this down as much as possible by making the command long and with many commas so we have better timing
3. Zygote reads the actual --set-api-denylist-exemptions command, gets confused, then starts to read our injected command
4. Our injected command requests a process spawn with whatever settings are necessary. It specifies that at least 1 extra argument (line) will be present
5. Zygote keeps reading our command, then gets to the end of the current data in the buffer, sees that there's at least 1 more argument expected, then tries reading from the socket again
6. This read DOES NOT RETRY and does not block. If there is not immediately data available, Zygote will crash - this is our major timing constraint
7. We state we're passing one more argument than necessary because each interaction with Zygote reads and writes some control bytes (send command, wait for 4-5 bytes of status update back from Zygote). For a process spawn, Zygote reports the newly spawned pid and whether or not the process is wrapped (doesn't really matter). If we were to just execute the vulnerability on its own, Zygote would write 5 bytes into its response buffer as a result of the spawn, but nothing would ever receive the data, so things would get all out of order.
   Thus we state we have more arguments than we actually have and send another, real process spawn immediately after. Zygote will eat these arguments, but AMS (ActivityManagerService, the sender) will expect a process spawn, will read the pid bytes, and will apply some necessary permissions for us.
8. At this point we are done executing the vulnerability and can clean up by clearing the hidden_api_blacklist_exemptions setting. This will also send another command to Zygote that needs to be processed properly
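To make the extra-argument trick concrete, here is a toy model of a count-prefixed command reader. It assumes only what is described above: a command is an argument count followed by that many lines, and a reader that comes up short does one non-blocking read for more. Everything else is invented for illustration: the names process and ZygoteCrash, the placeholder argument strings, and the rule that leftover lines of a partially consumed write are dropped. It does not model the hidden_api_blacklist_exemptions injection itself.

```python
from collections import deque


class ZygoteCrash(Exception):
    """The toy reader needed more arguments and no data was waiting."""


def process(writes):
    """Parse count-prefixed commands from socket writes, in arrival order.

    Each write is a string of newline-terminated lines: an argument count,
    then that many arguments. Returns the list of parsed argument lists.
    """
    pending = deque(writes)
    commands = []
    while pending:
        lines = pending.popleft().splitlines()
        argc = int(lines[0])
        args = lines[1:1 + argc]
        while len(args) < argc:
            # Non-blocking follow-up read: nothing queued means a crash.
            if not pending:
                raise ZygoteCrash(f"expected {argc} arguments, got {len(args)}")
            extra = pending.popleft().splitlines()
            # Toy simplification: lines beyond what is needed are dropped.
            args.extend(extra[:argc - len(args)])
        commands.append(args)
    return commands


# The injected command declares 3 arguments but carries 2, so the reader
# pulls its third argument from the real spawn that follows it.
swallowed = process(["3\ninjected-a\ninjected-b\n", "2\nreal-a\nreal-b\n"])
```

Here swallowed holds a single command, and the real spawn never shows up as a command of its own. Calling process with only the first write raises ZygoteCrash, which is the timing constraint in miniature.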
What is wrong
Unlike the devices targeted by the original finders of the vulnerability, the Ai Pin has two versions of Zygote running: one for 64 bit, one for 32 bit. This is more common on older devices, and each Zygote spawns processes of the corresponding bitness.
When we set hidden_api_blacklist_exemptions, that write goes to both Zygotes. Both Zygotes attempt to perform a process spawn, and both Zygotes look for an extra argument (which we specified). Each then attempts to read from its socket again.
As I said above, the socket read does not block and does not retry. On failure, it instantly crashes Zygote. The follow-up process spawn only goes to one Zygote (64 bit in our case), so 32 bit Zygote finds nothing to read and crashes. This would be fine, since we don't need 32 bit Zygote. But init (Android's init, not pinitd) is configured to automatically restart the entire Android system if either of the Zygote instances fails, which means we end up essentially rebooting and starting over again. This is problem #1.
I spent a month or more unaware of this issue because of a quirk of timing. If you send the hidden_api_blacklist_exemptions clear quickly enough after you send the dummy process spawn:
- Write hidden_api_blacklist_exemptions
- Immediately send am start -n com.android.settings/.Settings
- Immediately clear hidden_api_blacklist_exemptions
The clear actually gets interwoven in the Zygote response stream. The commands do not get processed out of order (and thus 32 bit Zygote eats the clear, rather than a process spawn), but the responses do. Something like this tends to happen:
- Zygote receives hidden_api_blacklist_exemptions set (but doesn't yet start waiting on status code)
- Zygote starts process spawn and sender starts waiting on 5 bytes
- Zygote sends 4 byte status code
- Sender reads those 4 bytes (probably 0) and considers that the pid. It still needs 1 more byte
- Process spawn completes, Zygote sends 5 bytes (pid int and wrapper boolean byte)
- etc...
This results in the response bytes from Zygote being all messed up and all out of order. This usually wouldn't be a problem, but AMS won't let you have two processes registered with the same pid (for obvious reasons), so once your pid starts to overlap with existing processes, you can no longer spawn anything as AMS will promptly kill it. This is problem #2.
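The byte-level mix-up can be reproduced with a small model. It assumes, from the description above, a 4 byte status reply for the settings write and a 5 byte spawn reply (pid int plus wrapper boolean). Big-endian encoding, the helper names, and the pids are assumptions of this sketch, not something taken from the Zygote source.

```python
import struct


def status_reply(code=0):
    """4 byte status code, as sent after the settings write."""
    return struct.pack(">i", code)


def spawn_reply(pid, wrapped=False):
    """5 byte spawn result: pid int followed by the wrapper boolean."""
    return struct.pack(">i?", pid, wrapped)


def read_spawn_results(stream, count):
    """Read `count` spawn results the way a sender expecting 5 bytes would."""
    results = []
    for i in range(count):
        chunk = stream[i * 5:(i + 1) * 5]
        if len(chunk) < 5:
            break  # the sender would still be waiting for more bytes here
        pid, wrapped = struct.unpack(">i?", chunk)
        results.append((pid, wrapped))
    return results


# Clean stream: every 5 byte read lines up with a spawn reply.
clean = read_spawn_results(spawn_reply(1234) + spawn_reply(1240), 2)

# Desynced stream: an unread status code sits in front of the spawn replies,
# so each 5 byte read straddles two different replies.
desynced = read_spawn_results(
    status_reply(0) + spawn_reply(1234) + spawn_reply(1240), 2
)
```

In the clean case the sender sees pids 1234 and 1240. In the desynced case the first read yields pid 0 and every later read is shifted by four bytes, so none of the reported pids match the processes that were actually spawned.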
Somehow we need to figure out how to balance these two problems. We can have perfectly clean spawns like Meta describes in the original exploit, but Zygote32 kills us. We can prevent Zygote32 from killing us, but we goof up the output stream and everything gets confused.
What I have tried
- I've messed with a bunch of timing changes, but it's hard to know what to try. This doesn't replicate in the emulator for obvious reasons, so I don't have in-depth logging
- I tried sending a spawn for separate 64 and 32 bit apps at the "same time", but it didn't appear to work. I think AMS probably only allows one spawn at a time (even though they're going to separate Zygotes), so it will wait until the 64 bit process has reported its 5 bytes before sending the 32 bit spawn
Besides the levers for sending these commands, we also have the ability to manipulate how quickly Zygote responds to the process spawn. When it spawns, it waits for the spawned process to write back the spawned pid (https://github.com/PenumbraOS/pinitd/blob/master/pinitd/src/zygote.rs#L24-L35) up to some fairly large timeout (maybe 5s?). So we can delay things here if we so desire, but I haven't found anything too interesting here