The purpose of async is mostly to avoid OS threads, and Rust decided not to go down the route of implementing user-space threads.
Instead, for async, Rust implemented the ability to basically encode a function's stack frame and instruction pointer into a "normal" (but opaque) struct. What an async runtime like tokio does is (through a few levels of useful indirection that I won't talk about) store a list of these structs and decide when it's a good idea to "call" one of them. When called, a struct either returns a final value (`Poll::Ready`) or returns a value saying "call me again later" (`Poll::Pending`), in which case the runtime presumably puts it back into its list of structs and calls it again sometime later.
Figuring out when to call a struct again is left up to the runtime, but the useful ones will do things like record which operation each struct is waiting for and call it again once that operation is ready.
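To make that concrete, here is a minimal sketch using only the standard library. It is not how tokio works internally: `CountDown`, `NoopWaker` and `run_to_completion` are names I made up for illustration, and the "runtime" is just a busy loop that polls a single struct until it returns a final value.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that answers "call me again later" `remaining` times
/// before producing its final value.
struct CountDown {
    remaining: u32,
}

impl Future for CountDown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            // Final value: the runtime will not call us again.
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            // Ask to be polled again. A real future would hand the waker to
            // whatever it is waiting on instead of waking immediately.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing, because the toy loop below re-polls unconditionally.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy "runtime": polls one future until it is ready and returns the value
/// together with the number of times it had to be called.
fn run_to_completion<F: Future>(fut: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return (value, polls);
        }
    }
}

fn main() {
    let (value, polls) = run_to_completion(CountDown { remaining: 3 });
    println!("{value} after {polls} polls");
}
```

A real runtime would not spin like this. It would park the struct and only poll it again after the waker fires, which is where the "record what operation it's waiting for" part comes in.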
> Rust decided not to go down the route of implementing user-space threads
Rust had (optional) user-space threads a long time ago, but they were removed in the pre-1.0 days because they added a lot of complexity and caused some unavoidable performance loss even when opting for native threads (supporting both forced dynamic dispatch on anything related to threading or I/O). There was a lot of discussion here, but eventually it was decided that the OS thread scheduler was in fact perfectly capable of handling large numbers of threads, and that virtual memory mapping meant the stack space allocated for each thread wasn't a big deal, so green threads were removed.