
I read the medium post (https://medium.com/replay-io/how-replay-works-5c9c29580c58), which gives an overview of how Replay works, but there are a few things I still don't understand.

1) How does the step-backward functionality work? Do you take snapshots of the JavaScript environment every so often? How do you handle destructive assignments?

2) Does Replay record actual syscalls made by the browser, or is it recording calls to the browser APIs by the JavaScript code (which I guess are effectively syscalls from the JavaScript code's perspective)?

3) The ordered lock technique described in https://medium.com/replay-io/recording-and-replaying-d6102af... makes sure that threads access a given resource in the same order, but what about threads accessing different resources in the same order? e.g. when recording, thread 1 accesses resource A before thread 2 accesses resource B. It seems like the ordered lock technique doesn't help you maintain that ordering in the replay. Is maintaining that kind of ordering across resources not actually necessary most of the time?



(Replay employee)

1. Rather than having to restore state to the point at the previous step, we can step backwards by replaying a separate process to the point before the step, and looking at the state there (this post talks about how that works: https://medium.com/replay-io/inspecting-runtimes-caeca007a4b...). Because everything is deterministic it doesn't matter if we step around 10 times and use 10 different processes to look at the state at those points.

2. We record the calls made by the browser, though it is the calls into the system libraries rather than the syscalls themselves (the syscall interfaces aren't stable/documented on mac or windows).

3. Maintaining ordering like this isn't normally necessary for ensuring that behavior is the same when replaying. In the case of memory locations, the access made by thread 2 to location B will behave the same regardless of accesses made by thread 1 to location A, because the values stored in locations A and B are independent from one another.


Thanks for the explanation! Do you ever run into performance issues with replaying from the start on each backward step, or is this not really an issue in practice? I imagine for most websites and short replays it's probably fine, but for something like a game with a physics engine it sounds like it would be too expensive and you'd need snapshots or something. I guess that's a super small percentage of the market though.

For question 3 on the ordering, I was imagining the following kind of scenario: one thread calls a system library function to read a cursor position and another calls a system library function to write a cursor position. So even though they're separate functions, they interact with the same state. Do you require users to manually call into the recorder library to give the recorder runtime extra info in this kind of scenario? Sorry if this is a dumb question, I haven't really done any programming at this level.


We definitely need to avoid replaying from the start every time we want to inspect the state at some point. This is kind of an internal detail, but we can avoid having to replay parts of the recording over and over again by using fork() to create new processes at points within the recording.
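To make the fork() idea concrete, here is a minimal sketch in Python (Unix only). It is not Replay's implementation: `step`, `inspect_from_checkpoint`, and the dictionary-based state are hypothetical stand-ins for a deterministic replaying process. The parent replays up to a checkpoint once, then forks a child for each point it wants to inspect, so the prefix before the checkpoint is never replayed again.

```python
import json
import os


def step(state, event):
    # Hypothetical deterministic "execution": apply one recorded event.
    state["counter"] += event
    state["steps"] += 1


def inspect_from_checkpoint(state, events, start, target):
    """Fork a child that replays events[start:target] and reports its state.

    The parent's `state` (the checkpoint) is left untouched, so the same
    checkpoint can be forked again for a different target.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the checkpointed state.
        os.close(read_fd)
        for event in events[start:target]:
            step(state, event)
        with os.fdopen(write_fd, "w") as out:
            out.write(json.dumps(state))
        os._exit(0)  # leave without running the parent's cleanup handlers
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as inp:
        result = json.loads(inp.read())
    os.waitpid(pid, 0)
    return result
```

For example, after applying the first three events of a recording to `state`, calling `inspect_from_checkpoint(state, events, 3, 5)` and then `inspect_from_checkpoint(state, events, 3, 4)` inspects two different later points while `state` itself stays at the checkpoint.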

Ordering constraints between different library functions do crop up from time to time. In cases like this the recorder library uses ordered locks internally (basically emulating the synchronization which the system library has to do) to ensure that the calls execute in the expected order when replaying.
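As a rough illustration of what an ordered lock does, here is a minimal Python sketch. It is an assumption about the general technique, not the recorder library's actual code: the `OrderedLock` class, its `replay_order` parameter, and the explicit `thread_id` argument are all made up for this example. When recording, the lock notes the order in which threads acquire it; when replaying, it makes each thread wait until it is that thread's turn in the recorded order.

```python
import threading


class OrderedLock:
    """Mutex that records acquisition order, or enforces a recorded one."""

    def __init__(self, replay_order=None):
        self._cond = threading.Condition()
        self._held = False
        self._replay_order = replay_order  # None means "recording" mode
        self._position = 0                 # next index in replay_order
        self.recorded_order = []           # filled in while recording

    def _my_turn(self, thread_id):
        # Replay mode only: the lock is free and this thread is next in line.
        return (
            not self._held
            and self._position < len(self._replay_order)
            and self._replay_order[self._position] == thread_id
        )

    def acquire(self, thread_id):
        with self._cond:
            if self._replay_order is None:
                # Recording: behave like a normal lock, but log who got it.
                self._cond.wait_for(lambda: not self._held)
                self.recorded_order.append(thread_id)
            else:
                # Replaying: block until the recorded order says it is our turn.
                # A thread that acquires more often than recorded waits forever.
                self._cond.wait_for(lambda: self._my_turn(thread_id))
                self._position += 1
            self._held = True

    def release(self):
        with self._cond:
            self._held = False
            self._cond.notify_all()
```

In the cursor scenario above, wrapping both the read and the write call in the same ordered lock would force them to interleave during replay the same way they did during recording, even though they are different library functions.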


Oh that's cool, using fork() to create checkpoints. Thank you again for taking the time to explain!


Thanks for the links to the blogs. I was wondering how it worked and the "How it works" bit on that page said nothing. Nice that they've explained it. It looks like the blog does answer your questions though:

> The interface which Replay uses for the recording boundary is the API between an executable and the system libraries it is dynamically linked to.

I assume the ordered locks use a global order.


As bhackett confirmed, you're right about recording at the system library call level. I wasn't sure if it was more of an analogy or only referred to a version of Replay targeting backend servers written in other languages like Go, especially since the author mentioned hooking into the JS runtime in https://medium.com/replay-io/effective-determinism-54cc91f56.... But it looks like I misunderstood, and their browser product is their generic record/replay library integrated into Firefox, rather than a reimplementation of the same concepts.



