▲ | sandGorgon 5 days ago | |||||||||||||||||||||||||||||||
is this open source ? just curious to see this. sounds fascinating! | ||||||||||||||||||||||||||||||||
▲ | dataviz1000 5 days ago | parent [-] | |||||||||||||||||||||||||||||||
What I have so far needs a lot of work and is flaky. Everyday it is getting tighter and better. Microsoft pulled out the lifecycle management code from Puppeteer and put it into Playwright with Google's copyright still at the top of the several files. They both use CDP. I'm using the Chrome extension analogue for every CDP message and listener. I need a couple days to remove all the code from the Page, Frame, FrameManager, and Browser classes and methodically make a state machine with it to track lifecycle and race conditions. It is a huge task and I don't want to share it without accomplishing that. For example, there is a system that listens for all navigation requests in a Page's / Tab's Frames in Playwright. Embedded frames can navigate to urls which the parent Frame is still loading such as advertising resources, all that needs to be tracked. There are a lot of companies that are talking about building solutions using CDP without Playwright and I'm curious how well they are going to handle the lifecycle management. Maybe if they don't intercept requests and responses it is very straight forward and simple. One idea I have is just evaluate '1+1' in the frame's content script in a loop with a backoff strategy and if it returns 2 then continue with code execution or if it times out fail instead of tracking hundreds of navigations with with 30 different embedded frames in a page like CNN. I'm still tinkering. Stagehand calls Locator.evaluate() which is what I'm building because I haven't implemented it yet. | ||||||||||||||||||||||||||||||||
|