Episode 1: Selenium vs Puppeteer discussion

Released Friday, 22nd December 2017

In this episode Neeraj, Rohit and Arbaaz discuss how Selenium differs from Puppeteer.


Transcript

Neeraj: Hi everyone, this is Neeraj from BigBinary and I’m here with two of my colleagues, Rohit and Arbaaz, and we are here to talk about Trinity, which is one of the applications we have been building for the last few months; recently we made some changes to it. But first things first, let’s start with an introduction of what Trinity is. So Rohit, why don’t you go ahead and say a few things about what Trinity is and what problem it solves.

[0:00:38] Rohit: Trinity is designed to make it easy for QA teams to record a test and then execute it again and again to verify their application in the cloud. So record once and then execute it multiple times on any browser. The way it works is that you record through a Chrome extension and the execution happens in the cloud with the help of Selenium.

[0:01:04] Neeraj: So the important thing to note here is that when we are recording using the extension, no programming is required, right?

[0:01:09] Rohit: Right.

[0:01:10] Neeraj: So as the user is clicking through the webpages, behind the scenes Trinity is recording all the elements, the clicks, all the actions, and then it automatically generates the code?

[0:01:26] Rohit: Yes. So there are two modes in the extension. One is assertion mode, in which, wherever you click on the page, you assert the text under that element, that this element should have this text present. The other mode is the browsing mode: whatever the user actually performs, clicks, typing, pressing the “Enter” button on the keyboard, will be recorded as steps, and that can be replayed back.

[0:01:53] Neeraj: In the cloud when we are replaying all these actions, we are using Selenium, right?

[0:01:58] Rohit: Right.

[0:01:59] Neeraj: Okay. So Selenium is taking care of the stuff on the back end, and on the front end we don’t have Selenium. So Arbaaz, do you want to say how we were solving the problem before we switched gears?

[0:02:16] Arbaaz: Hi guys, this is Arbaaz. I work on the Trinity extension. So on the front end, we use a browser extension for recording the tests, like whatever you click, we use the extension to record everything, and we also have a local runner: whatever test you record runs in the cloud, and if any test is failing then we run those tests in the local runner. There we inject scripts and execute those scripts to simulate a Selenium-like environment in the browser itself.

[0:03:01] Neeraj: Okay, so when we are trying to emulate the behavior of Selenium on the front end, on the browser side, using JavaScript, we are running into some challenges.

[0:03:10] Arbaaz: Yeah.

[0:03:11] Neeraj: Can you elaborate on what kind of challenges?

[0:03:14] Arbaaz: So there are many challenges here. For example, it’s very difficult to simulate hovering. How do you differentiate between a click and a hover, right? You move the mouse and you are trying to click somewhere, so how do you differentiate that from a hover? Hover is also a movement of the mouse. So that’s a challenge. And then there are the CSS selectors which we need to record, which should be unique on the whole page, and the length of the selector should be minimal and it should be traceable very fast. Sometimes the CSS changes, and these are a few of the challenges we have been dealing with.
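
For illustration, here is a naive sketch of the kind of selector generation being described. It is purely hypothetical, not Trinity’s actual implementation: prefer an id if one exists, otherwise build a short tag/nth-child path up to the nearest ancestor with an id.

    // Hypothetical sketch: build a short, unique CSS selector for an element.
    function cssPath(el) {
      const parts = [];
      while (el && el.nodeType === Node.ELEMENT_NODE) {
        if (el.id) {                       // an id is unique, so we can stop here
          parts.unshift('#' + CSS.escape(el.id));
          break;
        }
        const index = Array.prototype.indexOf.call(el.parentNode.children, el) + 1;
        parts.unshift(el.tagName.toLowerCase() + ':nth-child(' + index + ')');
        el = el.parentNode;
      }
      return parts.join(' > ');
    }

    // Usage: log a selector for every click, the way a recorder might.
    document.addEventListener('click', (e) => console.log(cssPath(e.target)), true);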

[0:04:00] Neeraj: So to overcome these problems we decided to use webdriver.io, right?

[0:04:06] Arbaaz: Yeah.

[0:04:07] Neeraj: So say a few things about what webdriver.io is and how it solves our problem.

[0:04:15] Arbaaz: So the problem we encounter right now is the parity between whatever tests you record and run in the cloud. Suppose you need to debug a single test case which is failing in the cloud and you don’t know why it is failing, then you open the local runner and you try to debug it. Because we are trying to emulate these steps in the browser, the environments are not exactly the same, right? Which is where webdriver.io comes in. So what we thought of is recreating the same Selenium setup locally using webdriver.io, since webdriver.io runs on top of Selenium.

So we thought of implementing webdriver.io so that you can easily debug, and the tests should run very fast compared to injecting scripts and monkey patches to make things work. Then we found that there is a new protocol by Google, the “Chrome DevTools Protocol”, which is even faster than Selenium. I think Rohit can elaborate more on Chrome DevTools and the performance of DevTools over Selenium.
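
As a rough illustration of the webdriver.io approach being described, here is a minimal sketch using today’s webdriverio package. The API has changed since 2017, so this is not necessarily the code discussed here, and it assumes a local Selenium/ChromeDriver setup is already running:

    // Minimal webdriverio sketch: drive a real browser through the WebDriver protocol.
    const { remote } = require('webdriverio');

    (async () => {
      const browser = await remote({
        capabilities: { browserName: 'chrome' }   // commands go through chromedriver
      });
      await browser.url('https://example.com');    // navigate
      const heading = await browser.$('h1');       // find an element
      console.log(await heading.getText());        // read its text, as an assertion step might
      await browser.deleteSession();               // tear down the browser
    })();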

[0:05:25] Neeraj: Okay, so before we go to CDP let’s just look at the full workflow, so that everyone is on the same page as far as what we were trying to achieve using webdriver.io. So in the case of webdriver.io we have this… let’s say the script is that we need to go to a URL and click on a particular element. So using JavaScript we’ll execute this command, give this command to webdriver.io, and webdriver.io in turn will use Selenium and will execute it locally in the browser. Do I get that right?

[0:06:04] Rohit: Basically we will spawn Chrome or whatever browser you want. Then we will issue commands to the Selenium server, which is just an HTTP server. There are open endpoints, they are documented, and you just hit the right endpoint for the right kind of command. For example, if you want to navigate to google.com, you will hit the /session/:id/url endpoint with a POST request, it will navigate, and you will get a response back that it has navigated to the URL. So the JavaScript just launches the browser, initializes it, and then you make HTTP calls and get the responses back. So this is how it works.
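
To make the “it’s just an HTTP server” point concrete, here is a hedged sketch of the raw WebDriver protocol calls. It assumes chromedriver running locally on its default port 9515 and Node 18+ with a global fetch; the endpoint shapes follow the W3C WebDriver spec rather than any code from this project:

    // Talk to chromedriver directly over HTTP, with no client library.
    const base = 'http://localhost:9515';

    (async () => {
      // 1. Create a session (this starts a browser).
      const created = await fetch(base + '/session', {
        method: 'POST',
        body: JSON.stringify({ capabilities: { alwaysMatch: { browserName: 'chrome' } } })
      }).then(r => r.json());
      const id = created.value.sessionId;

      // 2. Navigate: POST /session/{id}/url with the target URL.
      await fetch(`${base}/session/${id}/url`, {
        method: 'POST',
        body: JSON.stringify({ url: 'https://example.com' })
      });

      // 3. Tear the session down.
      await fetch(`${base}/session/${id}`, { method: 'DELETE' });
    })();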

[0:06:53] Neeraj: Ok, so just to make it super clear, the difference between the world where we are using webdriver.io and where we are not is that, if I need to go to a URL, in the JavaScript world, in the non-webdriver mode, you are directly executing the command in the browser: “Ok, go to this URL and click on this element.”

[0:07:19] Rohit: No. For example, in the non-webdriver case you execute a script where you change window.location.href, and when the href is changed your browser will automatically redirect you to the required page. So this is how it works in the non-webdriver.io environment.

[0:07:40] Neeraj: So in the case of webdriver.io we give the command and webdriver.io in turn sends a command to Selenium, and that’s what we mean by feature parity, or at least execution parity: in the cloud all the executions are happening through Selenium, and locally, in your browser, things are also being controlled by Selenium. So if Selenium is acting up or something is not working right, we would be able to debug it because we are more likely to reproduce the situation locally on the machine, right? Ok, fine.

So we were going down this path of implementing webdriver.io and then we came across CDP. So Rohit, what is CDP and why is it so exciting?

[0:08:32] Rohit: Yeah, before we start discussing CDP I would like to add some more details on why doing things in pure JavaScript was so difficult. One of the things that Arbaaz mentioned is hover, so I would like to give another example, which is typing text. JavaScript is a sandboxed environment. The JavaScript events that we generate are not trusted, except the [xx 08:56], which is a trusted event, so whenever we generate an event, that event object has an isTrusted property which is always false, so you can’t really fake trusted user input with JavaScript. For example, the way we simulate typing in pure JavaScript is this: let’s say we have the text “example” to type into a field. For each character in the word “example” we first dispatch the beforeinput event, then we send the keypress event for that character, and then we dispatch the input event, which was added in DOM Level 3 [xx 09:46]. After all the typing is done, after all the characters have been sent this way, we blur the element using JavaScript, so that the blur and focusout events are fired on that input, because after you are done typing, let’s say you click on a submit button, you have automatically focused out of that input field, so firing this event is also important.
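
Here is a small browser-console sketch of the kind of manual event choreography being described. The field name and the exact event order are illustrative, not Trinity’s actual code; the point is that every event created this way reports isTrusted as false:

    // Simulate typing "example" into an input purely from JavaScript.
    const field = document.querySelector('input[name="q"]');   // hypothetical field
    field.focus();
    for (const ch of 'example') {
      field.dispatchEvent(new KeyboardEvent('keydown',  { key: ch, bubbles: true }));
      field.dispatchEvent(new KeyboardEvent('keypress', { key: ch, bubbles: true }));
      field.dispatchEvent(new InputEvent('beforeinput', { data: ch, bubbles: true }));
      field.value += ch;                                        // the browser won't do this for us
      field.dispatchEvent(new InputEvent('input',       { data: ch, bubbles: true }));
      field.dispatchEvent(new KeyboardEvent('keyup',    { key: ch, bubbles: true }));
    }
    field.blur();                                               // fire blur/focusout ourselves

    console.log(new KeyboardEvent('keydown').isTrusted);        // false: scripted events are untrusted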

So simulating this is really hard. Typing we were able to do; hover, I’m not sure, we haven’t done it, but hover is another… the issue with hover is that we can move the mouse on top of the element, but the hover… if that element, you know, [xx 10:31] triggers a dropdown when you move the mouse over it, that will not happen. The mouse was moved but the hover action was not triggered, so that’s why doing things in pure JavaScript is not the right way.

[0:10:39] Neeraj: So let’s just look a little bit deeper into it. Once again, what we’re saying is that, let’s say a browser is open and using the console we pass a JavaScript command, and that will be executed in the context of the page. And what you are saying is that if we make the mouse go over a particular element, we can do that, but it will not fire the hover action, because it is not a trusted event, right?

[0:11:15] Rohit: Right, the mouse will move where you want but it will not trigger the hover state of the element. And, for example, just think about it: if you could move the mouse outside of the DOM, maybe to a file dialog… I don’t think that’s even possible with JavaScript, but if it were possible that would be a really big security issue, that you are able to control the Chrome window yourself. That’s why, for security reasons, those events are not trusted.

[0:11:45] Neeraj: Ok, so the other thing you mentioned was about the input field. Normally, when we go to the web browser and we type in the input field and then we go to the next field, whatever keypress, blur and all those events need to be fired, the browser takes care of it. So when it takes care of it we don’t need to worry about it, but in this case what you are saying is that if we simulate that using JavaScript then we need to take care of all these low-level events, meaning that, ok, now that I am finished typing, let me fire onblur and all those kinds of things, right?

[0:12:33] Rohit: Right, exactly. So we have to figure out the order in which these events are fired, and then for each character we need to fire those events, so we need to do everything manually. In Selenium this happens through the individual drivers; for example, ChromeDriver would take care of it, or the Firefox driver would take care of it.

[0:12:50] Neeraj: So these problems that we just talked about will be taken care of if we use webdriver.io, right?

[0:12:56] Rohit: Yeah, webdriver.io also goes through the individual WebDriver implementations. For Chrome it’s ChromeDriver, [xx 13:04], and for Edge Microsoft came out with its own driver. Since the commands go through these drivers, which also use CDP internally, those commands will be trusted and they will really perform the actions. So we don’t have to… we just have to say “type this text into the field” and that’s it. It’s going to work just like the user would have typed it.

[0:13:28] Neeraj: So you mentioned CDP. What is CDP?

[0:13:30] Rohit: Yes, so CDP is the protocol that is used behind the scenes by the Chrome web inspector. So the Chrome developer tools use this protocol. CDP stands for Chrome DevTools Protocol, and anything that is implemented by the inspector, the Chrome web inspector, you know, the front end is one aspect and the backend is all handled through the CDP protocol.

So it allows a lot of things, not just, you know, the basic actions on an element, like clicking an element or focusing an element, but it also gives the ability to do profiling of the page and performance analysis of the JavaScript on the page.

The additional benefit of CDP vs Selenium, even though Selenium, in this case the individual WebDriver implementation, let’s say ChromeDriver, also uses CDP behind the scenes, is that WebDriver is a limited spec, a limited API. For example, one of the things that we had to implement is detecting when we are not able to open a website. For example, localhost: if the user accidentally types localhost and we try to go to that URL, then we will receive “404 Not Found” or some message [xx 14:48] which is not exposed in the WebDriver spec. So we can send the command to go to a particular URL, but we don’t know what the status of that command is, because that is not there.

[0:15:01] Neeraj: So you mean the HTTP status of the response?

[0:15:04] Rohit: Right.

[0:15:05] Neeraj: Based on when we send the command and the command happens to be go to URL?

[0:15:09] Rohit: Right. So we don’t have that status in webdriver.io or in Selenium, and the suggested way to fix it is to use a proxy. You know, you have the proxy in between and you send commands through it, and from the proxy you can figure out what the return code was and all that. But in CDP we have the ability to find the status code of the response.
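
For comparison, with a CDP-based tool like Puppeteer the navigation response is exposed directly, so a sketch of the status check being described might look like this. This is a minimal illustration with a made-up URL, not the project’s actual code:

    // Puppeteer's goto() resolves to the main-frame navigation response.
    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch({ headless: true });
      const page = await browser.newPage();
      const response = await page.goto('http://localhost:3000/');  // hypothetical URL
      console.log(response.status());                              // e.g. 200, 404, 500 ...
      await browser.close();
    })();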

[0:15:33] Neeraj: So just to be clear: when we open the Chrome developer console and we can look at performance, the console, the elements and the whole panel, that panel talks to the browser itself using this CDP, right?

[0:15:49] Rohit: Yes.

[0:15:50] Neeraj: So CDP is not a new thing. It has been there since the Chrome developer console came out, but it’s just that now the Chrome team has been opening up the CDP API more and more, so that not only their own developer tools can use the protocol to talk to the Chrome browser, but others can build tools to talk to the browser, right?

[0:16:10] Rohit: Yeah, I first learned about it when headless Chrome came out, because the primary way to interact with headless Chrome is through CDP. So yeah, headless Chrome is one of the best things that happened to Chrome and to CDP: we can use that browser and not just perform our actions, but also do performance analysis, profiling, snapshots and other things.

[0:16:45] Neeraj: So one thing that is clearly… one more feature of webdriver.io is that in webdriver.io we tell it which browser we are going to execute in and we give the command, and Arbaaz, what does webdriver.io do, it maps the action to the right command that needs to be executed for that browser?

[0:17:04] Arbaaz: Webdriver.io internally takes care of everything. So if this code doesn’t work [xx 17:16] browser, then they have this mapping, so under the hood they will make another API call and see whether it’s working or not. If it doesn’t work then they’ll give you a response… they’ll raise an error in case the command cannot be performed.

[0:17:34] Neeraj: OK. So we have the Chrome browser and there are two ways we can talk to the browser: one is using webdriver.io, where the scripts will be executed, and the other one is through CDP. So in the case of CDP, if we go with that route, then all our code and everything is very closely tied to the Chrome browser, right?

[0:17:53] Rohit: Right.

[0:17:54] Neeraj: So during our investigation, I think Rohit, you pointed out that it seems like, maybe it’s at a very early stage, but it looks like Firefox and IE are not only going to support the WebDriver protocol, but it also looks like they are looking at supporting CDP, is that right?

[0:18:10] Rohit: Yeah, it looks like, you know, I did some searching on Bugzilla, Firefox’s bug tracker [xx 18:17], and found some… for instance, they are discussing implementing some of the CDP commands. So it may not be 100% parity with the protocol, but it looks like much of it will be implemented. There are also other libraries out there, and there is a nice project on GitHub; the user is ChromeDevTools and the name of the [xx 18:49] is awesome-chrome-devtools, which has links to various adapters which convert from the CDP protocol to the native browser protocol. For example, if you’re using the Edge browser then there is a library to convert the CDP commands to the ones implemented by Edge, and a similar adapter exists for Firefox. So I think it’s still early days for this, but my guess is that eventually all of these browsers will have a mature protocol just like CDP. It may not have 100% parity with it, but they will be good [xx 19:26], from one command that works in [xx 19:30] to [xx 19:31] just the exact same in, let’s say, Firefox.

[0:19:34] Neeraj: So we have the CDP protocol, and then Google also came up with headless mode, so now we can run the browser in headless mode. With headless mode and CDP, it seems like now, instead of going through the Selenium route, at least a possibility has opened up where we can run the automation tests really fast and we can also talk to the browser. It also means that we no longer need to rely on solutions like Nightmare and PhantomJS, and anyway I think those projects were not very well maintained for the last couple of years, because it’s a lot of work maintaining big open source applications like that. But now, with the availability of headless Chrome and CDP, we can control the browser ourselves, so a lot of tools were coming out, and then Google came up with its own tool called Puppeteer, right?

[0:20:33] Rohit: Right. So Puppeteer is a much higher-level API. If you are going to use CDP to interact with the browser, you should not use CDP directly, you should use some higher-level library. There are two of them: one is CRI, which is lower-level, and one is Puppeteer, which has implemented things like the element handle object, the page object and other things which don’t exist in CDP, there is no direct mapping.

One example I can give is that they have a function for waiting for navigation. When you navigate to a website, they have a function, waitForNavigation(), which will wait for the navigation to complete. They have implemented it with the help of some of the CDP commands as well as JavaScript that they execute in the browser. So Puppeteer and CRI are both there, but for us the right approach is to use Puppeteer.
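
To make that concrete, here is a minimal Puppeteer sketch, illustrative only and with made-up URLs and selectors, showing the page object, an element handle and waitForNavigation():

    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch({ headless: true });
      const page = await browser.newPage();                        // high-level "page" object
      await page.goto('https://example.com/login');                // hypothetical URL

      const button = await page.$('button[type="submit"]');        // element handle
      await Promise.all([
        page.waitForNavigation(),                                   // resolves when navigation completes
        button.click()                                              // the click that triggers it
      ]);

      await browser.close();
    })();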

[0:21:50] Neeraj: So when Rohit says CRI, that stands for Chrome Remote Interface. That’s a Node.js library by someone who is not part of the Google team, and in the earliest stages of Puppeteer, Puppeteer was using CRI, but as of today Puppeteer no longer depends on CRI and it talks to CDP directly. So if you need more control over how things are working, and if you want low-level control, then one can use CRI, but otherwise Puppeteer is pretty good.
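
For contrast with the Puppeteer sketch above, here is roughly what the same navigation looks like with the lower-level chrome-remote-interface library. Again a hedged illustration: it assumes a Chrome instance already running with --remote-debugging-port=9222.

    // Raw-ish CDP via chrome-remote-interface: you work with protocol domains directly.
    const CDP = require('chrome-remote-interface');

    (async () => {
      const client = await CDP({ port: 9222 });      // attach to a running Chrome
      const { Network, Page } = client;
      await Network.enable();
      await Page.enable();

      Network.responseReceived(({ response }) => {   // CDP events, e.g. to read status codes
        console.log(response.status, response.url);
      });

      await Page.navigate({ url: 'https://example.com' });
      await Page.loadEventFired();                    // wait for the load event
      await client.close();
    })();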

I was watching some of the videos from the Chrome Dev Summit, which ended just yesterday, and there was one presentation about Puppeteer and the Chrome Remote Interface and how the Chrome team has been evolving it. So I’ll post a link to that. It’s a very good presentation by Paul Irish and his co-worker about how Puppeteer came about and what kind of problems they were solving. So on the front end side we were having this issue where we were initially executing everything using JavaScript, then we decided to start using webdriver.io, and now we have decided to switch gears one more time and use CDP. Is that right?

[0:23:11] Rohit: Right.

[0:23:13] Neeraj: Okay.

[0:23:15] Rohit: There are many benefits to CDP. One is that we get the cutting-edge new things that Chrome developers are adding to the DevTools with each release. For example, we have the ability to emulate an iPhone or some other device, we will have the ability to emulate [xx 23:41] conditions, and we can also do, like I mentioned previously, [xx 23:46] and other things. But, you know, in the local runner, webdriver.io would have worked, but there might have been some speed issues; it wouldn’t have been as fast as CDP. And CDP gives us everything that Selenium has, because with Selenium, you know, the individual WebDriver implementation, for example ChromeDriver, also goes through CDP. So everything eventually comes down to CDP, but ChromeDriver only exposes a limited set of APIs. By directly using CDP, we get everything that is there in the WebDriver spec, plus more: we are able to use the cutting-edge things in each release, and also things like profiling and emulation.
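
A hedged sketch of the device emulation being mentioned, using Puppeteer’s page.emulate() with a hand-written descriptor. The viewport numbers and user agent below are just illustrative, not an official device profile:

    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch({ headless: true });
      const page = await browser.newPage();

      // Emulate a small touch device: viewport + user agent in one call.
      await page.emulate({
        viewport: { width: 375, height: 667, deviceScaleFactor: 2, isMobile: true, hasTouch: true },
        userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit/604.1.38'
      });

      await page.goto('https://example.com');
      await page.screenshot({ path: 'mobile.png' });  // quick way to eyeball the emulated layout
      await browser.close();
    })();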

[0:24:33] Neeraj: So by using CDP we are closely aligning the application with the Chrome browser, and that would mean that if tomorrow Firefox and IE support CDP then that’s good; otherwise we might have to rely on tools which convert the CDP commands to the other browsers’ commands in order to do cross-browser testing.

[0:24:50] Rohit: Right. So with webdriver.io, which is an interface to the WebDriver spec, it basically makes HTTP requests to the endpoints that are in the WebDriver spec, and with that you can support any browser. You can support Firefox, IE and so on. But with CDP you are limited to Chrome for [xx 25:21].

[0:25:22] Neeraj: Another reason for aligning closely, which came out as part of our internal discussion, is that with each passing week, each passing month, browsers are getting more and more similar, at least as far as end-to-end integration testing is concerned. The return on investment of cross-browser testing is getting lower and lower. Of course, for certain applications it is paramount and they need to run it, and they should run it. But as part of our own work at BigBinary, we are seeing that, compared to a couple of years ago, we are doing less and less testing on the IE side. So Arbaaz, you have been doing front end work for a number of years, is that right? Are you seeing this kind of trend where, across browsers, you have to do less and less fiddling and more and more things are getting similar?

[0:26:28] Arbaaz: Yeah, yeah. The standards define everything now, so you don’t have to write much code; otherwise you had to write IE-specific scripts, IE-specific CSS. Now these consortiums, together, are creating a similar environment everywhere. So it’s really good for us, for developers; we don’t have to write much code for other browsers.

[0:26:51] Rohit: If you are supporting Edge and you are not concerned about browsers older than Edge, like IE9, IE10, then I would say that just by testing on Chrome you have covered almost 95%, because everything is standardized, browsers are no longer deviating that much. There are still some issues here and there, but mostly everything is standardized. So a [xx 27:22] website, a web application that works on Chrome should definitely work on Edge and Firefox.

[0:27:28] Neeraj: So initially when we started the Trinity project, we had two primary goals in mind. One was to make sure that, as part of development, as part of sending a new pull request, we are not breaking any of the existing functionality. The other was to make sure that the existing functionality works as expected in different browsers. And I think at this step we are putting more emphasis on the first requirement, which is that we want to make sure that as part of our day-to-day work we are not breaking anything, rather than ensuring that things work across browsers, because what came out of our internal discussion is that we’re hoping cross-browser testing will become less and less of a requirement in the future. So it’s better to put more emphasis on integration testing, to ensure that things are not breaking in Chrome itself.

[0:28:29] Yeah.

[0:28:30] Neeraj: OK. And you mentioned profiling: we can get some information from CDP regarding profiling of the browser, but I think the Google Chrome team has a different product called Lighthouse. Initially I think they were bundled together, but now Lighthouse is a separate open source library. Using Lighthouse we can get more of this information, like, once we run the automation test, how much of the JavaScript has been executed, how much of the CSS, how much time it took for painting and rendering, and all the network events and everything. So I think some of that is available in CDP, but I think the bulk of it has moved to Lighthouse, so check out the Lighthouse project.
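
For reference, here is a rough sketch of driving Lighthouse programmatically. It follows the pattern in Lighthouse’s own docs, but the exact return shapes and module loading vary between versions, so treat it as an assumption-laden illustration rather than the project’s setup:

    const lighthouse = require('lighthouse');
    const chromeLauncher = require('chrome-launcher');

    (async () => {
      // Launch a headless Chrome that Lighthouse can attach to over CDP.
      const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
      const result = await lighthouse('https://example.com', { port: chrome.port, output: 'json' });

      // result.lhr is the Lighthouse report object (scores, timings, audits ...).
      console.log(result.lhr.categories.performance.score);

      await chrome.kill();
    })();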

[0:29:29] Rohit: Yeah, definitely. I haven’t looked much into Lighthouse, but like I mentioned, the first paint time, or when the navigation started, all of this we can get through the CDP protocol, through the profiling objects… but I will definitely check out Lighthouse.

[0:29:45] Neeraj: Ok, that sounds good. Arbaaz, Rohit, anything else that we should talk about that we haven’t talked about?

[0:29:51] Rohit: Yeah, one thing I would like to mention is that one of the shortcomings of the CDP protocol was that it was not multiplexed: only one client could use it at a given time. So, for example, if you are controlling the browser through Selenium, which also goes through CDP, you cannot at that time open the web inspector, because the web inspector also uses CDP; only one client at a time can use it.

And this was an issue. For example, if you are debugging something, if you have added some console.log statements [xx 30:43] and you would like to see what is being printed on the console while Selenium is automating, is controlling the browser, you can’t do it. I mean, you can open the inspector, but Selenium would stop, and only when you close the inspector will Selenium continue from that point. So starting with version 63 of Chrome, I believe, we will have multi-client support in CDP, so that multiple clients will be able to drive the browser simultaneously. For example, right now only one of the tools like Puppeteer or CRI can be used at a time, but after this has been released we’ll be able to use both of these tools simultaneously, which I think is really good.

[0:31:33] Neeraj: So yesterday at the Dev Summit, Paul Irish merged that pull request during the presentation, on stage. He said it was one of the longest-running issues, it had been open for years, and yesterday he was happy to see it finally get fixed.

[0:31:51] Arbaaz: One thing I am excited about is PWAs, progressive web applications. I haven’t watched yesterday’s Dev Summit, but it seems more and more people are moving towards PWAs from native applications, right? So using CDP, testing PWAs would be very easy in the future. For example, you have a progressive web application; in India we have Flipkart, Housing, all these websites. These applications work offline, and they have their service workers, local storage, their performance, painting. So everything can be tested end-to-end using the CDP protocol. There are many features coming soon or in the experimental phase. So in the future, testing these applications under various conditions, how they work offline, how they work on 2G, 3G, 4G, can be done very easily. So I’m pretty excited about that as well.

[0:32:57] Neeraj: Cool.

[0:32:58] Rohit: Yeah, I would like to add that Puppeteer actually [xx 33:03] that pull request about 12 days back. It’s called setOfflineMode, on the page object. So you’ll be able to emulate offline mode, basically.
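
A tiny sketch of what that looks like in Puppeteer; the method name setOfflineMode is the one mentioned here, while the URL and the failure check are just illustrative:

    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch({ headless: true });
      const page = await browser.newPage();

      await page.setOfflineMode(true);                  // cut the network for this page
      const failed = await page.goto('https://example.com').catch(err => err);
      console.log('offline navigation failed as expected:', failed.message);

      await page.setOfflineMode(false);                 // back online
      await browser.close();
    })();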

[0:33:14] Neeraj: Oh, interesting. Another issue that I’m monitoring on Puppeteer is that sometimes we need to go one level deeper, one level down, and deal with CDP at a low level, but right now Puppeteer doesn’t allow that. So there’s an issue to punch holes through Puppeteer so that, if needed, we can get low-level control, and once that happens it will be really awesome, it will make Puppeteer much more… I mean, then we don’t have to depend on Puppeteer to bring all these hot new features up to the high level; if needed, one can always step down to the CDP level and do all the things that CDP makes available.

[0:34:04] Arbaaz: Yeah.

[0:34:05] Rohit: Yeah.

[0:34:06] Neeraj: Oh, this has been interesting. Thank you guys for joining the call!

[0:34:11] Arbaaz: Yeah.

[0:34:12] Rohit: Yeah. Thank you Neeraj! Thank you Arbaaz! Once again thank you Neeraj for driving the discussion.

[0:34:15] Neeraj: Sure, thanks guys! Bye.
