This tutorial covers the Actions class in Selenium.
Previous tutorials showed that most web page interactions can be performed with WebDriver element commands, such as clicking a button with the click() method or entering text in a text box with sendKeys(). Even a form can be submitted with the submit() command. For drop-down lists, WebDriver provides the Select class, which has methods like selectByValue() and deselectAll().
If almost all interactions with web page elements can already be performed, why do we need a separate class like Actions? It is true that the WebDriver commands are sufficient for basic interactions. In many cases, however, basic interactions are not enough, and the automation script needs to perform more complicated ones.
Consider a case where a user needs to double-click an element, drag and drop an element, or enter uppercase letters in a text box using Shift + letter keys. These cases call for advanced API methods that can handle complex interactions, and that is the purpose of the Actions class.
In this tutorial, we cover one of the important classes in the Advanced User Interactions API of WebDriver: the Actions class in Selenium.
What is the Actions class in Selenium?
The class description for the Actions class says:
The user-facing API for emulating complex user gestures. Use this class rather than using the Keyboard or Mouse directly.
Implements the builder pattern: Builds a CompositeAction containing all actions specified by the method calls.
The first sentence of the description makes clear that the Actions class is an API for performing complex user interactions such as double-click and right-click, and that it should be used instead of working with the Keyboard and Mouse directly.
The Actions class implements the builder pattern, which is one of the classic design patterns. The intent of the builder pattern is to separate the construction of a complex object from its representation, so that the same construction process can produce different results. In the case of Actions, the builder produces a CompositeAction containing all actions specified by the method calls. If that sounds confusing, an example helps:
Consider the construction of a home. The home is the end product (object) returned as the output of the construction process. Construction involves many steps, such as building the basement, then the walls, and finally the roof. Once all steps are complete, the whole home object is returned. Using the same process, you can build houses with different properties.
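The home analogy can be sketched in plain Java. This is only an illustration of the builder pattern, not Selenium code, and every name in it (BuilderDemo, House, HouseBuilder, and the method names) is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: all names here are hypothetical and unrelated to Selenium.
class BuilderDemo {

    // The finished product returned at the end of the construction process.
    static final class House {
        private final List<String> parts;

        House(List<String> parts) {
            this.parts = new ArrayList<>(parts); // copy so later builder calls cannot change it
        }

        String describe() {
            return String.join(" + ", parts);
        }
    }

    // The builder: each step records one part and returns the builder for chaining.
    static final class HouseBuilder {
        private final List<String> parts = new ArrayList<>();

        HouseBuilder basement(String material) { parts.add(material + " basement"); return this; }
        HouseBuilder walls(String material)    { parts.add(material + " walls");    return this; }
        HouseBuilder roof(String material)     { parts.add(material + " roof");     return this; }

        // build() hands back the finished house.
        House build() { return new House(parts); }
    }

    public static void main(String[] args) {
        // Same construction process, different properties.
        House brick = new HouseBuilder().basement("concrete").walls("brick").roof("tile").build();
        House cabin = new HouseBuilder().basement("stone").walls("timber").roof("shingle").build();
        System.out.println(brick.describe());
        System.out.println(cabin.describe());
    }
}
```

Because every step returns the builder itself, the calls can be chained in one statement, which is the same style the Actions class uses.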
The class offers a large collection of methods. The screenshot below shows them, marked in blue and orange.
It is also important to note that there is another type called Action, which is different from the Actions class. You may have noticed from the top blue line in the above screenshot that the build method returns an Action. So what is Action, and how does it differ from the Actions class? Let’s have a look.
What is Action Class?
Although it is often called the Action class, Action is actually an interface, not a class.
It represents a single user interaction and is used to perform the series of action items built by the Actions class.
What is the difference between Actions Class and Action Class in Selenium?
From the explanations above, we can conclude that Actions is a class based on the builder design pattern. It is the user-facing API for emulating complex user gestures.
Action, on the other hand, is an interface that represents a single user-interaction action. It declares the widely used perform() method.
How to Use Actions class in Selenium?
Let’s understand the working of Actions class with a simple example:
Consider a scenario where uppercase letters must be entered in a text box. We will use the text box on ToolsQA’s demo site https://demoqa.com/autocomplete/
Manually, this is done by pressing the Shift key, typing the text while keeping Shift pressed, and then releasing the Shift key. In short, Shift and the letter keys are pressed together.
To emulate the same action in an automation script, use the Actions class as follows:
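1. Import package:
The Actions class and the Action interface reside in the org.openqa.selenium.interactions package of the WebDriver API. To use them, import them:
import org.openqa.selenium.interactions.Actions;
import org.openqa.selenium.interactions.Action;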
2. Instantiate Actions class:
An Actions object is needed to invoke its methods. So, let’s instantiate the Actions class. As the constructor signature shows, it needs a WebDriver object:
Actions actions = new Actions(driver); // driver is your WebDriver instance
3. Generate actions sequence: A complex action is a sequence of multiple actions. In this case, the steps are:
- Pressing Shift Key
- Sending desired text
- Releasing Shift key
For these actions, Actions class provides methods like:
- Pressing Shift Key : Actions Class Method => keyDown
- Sending desired text : Actions Class Method => sendKeys
- Releasing Shift key : Actions Class Method => keyUp
The keyDown method presses a modifier key (after focusing on the element, if one is passed), whereas the keyUp method releases a pressed modifier key.
A modifier key is a key that modifies the action of another key when the two are pressed together, such as Shift, Control, or Alt.
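These actions are performed on a web element. So, let’s find the web element and generate the sequence:
WebElement element = driver.findElement(By.id("textBoxId")); // placeholder id; use the real locator of your text box
actions.keyDown(element, Keys.SHIFT); // Keys comes from org.openqa.selenium.Keys
actions.sendKeys("text to enter in uppercase");
actions.keyUp(Keys.SHIFT);
An important thing to note here is that if you hover over any Actions class method in your IDE, you will notice that it returns the Actions object itself.
This is the beauty of the builder pattern: all the actions can be chained together in a single statement:
actions.keyDown(element, Keys.SHIFT).sendKeys("text to enter in uppercase").keyUp(Keys.SHIFT);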
4. Build the actions sequence:
Now, build this sequence using the build() method of the Actions class to get the composite action. The build method generates a composite action containing all the actions added so far, ready to be performed.
Action action = actions.build();
Notice that the build method returns an object of type Action. It represents the composite action built from the sequence of multiple actions. This clarifies the second part of the Actions class description: the Actions class implements the builder pattern, meaning it builds a CompositeAction containing all actions specified by the method calls.
5. Perform actions sequence: Finally, perform the actions sequence using the perform() method of the Action interface:
action.perform();
That’s it. Once execution passes this point, you will see the action take place in the browser.
The same steps can be followed to use other Actions class methods for different combinations of complex user gestures.
Catch in Actions Class:
In the above screenshot of the Actions class, where the different methods are highlighted, take a look at the bottom blue line. Notice that the perform() method is available in the Actions class as well. This means the sequence can be performed directly, without going through the Action interface:
actions.keyDown(element, Keys.SHIFT).sendKeys("text to enter in uppercase").keyUp(Keys.SHIFT).perform();
You may be wondering what happened to the build step. The perform() method of the Actions class calls build() internally, so there is no need to call it explicitly.
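To make the relationship between build() and perform() concrete, here is a toy model in plain Java. It only mimics the shape of the API described above and is not Selenium's implementation. The names ActionsSketch and MiniActions are hypothetical, and the steps write to a log instead of driving a browser. Clearing the recorded steps inside build() is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model with hypothetical names; it mimics the shape of the API, not Selenium's internals.
class ActionsSketch {

    // Counterpart of the Action interface: a single perform() method.
    interface Action {
        void perform();
    }

    // Counterpart of the Actions builder: records steps, then builds one composite Action.
    static class MiniActions {
        private final List<Runnable> steps = new ArrayList<>();
        private final List<String> log;

        MiniActions(List<String> log) {
            this.log = log;
        }

        // Each method only records a step and returns the builder for chaining.
        MiniActions keyDown(String key)   { steps.add(() -> log.add("keyDown " + key));   return this; }
        MiniActions sendKeys(String text) { steps.add(() -> log.add("sendKeys " + text)); return this; }
        MiniActions keyUp(String key)     { steps.add(() -> log.add("keyUp " + key));     return this; }

        // Snapshot the recorded steps into one composite Action and reset the builder.
        Action build() {
            List<Runnable> snapshot = new ArrayList<>(steps);
            steps.clear();
            return () -> snapshot.forEach(Runnable::run);
        }

        // Shortcut: build the composite action and run it in one call.
        void perform() {
            build().perform();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        new MiniActions(log).keyDown("SHIFT").sendKeys("abc").keyUp("SHIFT").perform();
        System.out.println(log); // nothing is logged until perform() runs
    }
}
```

In this sketch nothing happens while the steps are being chained. The recorded steps run only when perform() is called, either on the built Action or directly on the builder.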
Methods in Actions class of Selenium
There are a lot of methods in this class which can be categorized into two main categories:
- Keyboard Events
- Mouse Events
Different Methods for performing Keyboard Events:
- keyDown(modifier key): Performs a modifier key press.
- sendKeys(keys to send): Sends keys to the active web element.
- keyUp(modifier key): Performs a modifier key release.
Different Methods for performing Mouse Events:
- click(): Clicks at the current mouse location.
- doubleClick(): Performs a double-click at the current mouse location.
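- contextClick(): Performs a context-click (right-click) at the current mouse location. The overload contextClick(target) performs it at the middle of the given element.
- clickAndHold(): Clicks (without releasing) at the current mouse location. The overload clickAndHold(target) clicks in the middle of the given element.
- dragAndDrop(source, target): Performs click-and-hold at the location of the source element, moves to the location of the target element, then releases the mouse
- dragAndDropBy(source, xOffset, yOffset): Performs click-and-hold at the location of the source element, moves by the given offset, then releases the mouse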
- moveByOffset(x-offset, y-offset): Moves the mouse from its current position (or 0,0) by the given offset
- moveToElement(toElement): Moves the mouse to the middle of the element
- release(): Releases the depressed left mouse button at the current mouse location
To see the complete list of all methods visit https://seleniumhq.github.io/selenium/docs/api/java/org/openqa/selenium/interactions/Actions.html
In the next tutorials, we will do a lot of practice using Actions Class. Please take a look at the below tutorials:
- Right-click in Selenium
- Drag-n-Drop in Selenium
- Verify Tool-Tip in Selenium
- Double-click in Selenium
- KeyBoard Events in Selenium