Determining Strategies on Playing Badminton using the Knuth-Morris-Pratt Algorithm

Mastery of technique is a core ability that badminton players must possess. One part of this technique is the strategy of proper shuttlecock placement, so that the opposing player finds the shuttlecock difficult to return. Therefore, this study aims to build a computational model, together with its implementation, that provides predictions/recommendations to trainers and players for determining shuttlecock placement and stroke strategies. The proposed model takes into account historical game patterns played by world-class athletes. String matching using the Knuth-Morris-Pratt algorithm and a clustering method are then used to produce candidate strategies for striking the shuttlecock. The model is implemented in the R programming language. Several experiments, involving 20 series of world matches collected as historical data, were conducted to validate the system. The results indicate that the system can be used as an alternative tool for players and coaches to determine placement and stroke strategies in badminton.


Introduction
Nowadays, applications of sport science can be found throughout the literature. Examples include the work of Manuel [1], which uses cooperative game theory to determine strategies in sport; video games used for health and physical education [2]; and data mining of brain activity using electroencephalography (EEG) to address sport science problems [3]. Moreover, an analysis of the development of sporting activities of German first Bundesliga professionals was presented in [4]. Basically, sport science can be defined as the application of scientific principles to help improve sports performance [5]. Thus, sport science can be regarded as a combination of several fields, including physiology, psychology, mathematics, and computing.
One of the most popular sports in the world is badminton. This can be seen from the regularly held events in which it is always contested, such as the Olympics, the Asian Games, and the SEA Games. In badminton, one of the keys to victory is a combination of tactics and strategies for returning the shuttlecock so that it is difficult for the opponent to reach. For example, the types of shot that can be applied in a game include the drive, clear/lob, drop shot, netting, and smash. Accuracy, speed, and the combination of shots are important parts of a winning strategy. Sport science can therefore contribute to winning a badminton game, for example through a recommendation system that offers players and trainers good alternative strategies.
In previous work on sport science in badminton, several approaches have been introduced. For example, El-Gizawy [6] identified the effect of visual training on the accuracy of badminton attack shots. The biomechanics of the backhand, short serve, and smash were investigated in [7] for the treatment and prevention of injury. In 2009, the shuttlecock's trajectory and the relationship between the air resistance force and the shuttlecock's speed were simulated and validated [8]. The results of that research showed that the drag force was proportional to the square of the shuttlecock's velocity. Moreover, detection and classification of the court, players, strokes, and players' strategies were carried out in [9]. Based on this brief review, we can state that no study has proposed a system that is able to determine strategies for the shuttlecock's placement and strokes.
ISSN: 1693-6930
Therefore, this research focuses on building a computational model that provides predictions/recommendations to players and coaches for determining proper shuttlecock placement. It consists of several aspects: determining the zones and shot types in badminton, collecting data from the games of popular world-class players as historical data, string matching using the Knuth-Morris-Pratt algorithm [10], and decision making using a clustering method [11], namely the K-means method for discrete data [12]. The model is then implemented as a recommendation system in R, a programming language whose ecosystem provides many packages for statistical analysis, machine learning, graphics, etc. [13]. Given simple patterns/sequences representing zones and shot types in badminton, the system is able to predict the next shots as a recommendation for trainers and players.
The remainder of this paper is structured as follows. Section 2 describes the research method, including the research design and an introduction to string matching using the Knuth-Morris-Pratt algorithm. Section 3 presents one of the contributions of this research, namely the construction of the computational model and its implementation. To test the proposed system, Section 4 describes the experimental study, covering data collection and scenarios. Results and analysis are presented in Section 5, and Section 6 concludes the research.

Research Method

Research Design
This research is conducted in several steps, as illustrated in Figure 1. After the research preparation (i.e., problem identification and formulation, research objectives, and literature study), we construct a computational model for determining strategies for the placement of shuttlecock shots. It consists of four main stages: determining the zones and stroke types, preprocessing the data, matching sequences using the Knuth-Morris-Pratt algorithm, and making the decision. A detailed explanation of the model is presented in the next section. The next step is to implement the model in the R programming language. After the system is developed, several experiments are conducted. The data used in these experiments are obtained from world badminton competitions recorded on http://youtube.com. Finally, the results are analyzed to draw conclusions.

String Matching Using the Knuth-Morris-Pratt Algorithm
String matching is the process of finding certain string patterns in large volumes of text [14]. In other words, it searches for all occurrences of short strings (i.e., patterns) in longer strings of text. Many approaches and implementations have been proposed: a practical comparative study of syntactic and semantic techniques was presented in [15], universal string matching was introduced in [16], and the Boyer-Moore algorithm was used in an intrusion detection system [17]. Moreover, a string-matching method based on the Levenshtein distance was used in a spelling correction system able to fix real-word and non-word errors [18].
Broadly, string matching algorithms are divided into two categories: exact string matching and approximate (inexact) string matching [19]. The first requires the matched strings to contain the same characters in the same number and order. The second matches strings that are similar but may differ in the number or order of their characters; it covers approximate string matching and phonetic string matching. It should be noted that this study focuses on the first category.
Several algorithms fall into the exact string matching category, such as the Knuth-Morris-Pratt algorithm [10], which is the focus of this research. The algorithm was developed by D. E. Knuth, J. H. Morris, and V. R. Pratt in 1977. Unlike the brute-force (i.e., naïve) algorithm, which checks every character and shifts the pattern by a single position after each mismatch, the Knuth-Morris-Pratt algorithm stores information about where the pattern fails to match the text and uses it to determine how far to shift. The pattern can thus be shifted further according to the stored information, which significantly reduces the search time. For a pattern of length m and a text of length n, the brute-force algorithm has a complexity of O(mn) because it tries every alignment of the pattern against the text [20], whereas the complexity of the Knuth-Morris-Pratt algorithm is O(m+n). Because of limited space, we cannot explain the algorithm completely; interested readers can find it in [10,21,22].
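To make the role of the stored mismatch information concrete, a minimal sketch of the Knuth-Morris-Pratt search is given below. It is written in Python purely for illustration (the system described in this paper is implemented in R), and the function names `prefix_table` and `kmp_search` are assumptions of this sketch, not identifiers from the authors' code.

```python
def prefix_table(pattern):
    """Failure function: table[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    table = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    table = prefix_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]  # shift the pattern using the stored table
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # continue, allowing overlapping matches
    return matches
```

The text index `i` never moves backwards; only the pattern position `k` is reset through the table, which is what yields the O(m+n) bound.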

The Construction and Implementation of the Computational Model
The computational model constructed in this research can be seen in Figure 2. It contains the following main phases.

Determining Model of Zone and Shot Type in Badminton
Before computing and making recommendations, we need a model for representing a badminton match as a dataset. In this research, we create two models: one representing the sequence of zones where shuttlecock shots are placed during the game, and one expressing the types of shots. First, Figure 3 illustrates the zone model. Each side of the court is divided into 9 zones: "A", "B", "C", …, "I". These zones have been validated by expert users. It should be noted that the model applies only to singles badminton. Using these zones, we can represent a game as a sequence of strings. Then, we define several types of shots: drop shot, lob shot, smash, and netting, represented by "1", "2", "3", and "4", respectively.

Collecting and Preprocessing Data
After defining the zones and shot types, we can collect data. There are two types of data: training data and pattern/input data. The first are recorded from world badminton competitions. For example, the game of Lee C. W. vs Lin Dan in the All England Open 2012 was obtained from https://www.youtube.com/watch?v=TRNKfBmCa8M&t=499s. For each point obtained by either player, we record the data according to the model above. For example, we write the following sequence: "(A,4), (I,2), (E,1), (D,2), (C,1), (H,2)". This sequence means that the first player places the shuttlecock in zone "A" on the opponent's side with shot type "4", then the second player replies to zone "I" with shot "2". We continue in this way until one of the players obtains the point. Thus, one series may yield many rows, where each row represents one point. Moreover, to simplify computation, we save the data in a simple format, e.g., A 4 I 2 E 1 D 2 C 1 H 2 for the above sequence.
The second type of data consists of short pattern sequences entered by users (e.g., coaches and players). These patterns serve as the input from which a recommendation containing the next shots and their shot types is required. For example, if the input pattern is "(A,4), (I,2), (E,1)" and 2 next shots are requested as a recommendation, the solution could be "(D,2), (C,1)".
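The conversion from a recorded rally to the flat row format can be sketched as follows. This is a Python illustration only, and the names `ZONES`, `SHOTS`, `encode_rally`, and `decode_rally` are hypothetical; the zone letters and shot codes are those defined in the model above.

```python
ZONES = set("ABCDEFGHI")  # the nine court zones of the zone model
SHOTS = {"1": "drop shot", "2": "lob shot", "3": "smash", "4": "netting"}


def encode_rally(shots):
    """Flatten [(zone, shot_type), ...] into the 'A 4 I 2 ...' row format."""
    tokens = []
    for zone, shot in shots:
        shot = str(shot)
        if zone not in ZONES or shot not in SHOTS:
            raise ValueError(f"invalid shot: ({zone}, {shot})")
        tokens.extend([zone, shot])
    return " ".join(tokens)


def decode_rally(row):
    """Inverse of encode_rally: 'A 4 I 2' -> [('A', '4'), ('I', '2')]."""
    tokens = row.split()
    if len(tokens) % 2 != 0:
        raise ValueError("row must contain zone/shot pairs")
    return list(zip(tokens[0::2], tokens[1::2]))
```

Because zone letters and shot digits come from disjoint alphabets, a pattern that starts with a zone can only match a row at a zone position, so plain string matching on this format stays aligned to whole shots.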

String Matching with the Knuth-Morris-Pratt Algorithm
In the second phase, the dataset obtained can contain hundreds of entries for each match. We therefore need an algorithm to perform string matching between the training data saved in the database and the pattern entered by the user. After obtaining the strings that match the pattern, we take their subsequent shots as candidate recommendations. For this purpose, we use the Knuth-Morris-Pratt algorithm described in the previous section.
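The step of matching a pattern against the stored rows and reading off the shots that follow can be sketched as below. This Python illustration makes several assumptions: rows and patterns use the space-separated format from the data collection step, the pattern consists of whole (zone, shot) pairs, and Python's built-in `str.find` stands in for the Knuth-Morris-Pratt matcher to keep the example short. The function name `next_shots` is hypothetical.

```python
def next_shots(rows, pattern, n_next):
    """Collect the n_next (zone, shot) pairs that follow each occurrence
    of pattern in the recorded rows."""
    candidates = []
    for row in rows:
        # str.find is a stand-in here; any exact matcher (e.g. KMP) fits.
        start = row.find(pattern)
        while start != -1:
            tail = row[start + len(pattern):].split()
            pairs = list(zip(tail[0::2], tail[1::2]))[:n_next]
            if len(pairs) == n_next:  # skip rallies that end too early
                candidates.append(pairs)
            start = row.find(pattern, start + 1)
    return candidates
```

Discarding occurrences with fewer than `n_next` following shots is a design choice of this sketch; the source does not state how short continuations are handled.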

Making Recommendations on Next Shots
The strings matched by the algorithm in the previous phase may comprise more than a single sequence. If the number of solutions obtained by the algorithm is less than or equal to the number of next shots defined by the user, the solutions are simply produced and printed. Otherwise, we execute the clustering method to choose the proper solutions. The clustering method is used to obtain the main, most representative string sequences, which are given by the cluster centers. In this research, we select the K-means method. For simplicity, we execute a function included in the kamila package [12]. It should be noted that the clustering method used is one for discrete datasets, since sequences of zones and shot types are expressed as strings/discrete values.
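The branching logic of this phase can be sketched as follows. This Python illustration does not reproduce the clustering performed with the kamila package: a simple frequency count is substituted for the cluster centers, so it should be read only as an outline of the decision step. The names `recommend` and `max_solutions` are hypothetical, with `max_solutions` playing the role of the user-defined threshold.

```python
from collections import Counter


def recommend(candidates, max_solutions):
    """Return the candidates unchanged when there are few enough of them;
    otherwise keep the most frequent ones (a frequency-based stand-in
    for choosing cluster centers)."""
    if len(candidates) <= max_solutions:
        return candidates
    counts = Counter(tuple(c) for c in candidates)
    return [list(c) for c, _ in counts.most_common(max_solutions)]
```

Each candidate is a list of (zone, shot) pairs, such as those collected from the matched strings; converting it to a tuple makes it hashable so that identical continuations can be counted.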

Experimental Study
To test and validate the computational model and its implementation, we perform several experiments. The following sections explain the data and scenarios used in the experiments.

Data Collection
As mentioned previously, the datasets in the database are obtained from world badminton games, as described in Table 1. Because of limited space, we show only 7 of the 20 games considered in the experiments. We manually recorded these games as sequences according to the proposed model. Finally, we obtained more than 500 rows containing sequences of the games, which can be saved in a .csv file.

Scenarios
In the experiments, we perform two scenarios: fitting and testing. In the fitting experiments, we test the implementation by inputting the patterns that are chosen from the data

Figure 1. Research design of the system for determining the playing strategies in badminton

Figure 2. The computational model for determining strategies on the placement of shuttlecock shots

Figure 3.

Table 1. Datasets used in the experiments