Pool Table Analyzer SDP09 Team Mettu
Dave Fraska, Doug Frazer, Mitchell Kendall, and Timothy Langlois

Abstract—The Pool Table Analyzer is designed to enhance the experience of a game of billiards. It will analyze the positions of the balls and suggest the best shot to each player. This will help amateur players think about advanced concepts, such as cue ball placement, that they do not normally consider. The system will have an intuitive interface which anyone will be able to understand. Four cameras will be mounted above a pool table and connected to a computer. The computer will analyze the images from the cameras to suggest the best shot. The computer will also use the images from the cameras to track the location of the cue stick as a player makes his or her shot.

I. INTRODUCTION

The Pool Table Analyzer is a system designed to watch a pool table as a game is played. The analyzer will suggest the best possible shot to each player on their turn. This will enhance the game experience for players.

A. Problem Statement

A system is to be designed that will track the placement of pool balls in a game of pool. The system will allow users to see, on a screen and in real time, which balls are remaining, where they are on the table, whose turn it is, and the score. There will be a user interface which will show the best shot for the active player and the consequences of that shot. Such a system could attract more players to a hall, since they would not have to worry about remembering the rules, whose turn it is, and other tedious tasks during game play. Similar systems are in place in bowling alleys to keep track of scores. Such a system will lower the learning curve for new players, allowing them to concentrate more on making shots and less on the mechanics of game play.

B. Context of Project

Our project is designed to be used wherever pool tables are used, including pool halls, bars, and homes.
The installation will be fairly simple, mostly consisting of mounting the webcams on the lighting fixture. The installation will be simple enough that home users can install it themselves. Anyone who plays pool will be able to use the system.

C. Requirements Specification

• The analyzer shall suggest the best shot to each player
• The pool cue module shall communicate with the computer up to 20 ft away
• The pool cue module battery shall last for at least 10 hours
• The analyzer shall be able to place the cue to make straight shots with 5 cm accuracy
• The system shall make a shot suggestion within 5 seconds of when all the balls stop moving

II. DESIGN

A. System Overview

Our system will consist of four webcams mounted on the lighting fixture above a pool table. The webcams will be connected to a computer using USB cables. The computer will analyze the images from the webcams to locate the balls and the cue stick. The system will determine the positions of the balls and then decide the best shot the player can make, based on difficulty and usefulness of the shot.

B. Block Diagram

Our block diagram is shown in Figure 1 on page 2.

C. System Specification

1) Cameras: Our project will use four Logitech Quickcam Pro webcams. Each webcam will cover one quadrant of the billiard table. With a resolution of 1600x1200 each, the cameras will give an accuracy of about 1 pixel per mm. The cameras will constantly send images to the image recognition software, which will process the images to determine the locations of the balls. The cameras will be mounted in the lighting fixture above the table.

2) Image Recognition Software: The image recognition software is responsible for analyzing the images from the webcams. There are three objectives this software must accomplish. The first is motion detection. It must detect when balls are moving and when they stop: this signals when players are changing turns, which is when the system must calculate the best shot.
The second objective is determining the positions of the balls on the table. This will be accomplished by using the Hough transform. We are using an open source implementation of the transform, which we tuned to detect the balls more accurately [1]. The third objective is to determine the location of the cue stick when the player is taking their shot. This will also be done by using the Hough transform provided by OpenCV [1]. The result will be used to send signals to the cue stick module to tell the player if their shooting angle needs to be adjusted. We have designed a flow chart of how our software will operate. Please refer to Figure 2 on page 3.

Fig. 1. Block Diagram

3) Best Shot Suggestion Software: The purpose of the best shot suggestion software is to suggest the easiest shot that leaves the table in a useful state. This piece of the software is based on the algorithm developed by Michael Smith in his Master's thesis [2]. The best shot suggestion software will have two main stages.

The first stage is move generation, where possible shots are found. Each class of shots will be searched separately, in order of difficulty, so all straight shots will be found first. To generate straight shots, a shot will be considered for each ball and each pocket. If the ball is between the cue ball and the pocket, and there are no other balls in the way, the shot is possible. Since the difference in difficulty between classes is so large, if any straight shots are found, other classes will usually not be searched unless the available straight shots are fairly difficult.

The second stage is best shot suggestion. All the possible shots will be evaluated based on difficulty and usefulness, i.e., how much the shot helps the player. Usefulness is mainly based on the cue ball position after the shot. Difficulty of a shot depends mainly on four variables: the cut angle, the object ball-pocket angle, the cue-object ball distance, and the object ball-pocket distance.
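The straight-shot generation step described above can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual code: the function names, the 28.5 mm ball radius, the "ghost ball" aiming point, and the rule that a path is blocked when another ball's center lies within two radii of it are all assumptions introduced here.

```python
import math

BALL_RADIUS = 28.5  # mm; assumed value (half of a roughly 57 mm ball)


def dist_point_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def path_is_clear(start, end, obstacles, radius=BALL_RADIUS):
    """A rolling ball is blocked if another ball's center is within 2*radius of its path."""
    return all(dist_point_to_segment(o, start, end) >= 2 * radius for o in obstacles)


def straight_shots(cue, balls, pockets, radius=BALL_RADIUS):
    """Return (ball_index, pocket) pairs for unobstructed straight shots."""
    shots = []
    for i, ball in enumerate(balls):
        others = [b for j, b in enumerate(balls) if j != i]
        for pocket in pockets:
            d = math.hypot(pocket[0] - ball[0], pocket[1] - ball[1])
            if d == 0:
                continue
            ux, uy = (pocket[0] - ball[0]) / d, (pocket[1] - ball[1]) / d
            # Ghost ball: where the cue ball center must be at the moment of impact.
            ghost = (ball[0] - 2 * radius * ux, ball[1] - 2 * radius * uy)
            # The object ball must lie between the cue ball and the pocket
            # (equivalently, the cut angle must be below 90 degrees).
            if (ghost[0] - cue[0]) * ux + (ghost[1] - cue[1]) * uy <= 0:
                continue
            if path_is_clear(cue, ghost, others, radius) and path_is_clear(ball, pocket, others, radius):
                shots.append((i, pocket))
    return shots
```

For example, `straight_shots((0, 0), [(500, 0)], [(1000, 0)])` reports one shot, while placing a second ball at `(250, 0)` blocks both candidate shots at that pocket.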
Difficulty values will be precomputed by setting up shots with certain values for these four variables and then simulating the shots after adding noise.

To suggest the best shot, a Monte-Carlo search algorithm will be used. For each possible shot, the software will determine the parameters needed to make the shot. Then zero-mean Gaussian noise will be added to the parameters to simulate the player. Each noisy sample produces one possible outcome state, indexed by $i$. The move generation algorithm will be run on each outcome state, the possible shots will be sorted by difficulty, and the three easiest will be chosen. The usefulness score for outcome state $i$ will be

$$u_i = d_1 p_1 + d_2 p_2 + d_3 p_3 \tag{1}$$

where $p_1, p_2, p_3$ are the probabilities of making the three easiest shots from that state and $d_1, d_2, d_3$ are weighting factors. Most of the time the easiest shot will be the best shot to take, but it also helps to have other good shots available on the table. We are starting with $d_1 = 1$, $d_2 = 0.33$, $d_3 = 0.15$, but we may tweak these. The average of the $u_i$, denoted $u_{avg}$, is the usefulness score of the shot. The score of the shot is

$$s = u_{avg} \, p \tag{2}$$

where $p$ is the difficulty (probability) of correctly making the shot. The shot with the highest score is the best shot.

This software also uses a physics library designed specifically for billiards [3]. The poolfiz library is a free, closed-source library for simulating billiards. This library also uses OpenGL to draw the table and balls as well as to animate shots.

Fig. 2. Flow Diagram

4) Pool Cue Positioning Software: After the best shot is found, the pool cue positioning software will calculate the angle at which the cue should be held to execute the shot. As the player positions the cue stick, the system will monitor its position and pass the data to the pool cue positioning software by way of the cameras and the image recognition software. The cue stick is located using the Hough transform provided by OpenCV: of the lines detected, the one closest to the cue ball is chosen as the cue stick.
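Choosing the detected line nearest the cue ball can be sketched as below. The sketch assumes each candidate line is given in the normal form $x\cos\Theta + y\sin\Theta = r$ as an `(r, theta)` pair; the function names and the sample values are hypothetical and only illustrate the selection rule.

```python
import math


def distance_to_line(point, r, theta):
    """Perpendicular distance from point to the line x*cos(theta) + y*sin(theta) = r."""
    x, y = point
    return abs(x * math.cos(theta) + y * math.sin(theta) - r)


def pick_cue_line(cue_ball, lines):
    """Return the (r, theta) line that passes closest to the cue ball center."""
    return min(lines, key=lambda line: distance_to_line(cue_ball, line[0], line[1]))


# Hypothetical candidates: the vertical line x = 300, the horizontal line y = 60,
# and the vertical line x = 0, with the cue ball at (100, 50).
candidates = [(300.0, 0.0), (60.0, math.pi / 2), (0.0, 0.0)]
best = pick_cue_line((100.0, 50.0), candidates)
```

Here `best` is the horizontal line `y = 60`, which passes 10 units from the cue ball, closer than either vertical line.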
The pool cue positioning software will calculate whether the pool cue needs to rotate left or right and send this data to the cue over Bluetooth.

5) Graphical User Interface: In order to combine all of our different libraries we have designed a graphical user interface. This interface uses the poolfiz library to draw the pool table and animate the pool balls being hit. The library draws with OpenGL using the GLUT toolkit. We draw the pool table into one frame and then draw information specific to our application in the surrounding frames, for example the current angle of the pool cue or the velocity at which the player should hit the ball. There is also a series of buttons that allow the user to interact with the interface using the mouse.

We pass the positions of all the pool balls that we have found to the poolfiz library, which places them accordingly, and then allow the user either to design their own shot and animate it to see what will happen, or to have our AI suggest a shot. After the animation, the pool table in the user interface will revert to the state of the physical table. During this mode of user interaction the software is also constantly polling the position of the cue ball to see if it has moved, which would imply a shot has been made. If this occurs, the user interface will change to an idle state while it waits for the pool balls on the table to stop moving. Both the motion of the cue ball and the motion of all other balls on the table are detected by simple RGB comparisons of low resolution images of the table. Once the user interface has decided that the table is in a stable state, it will return to the original user mode with an updated view representing the current state of the table.
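The RGB comparison used to decide whether the table is stable might look like the following minimal sketch. The frame representation (rows of `(r, g, b)` tuples), the mean absolute difference measure, and the threshold value are assumptions made for illustration; the report does not state which comparison or threshold is actually used.

```python
def frame_difference(frame_a, frame_b):
    """Mean absolute per-channel difference between two equally sized RGB frames.

    Each frame is a list of rows, and each row is a list of (r, g, b) tuples.
    """
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for (r1, g1, b1), (r2, g2, b2) in zip(row_a, row_b):
            total += abs(r1 - r2) + abs(g1 - g2) + abs(b1 - b2)
            count += 3
    return total / count if count else 0.0


def table_is_stable(frames, threshold=2.0):
    """True when every pair of consecutive frames differs by less than the threshold."""
    return all(frame_difference(a, b) < threshold for a, b in zip(frames, frames[1:]))


# Tiny synthetic example: a 3x4 green "table", then one pixel turns white.
still = [[(10, 120, 10)] * 4 for _ in range(3)]
moved = [row[:] for row in still]
moved[1][2] = (255, 255, 255)
```

With these frames, `table_is_stable([still, still, still])` holds, while `table_is_stable([still, moved])` does not. Working on low resolution images keeps the comparison cheap enough to run while polling.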
The user interface will also compare the current state to the previous state to check whether any balls were pocketed. If so, it will notify the user whose turn it is, based on which balls were sunk.

6) Bluetooth Transmitter: The transmitter is a generic Bluetooth USB dongle using the Serial Port Profile (SPP). It will receive signals from the pool cue positioning software and relay them to the Bluetooth receiver on the cue stick.

7) Bluetooth Receiver: The Bluetooth receiver is the RN41 from Roving Networks. It will receive the control signals from the Bluetooth transmitter and send these signals to the microcontroller.

8) Microcontroller: The microcontroller, an ATmega168, will also be located on the cue stick. It will decode the signals from the Bluetooth receiver and determine which LED should be lit.

9) LEDs: The three LEDs will be mounted on the front of the cue stick. They will show the player which way he or she should rotate the cue stick to have the best chance of correctly executing the shot.

D. Design Alternatives

We originally wanted to use only one camera. This would have simplified the fixture, and also the image detection, because we would only have one image to process. However, after completing the error analysis, we realized that one web camera would not give us enough accuracy, so we decided to use four. We also planned on using the GTK+ library to build the GUI. However, we found the GTK+ library to be poorly documented and difficult to use. We found another library, Qt, that did the same thing but was better documented and easier to use, so we used that instead. Originally we had wanted to find the balls using a resolution of 1600x1200, but this was too slow, so we ended up using 960x720.

III. IMPLEMENTATION

A. Hough Transform

The Hough transform is a mathematical technique used in image processing to isolate or locate certain shapes in an image.
This transform is classically used to find geometric shapes such as squares, lines, circles, and ellipses; however, there is a generalized Hough transform which can be specified to search for arbitrary shapes. In either case, the transform works by finding a set of points of interest, which depends on the implementation, and searching the corresponding parameter space for possible solutions.

1) Hough Transform for Lines: When the Hough transform is used to find lines, it needs a set of points of interest $(x_i, y_i)$, and for each point it considers every line that could pass through that point, given some limiting parameters. To find the $(x_i, y_i)$ points, a Canny filter is generally applied to the image first to highlight possible edges. Working from the Canny representation of the image, we then look for sufficiently many points which are collinear and may represent a line. We search in the polar coordinate system, where a line is defined as

$$x \cos(\Theta) + y \sin(\Theta) = r.$$

The transform enumerates all lines that can be expressed in this form, given certain discretizing parameters such as the minimum length and the minimum step for $\Theta$ and $r$. For a fixed point $(x_i, y_i)$, the values of $r$ and $\Theta$ that satisfy this equation trace out a sinusoid. By plotting one sinusoid per point in $(r, \Theta)$ space, it becomes possible to find points which are collinear in Cartesian space by finding intersections of the sinusoids in $(r, \Theta)$ space.

2) Hough Transform for Circles: Much like the Hough transform for lines, the Hough transform for circles attempts to vary the possible parameters that would represent a circle. These parameters are the radius of the circle and its center in two dimensions.
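The voting scheme for lines, using the parameterization $x\cos(\Theta) + y\sin(\Theta) = r$ given in 1), can be sketched in a few lines. This is a bare illustration and not the OpenCV implementation the project relies on; the function name, the default step sizes, and the vote threshold are assumptions, and the edge points are assumed to come from an edge detector such as a Canny filter.

```python
import math
from collections import Counter


def hough_lines(points, theta_steps=180, r_step=1.0, min_votes=3):
    """Vote in a discretized (r, theta) space and return well-supported lines.

    points: iterable of (x, y) edge points.
    Returns (r, theta, votes) tuples, strongest first.
    """
    votes = Counter()
    for x, y in points:
        # Each point votes along its own sinusoid r(theta).
        for k in range(theta_steps):
            theta = k * math.pi / theta_steps
            r = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(r / r_step), k)] += 1
    # Bins where many sinusoids intersect correspond to collinear points.
    return [
        (r_bin * r_step, k * math.pi / theta_steps, n)
        for (r_bin, k), n in votes.most_common()
        if n >= min_votes
    ]


# Ten collinear points on the horizontal line y = 5.
lines = hough_lines([(x, 5) for x in range(0, 50, 5)], min_votes=8)
```

In this example the strongest entry in `lines` has all ten votes, with `r` equal to 5 and `theta` close to π/2, which is the normal form of the line `y = 5`.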
This gives three dimensions to search over instead of the two required for lines, so the algorithm takes roughly an order of magnitude longer to run. As with the line finder, we apply Canny filtering to the image before we search for circles. By drawing candidate circles around the points of interest, we can identify actual circles by finding points where a sufficient number of these candidate circles intersect in our search space. Such a point of intersection represents the center of a circle of the radius being searched for at that time. The Hough transform for circles can search for circles of arbitrary size; however, for our application we have limited the parameters to search only for circles with the radius of a standard pool ball, within a small offset $\delta$.

B. Color Histograms

In order to determine the number of each pool ball we attempt to identify the color of the ball. Assuming we have successfully found the locations of the pool balls, we can examine the circle whose center was identified through the image detection and whose radius is known from the image resolution. We evaluate each pixel within this circle and create a histogram of the colors. The histogram is created by grouping pixels into discrete groups, which is inherent in the fact that each 8-bit R, G, or B value takes one of 256 values. To increase accuracy we have experimented with the size of the buckets and have settled on grouping each R, G, or B value into 16 discrete groups, which amounts to dropping the 4 least significant bits of each value. After the histogram of the ball is created, it is compared to histograms that were previously created under controlled circumstances where the ball numbers are known, and the number of the reference histogram that best matches the measured histogram is returned as the ball number. We repeat this process for each ball.
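The bucketing and matching steps can be sketched as follows. The report does not say how two histograms are compared, so the histogram intersection measure used here is an assumption, as are the function names and the sample colors.

```python
from collections import Counter


def color_histogram(pixels):
    """Normalized histogram over 16x16x16 buckets.

    Each 8-bit channel is reduced to 16 groups by dropping its 4 least significant bits.
    """
    counts = Counter((r >> 4, g >> 4, b >> 4) for r, g, b in pixels)
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms (assumed metric)."""
    return sum(min(v, h2.get(bucket, 0.0)) for bucket, v in h1.items())


def identify_ball(pixels, references):
    """Return the ball number whose reference histogram best matches the pixels."""
    measured = color_histogram(pixels)
    return max(references, key=lambda number: histogram_intersection(measured, references[number]))


# Hypothetical reference data: ball 1 is yellow, ball 3 is red.
references = {
    1: color_histogram([(250, 220, 30)] * 50),
    3: color_histogram([(200, 20, 20)] * 50),
}
# A measured ball: mostly yellow pixels plus a white glare spot.
sample = [(245, 210, 25)] * 40 + [(255, 255, 255)] * 10
```

With these inputs `identify_ball(sample, references)` selects ball 1, because the slightly different yellow falls into the same bucket as the reference yellow once the low bits are dropped.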
This algorithm runs in time proportional to the number of pixels that represent a ball.

C. Error Calculation

When calculating the error, we are only concerned with the error perpendicular to the shot direction. The error comes from the accuracy of the cameras. With one pixel per mm, we can only expect to be accurate to one mm. The image detection will add further error. In Figure 3, this error is e. From our observations so far, the maximum e is 4 pixels. Since the angle of the suggested shot will not change, the player will shoot the cue ball (white) at the same angle that the system suggests. This means the cue ball will strike the object ball (red) at an offset from where the software expected, causing a deviation from the expected path.

Fig. 3. Blue arrow is the intended shot. Red arrow is the actual shot. a ≈ 57mm, b ≈ 142mm, c ≈ 117mm, l ≈ 1900mm, w ≈ 952mm

In Figure 4, we show how this offset will affect the direction of the object ball. If the cue ball strikes the object ball at an offset of e, we can draw a similar triangle on the other side of the object ball and extend it. For example, if the object ball is struck at an offset of e and travels about half the length of the table (≈ 950mm), we extend the triangle out to that distance. Since 950 ≈ 33 ∗ r, where r is the radius of the ball, the object ball will be 33 ∗ e away from where it was supposed to be at the end of the shot. With e = 1mm, the object ball will have an error of 33mm. With e = 4mm, the shot will have an error of 4 ∗ 33 = 132mm.

The bulk of our error will be in detecting the center of the balls. We will assume for now that the error is uniformly distributed in a circle of radius $e_{max}$ centered at the true center of the ball. Then the probability density function will be

$$f_{x,y}(x, y) = \begin{cases} \dfrac{1}{\pi e_{max}^2} & x^2 + y^2 < e_{max}^2 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

Fig. 5. PDF of a single ball

However, we only care about the error perpendicular to the direction of the shot.
This is the error which will cause an offset when the balls collide, causing error in the path of the object ball. To find the pdf of the error in one direction, we use the equation

$$f_x(x) = \int_{-\infty}^{\infty} f_{x,y}(x, y) \, dy \tag{4}$$

Using equation 4 with our probability density function we obtain

$$f_x(x) = \int_{-\sqrt{e_{max}^2 - x^2}}^{\sqrt{e_{max}^2 - x^2}} \frac{1}{\pi e_{max}^2} \, dy = \frac{2}{\pi e_{max}^2} \sqrt{e_{max}^2 - x^2} \tag{5}$$

Of course $f_x(x)$ is only valid where $|x| < e_{max}$, because $e_{max}$ is the maximum amount of error; elsewhere it is zero.

Fig. 4. A ball when it is hit

Since both the cue ball and the object ball will have an error in the detected location, the total error will be the difference of the two errors:

$$e_{total} = e_1 - e_2 \tag{6}$$

The pdf of the total error will be the convolution of the two separate pdfs (each pdf is symmetric about zero, so the difference has the same pdf as the sum). Since both separate errors have the pdf in Eq. 5, the pdf of the total error is

$$f_e(e) = (f_x * f_x)(e) = \frac{4}{\pi^2 e_{max}^4} \int_{-\infty}^{\infty} \sqrt{e_{max}^2 - \tau^2} \, \sqrt{e_{max}^2 - (e - \tau)^2} \, d\tau \tag{7}$$

where each square root is taken to be zero wherever its argument is negative.

Fig. 6. PDF of the total error

A numerical approximation of this integral is shown in Figure 6. Through approximation in Matlab, with $e_{max} = 4$, we have calculated the following statistics:

Event                  Probability
|e_total| < 2          .497
|e_total| < 3          .687
|e_total| < 4          .830

Now let d = distance between object ball and pocket, r = radius of ball, and s = d/r. Then according to Figure 4, the error at the end of the shot will be $e_{final} = s \, e_{total}$. Using the pdf for $e_{total}$,

Event                              Probability
|e_final| < 2s = (2/r) d           .497
|e_final| < 3s = (3/r) d           .687
|e_final| < 4s = (4/r) d           .830

So we will be able to correctly make shots from a distance of 712mm with a probability of .497, from a distance of 475mm with a probability of .687, and from a distance of 356mm with a probability of .830. These distances are consistent with r ≈ 28.5mm and an allowable final error of about 50mm, the 5 cm accuracy in our requirements. Also note that we believe this to be a pessimistic estimate. Through fine tuning of the image recognition software, we should be able to decrease $e_{max}$ for a single ball from 4 pixels to hopefully 1 pixel.
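The probabilities tabulated above for the total error can be cross-checked by direct simulation instead of numerical integration. The sketch below assumes, as in the derivation, that each ball's detection error is uniform in a disc of radius e_max and that only the x component (the direction perpendicular to the shot) matters; the function names, trial count, and seed are arbitrary choices made for illustration.

```python
import random


def sample_disc(e_max, rng):
    """Draw a point uniformly from a disc of radius e_max by rejection sampling."""
    while True:
        x = rng.uniform(-e_max, e_max)
        y = rng.uniform(-e_max, e_max)
        if x * x + y * y < e_max * e_max:
            return x, y


def total_error_probabilities(e_max=4.0, bounds=(2.0, 3.0, 4.0), trials=200_000, seed=0):
    """Estimate P(|e_total| < bound) for each bound, where e_total = e1 - e2."""
    rng = random.Random(seed)
    hits = [0] * len(bounds)
    for _ in range(trials):
        e1, _unused1 = sample_disc(e_max, rng)  # cue ball error, x component
        e2, _unused2 = sample_disc(e_max, rng)  # object ball error, x component
        e_total = abs(e1 - e2)
        for k, bound in enumerate(bounds):
            if e_total < bound:
                hits[k] += 1
    return [h / trials for h in hits]
```

Calling `total_error_probabilities()` with the default `e_max = 4` should give estimates close to the tabulated .497, .687, and .830, up to sampling noise. Passing a smaller `e_max` with the same bounds gives the corresponding estimates for a more accurate detector.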
Repeating the previous calculations for $e_{max}$ = 1, 2, 3, we obtain the following statistics for $p_{shot}$, the probability of correctly executing the shot:

           p_shot       p_shot       p_shot       p_shot
d (mm)     e_max = 4    e_max = 3    e_max = 2    e_max = 1
356        .830         .946         .999         .999
475        .687         .829         .978         .999
712        .497         .629         .830         .999
1425       .263         .345         .497         .832
                                                          (8)

D. Requirements Specification Status

The following list shows how well we have met our requirements specification.

• MET - The system suggests shots to the player.
• MET - The pool cue can communicate with the computer up to 20 ft away.
• MET - The pool cue battery lasts for at least 10 hours.
• NOT MET - Have a 5 cm accuracy.
• ALMOST MET - Make a suggestion within 5 seconds of when the balls stop moving. It currently takes between 6 and 7 seconds.

IV. PROJECT MANAGEMENT

A. Roles of Team Members

We divided the necessary work as evenly as possible. Dave is responsible for the Bluetooth module on the cue stick. Dave will also help with the lighting fixture, and when that is finished he will work on the GUI. Doug is mainly responsible for the GUI, although he is also helping with the lighting fixture and the Bluetooth module. Tim is responsible for the physics engine and the best shot algorithm. He is also assisting with the image recognition software. Mitch is responsible for the image recognition software and the webpage. He is also helping with the physics, best shot algorithm, and lighting fixture.

B. Gantt Chart

Please refer to Figure 7 on page 8. This chart shows all the tasks we have identified for this project, when we expect to start each task, and how long we expect each task to take to complete.

V. SUMMARY AND CONCLUSION

We are pleased with the outcome of our project. Although it is not perfect, we feel that we have learned and accomplished a lot.

REFERENCES

[1] "Opencv: Open computer vision library," http://opencv.willowgarage.com/wiki.
[2] M. Smith, "Pickpocket: An artificial intelligence for computer billiards," Master's thesis, University of Alberta, 2006.
[3] W. Leckie and M. Greenspan, "An event-based pool physics simulator," Eleventh Advances in Computer Games: Lecture Notes in Computer Science, no. 4250, pp. 247–262, Sep. 2006, http://rcvlab.ece.queensu.ca/ greensm/downloads.html.

Fig. 7. Gantt Chart