WG4 PatMedia Description
Transcription
WG4 PatMedia Description
PatMedia - Patent Image Retrieval Stefanos Vrochidis Centre for Research and Technology Hellas Informatics and Telematics Institute MUMIA WG4: Semantic Search, Faceted Search and Visualization Vienna, 9 June 2011 CERTH-ITI Multimedia Group • Personnel • 25 people (researchers, (researchers developers developers, administration) • Participation in European and national research projects: • • FP7: WeKnowIt (coordination), JUMAS, CHORUS+, PESCaDO FP6: AceMedia, AceMedia X-Media, X Media MESH, MESH BOEMIE, BOEMIE VIDI-Video, VIDI Video K-Space, PATExpert, ELU, etc. • Contracts with Industryy • 30 Journal publications, 100+ conference publications, 18 book chapters, 3 patents • Numerous events: ACM CIVR09, WWW09 tutorial, WIAMIS 2007, etc. Centre for Research and Technology Hellas Informatics and Telematics Institute 2 Expertise Relevant to WP4 Main research interests • Semantics • • Image & Video analysis • • feature extraction, content classification, concept extraction, event detection Image & Video Retrieval • • ontology design, reasoning Search engines, multimodal retrieval P t t IImage R Patent Retrieval ti l Centre for Research and Technology Hellas Informatics and Telematics Institute 3 PatMedia • Patent Image Retrieval system • Query by visual example was presented in IRFS 2008 • Evaluated by professional patent searchers in IRFS 2010 p extraction presented p in IRFS 2011 • Concept Centre for Research and Technology Hellas Informatics and Telematics Institute 4 Introduction • • Why patent images are important • Figures, drawings and diagrams are almost always patents, as a means to further specify p y the contained in p objects and ideas to be patented. • Image examination is important to patent experts in their attempt to deeply understand the patent contents and find relevant inventions and prior art Advantages g • • Images are the same in every language Problems • Low quality, handwritten labels • Complex black and white images Centre for Research and Technology Hellas Informatics and Telematics Institute 5 Overview • PatMedia Framework • P Patent page segmentation i into i figures fi • Figure textual descriptions extraction and matching to corresponding images • Visual Feature extraction from patent images • PatMedia Retrieval Engine • Image Retrieval by visual similarity • Textual search on figure captions • Concept based Retrieval based on visual and textual features Centre for Research and Technology Hellas Informatics and Telematics Institute 6 PatMedia • Page g Segmentation g to single g figures g Frequently, a patent drawing page contains more than one figure. An image segmentation step that splits a multi - figure page to single images has to be applied. • • • • Each separate figure g in patents is accompanied d by b a label l b l off the h form f “Figure x”. OCR for Figure Label Detection. This task is particularly challenging in certain cases: • • • a single figure consists of spatially disjoint elements or when multiple figures are adjacent. Figure Labels are hand written. written Centre for Research and Technology Hellas Informatics and Telematics Institute 7 Visual Features Binary CBIR • A special case of Content Based Image Retrieval Patent images are mostly binary (black and white) images • • • th d they do nott contain t i any color l or texture t t iinformation f ti Use shape-based feature vector extraction methods • • describe the image geometric information accurately Patent databases contain a vast amount of patents (and so patent drawings) • • Feature vector extraction and retrieval techniques need to be computational inexpensive Extraction of Adaptive Hierarchical Density Histograms (AHDH) as visual feature vectors [1] • • • Global visual features based on the pixel distribution of a d drawing i Low dimension feature vector (~100 features) [1] P. Sidiropoulos, S. Vrochidis, I. Kompatsiaris, "Content-Based Binary Image Retrieval using the Adaptive Hierarchical Density Histogram Histogram",, Pattern Recognition Journal, Elsevier, Volume 44, Issue 4, pp 739739 750, April 2011. Centre for Research and Technology Hellas Informatics and Telematics Institute 8 Textual Features Textual information extraction • Example: Patent US 20020152637 A1 FIG. 7 shows the reversible tongue containing a pocket in its upper half, and which may be secured by Velcro, or the like, into closure Bag of words technique • Indexing g textual information using g Lemur [2] [ ] Stemming using Porter stemmer Creation of Lexicon using training data Feature vector includes lexicon term weights • • • • • • <boot snowboard illustr tongu footwear heel...> [0 0 0 0.0909091 0 0...] [2] http://www.lemurproject.org/ http://www lemurproject org/ Centre for Research and Technology Hellas Informatics and Telematics Institute 9 Query by Visual Example • • • • The images are ranked based on L1 of the AHDH f t features. The system was tested with more than 300.000 patent images from around 10.000 patents of the optical recording domain Query time was around 5-7seconds. An R-tree structure was used d to speed d up the h retrieval process Centre for Research and Technology Hellas Informatics and Telematics Institute 10 Concept Extraction Framework • Supervised p Machine learning g based framework based on Support Vector Machines • Trained with textual and visual low level features • Requires Manually annotated Training Data Patent Figures Patents Page to figure P fi segmentation Concept annotation Feature Vectors Textuall and T d Visual Features Extraction Trained Model Training Data Training T i i andd testing data separation Training T i i andd testing of Classifier Test Data Centre for Research and Technology Hellas Informatics and Telematics Institute 11 Dataset and Concepts Patent Images • • • A43B IPC sub-class Contain “Parts of Footwear” Concepts* • • • • • • • Cleat Ski boot High heel Lacing Closure Spring Heel Tongue * The selection of concepts was done with the help of Dominic DeMarco Centre for Research and Technology Hellas Informatics and Telematics Institute 12 Concepts Cleat • • • Description: A short piece of rubber, metal etc attached to the bottom of a sports shoe used mainly for preventing someone from slipping IPC subclass: A43B5/18S Ski boot • • • Description: p A specially p y made boot that fastens onto a ski IPC subclass: A43B5/04 Centre for Research and Technology Hellas Informatics and Telematics Institute 13 Concepts High Heel • • • Description: i i Sh Shoes with i h high hi h heels h l IPC subclass: A43B21 Lacing closure • • • Description: A cord that is drawn through eyelets or around hooks in order to draw together the two edges of a shoe IPC subclass: A43B5/04 Centre for Research and Technology Hellas Informatics and Telematics Institute 14 Concepts Spring Heel • • • Description: D i i (Heels H l with i h metall springs i IPC subclass: A43B21/30 Tongue • • • Description: The part of a shoe that lies on top of your foot, under the part where you tie it IPC subclass: A43B23/26 Centre for Research and Technology Hellas Informatics and Telematics Institute 15 Dataset Statistics Concept All figures Train Data Test Data Cleat 148 89 59 Ski boot 123 74 49 High heel 148 89 59 Lacing Closure 117 71 46 Spring Heel 106 64 42 Tongue 124 75 49 Total 766 352 304 Creation of training/testing set • • • Testing/Training ratio = 3/5 Positive/Negative ratio = 1/3 Centre for Research and Technology Hellas Informatics and Telematics Institute 16 Approach We trained one classifier for each concept The following cases are considered • • Visual • • SVMs were trained only with visual features (AHDH) Visual extended • • • Extension of visual case The output of all classifiers forms a vector and is passed to a new classifier to yield the final score Textual • • SVMs were trained only with textual features Visual + Textual • • • SVMs were trained with a feature vector containing visual and textual features 200 ffeatures t (100 visual i l and d 100 textual) t t l) Centre for Research and Technology Hellas Informatics and Telematics Institute 17 Max score output Approach Feature Vector Visual Textual Textual + Visual Cleat Classifier Cleat score Cleat score Classifier Ski boot Classifier Ski boot score Ski boot score Classifier High Heel Classifier High Heel score High Heel score Classifier Lacing Closure Classifier Lacing Closure score Lacing Closure score Classifier Spring Heel Classifier Spring Heel score Spring Heel score Classifier Tongue Classifier Tongue score Tongue score Classifier Centre for Research and Technology Hellas Informatics and Telematics Institute Visual extended Max score output 18 Time i for f the h d demos http://mklab-services.iti.gr/patmedia http://mklab-services.iti.gr/patmediac Centre for Research and Technology Hellas Informatics and Telematics Institute 19 Results • Average g Results for all concepts Features Accuracy Precision Recall F-score Visual 85,58% 70,38% 63,78% 60,7% Textual 90,3% 71,59% 71,94% 71,25% Visual ext. 87,22% 65,6% 73,13% 66,81% Visual + Textual 93,25% 82,24% 78,14% 78,58% Centre for Research and Technology Hellas Informatics and Telematics Institute 20 Conclusions • Image search could provide quickly results especially in multilingual patents without much preprocessing • Combination of visual and textual information performs better for concept extraction. • Classification based solely on visual information is still very satisfactory. • Cl Classification ifi i based b d on visual i l features f could ld fail f il when h two visually i ll similar images are described with different concepts. • Automatic segmentation could be supported, however an error (~20%) might be introduced. • The concept retrieval module could be a part of a larger patent retrieval framework. Centre for Research and Technology Hellas Informatics and Telematics Institute 21 Future Work • Further optimize the retrieval function for visual query by example • Produce results for more concepts, bigger datasets and for different IPC classes. classes • Test performance in the case of automatically segmented images (i.e. of lower quality). • Combine more efficiently text and visual information • give more weight to visual or textual description depending on the concept type • Investigate late fusion techniques. Centre for Research and Technology Hellas Informatics and Telematics Institute 22 Relevant Publications and Presentations • S. Vrochidis, "Patent Image Retrieval based on Concept Extraction and Classification", Information Retrieval Facility Symposium 2011 (IRFS 2011), Vienna, Austria, June 7, 2011. • P. Sidiropoulos, p , S. Vrochidis,, I. Kompatsiaris, p , "Content-Based Binaryy Image Retrieval using the Adaptive Hierarchical Density Histogram", Pattern Recognition Journal, Elsevier, Volume 44, Issue 4, pp 739-750, April 2011. • S. Vrochidis, S. Papadopoulos, A. Moumtzidou, P. Sidiropoulos, E. Pianta, I. Kompatsiaris, "Towards Content-based Patent Image Retrieval; A Framework Perspective", Perspective , World Patent Information Journal, Volume 32, Issue 2, pp 94-106, June 2010. • G. Ypma, "Evaluation of Patent Image Retrieval", Information Retrieval Facility Symposium 2010 (IRFS 2010), 2010) Vienna Vienna, Austria, Austria June 1-4, 1 4 2010. 2010 Centre for Research and Technology Hellas Informatics and Telematics Institute 23 Relevant Publications • J. List, "Review of ITI Approach for Searching Non-Text Information in Patents", Information Retrieval Facility Symposium 2010 (IRFS 2010), Vienna, Austria, June 1-4, 2010. • P. Sidiropoulos, p , S. Vrochidis and I. Kompatsiaris, p , "Content-Based Retrieval of Complex Binary Images", in Proceedings of the 8th International Workshop on Content-Based Multimedia Indexing (CBMI 2010), pp. 182-187, Grenoble, France, 23-25 June 2010. • S. Vrochidis, "Towards Patent Image Retrieval", International Patent Information Conference & Exposition, IPI-ConfEx 2009, Venice, Italy, March 1 1-4, 4, 2009. • S. Vrochidis, "Patent Image Retrieval", Information Retrieval Facility Symposium 2008 (IRFS 2008), Vienna, Austria, November 5-7, 2008. • J. Codina, E. Pianta, S. Vrochidis and S. Papadopoulos, "Integration of Semantic, Metadata and Image search engines with a text search engine for patent retrieval", Semantic Search 2008 Workshop, Tenerife, S i June Spain, J 2, 2 2008. 2008 Centre for Research and Technology Hellas Informatics and Telematics Institute 24 Feel free to test the demo! http://mklab services iti gr/patmediac http://mklab-services.iti.gr/patmediac Thank you! http://mklab.iti.gr Centre for Research and Technology Hellas Informatics and Telematics Institute 25