How I learned to love XML! (FME Certified Professional, Peter
How I learned to ‘love’ XML
Peter Laulund
National Survey and Cadastre

Agenda
• KMS and INSPIRE
• About XML/GML
• Writing XML with FME
  – Templates
  – Schema mapping, semantic
  – Schema mapping, geometry
  – Workflow design

INSPIRE
• INSPIRE is a European initiative to create a common SDI (spatial data infrastructure)
• Specific datasets have to be available in a harmonized way
• KMS is the national contact point
• KMS has five Annex I datasets
• I will talk about XML, not INSPIRE

KMS datasets
• Transport Networks
• Hydrography
• Cadastral Parcels
• Geographical Names
• Administrative Units
• Documentation - http://inspire.jrc.ec.europa.eu
Warning: the PDFs and the related XSDs are a tough read!

Inspire-foss

FME and XML

XML
• XML: eXtensible Markup Language
  – Syntax used to describe data
• GML: Geography Markup Language
  – XML dialect for describing geography
  – GML 2, GML 3.1.1, GMLSF, GML 3.2.1
• XSD: XML Schema Definition
  – XML dialect for describing the content of XML files

XML - example
  <?xml version="1.0" encoding="UTF-8"?>
  <!-- created by Pel, kms, 3 August 2012 -->
  <venner xmlns:p="http://www.kms.dk/xmlschmas" xmlns:d="http://www.kms.dk/xmlschmas">
    <p:person id="345">
      <p:navn>Peter Laulund</p:navn>
      <p:adresse>Sognegårds alle 54</p:adresse>
      <p:født>
        <d:dato>
          <d:dag>3</d:dag>
          <d:måned>maj</d:måned>
          <d:år>1957</d:år>
          <d:klokken/>
        </d:dato>
      </p:født>
      <p:telefon type="fastnet">+45 36499408</p:telefon>
      <p:telefon type="mobil">+45 26273031</p:telefon>
      <p:giftMed href="#445"/>
      <p:arbjedsgiver/>
    </p:person>
  </venner>

FME and XML
• FME reads and writes XML/GML
• Converts geometry to GML
• XMLSampleGenerator
• XMLTemplater
• XMLValidator
• XMLFormatter

XMLTemplater
• An XML template is an XML document with XQuery functions

  <gn:text>{fme:get-attribute("name")}</gn:text>

  <au:geometry>
    {fme:get-xml-attribute("gml_geom")}
  </au:geometry>

  {fme:get-xml-list-attribute("level{}.xml")}

  <gml:featureMembers>
    {fme:process-features("FEATURE")}
  </gml:featureMembers>

XMLTemplater
• The document may be
  – loaded from an attribute
  – loaded from a file
  – entered into the transformer
• We use a file that is loaded into an attribute

Templates
• Use the XMLSampleGenerator to create the template
• Edit the template in a text editor
  – Delete
  – Add XQuery function calls
• Use XMLValidator to evaluate the result
• Use XMLFormatter to make it look pretty

Writing INSPIRE GML

Challenges
• Five datasets, some with more than one feature type
• Data for download from FTP, done with FME
• WFS with Snowflake
• WMS with ?
• All data in one Oracle database

Dataflow in KMS
(Diagram, flattened in this transcript.)
Stages: Oracle → Read → Schema mapping, semantic → Schema mapping, geometry → Transform to XML → Write → *.GML
FME tools: SQL, Tcl, Aggregate, SetTraits, OGCGeometry, XMLTemplater, XMLValidator, XMLFormatter

Schema mapping
Schema mapping is basic FME functionality
• Add or remove attributes
• Change feature types
• Alter domain values
• All our data are in an Oracle database, so we use SQL for schema mapping

Schema mapping
  FeatureType : oldAttribute : newAttribute
  REGION      : REGIONKODE   : nationalCode
  REGION      : DAGI_ID      : localId
  REGION      : FEAT_ID      : localIdGeom
  REGION      : TIMEOF_CRE   : beginLifespanVersion
  REGION      : FEAT_TYPE    : nationalLevelName
  REGION      : REGIONNAVN   : name
  REGION      : DQ_RESPONS   : sourceOfName

  FeatureType : newAttribute  : value
  REGION      : namespace     : dk.kms.au
  REGION      : namespaceGeom : dk.kms.au.geom
  REGION      : country       : DK
  REGION      : gmlTemplate   : AdministrativeUnit
  REGION      : nationalLevel : 2ndOrder

Schema mapping
Source values (Read):
  Europavej, Primærvej, Sekundærvej, Anden vigtig vej, Større lokalvej, Lokalvej, Indkørselsvej, Anden vej, Hovedsti, Cykelsti langs vej, Sti, diverse, Trafikvej-Gennemfart, Trafikvej-Fordeling, Lokalvej-Primær, Lokalvej-Sekundær, Lokalvej-Tertiær, Ikke tildelt
?
Target values (Write):
  mainRoad, firstClass, secondClass, thirdClass, fourthClass, fifthClass, sixthClass, seventhClass, eighthClass, ninthClass

inspireId
All features must have an inspireId. It is a complex type made of
• A namespace: <country>.<organisation>.<dataset>
• Id: the feature's database id
• Version: null, sequence or timestamp

Template:
  <cp:inspireId>
    <base:Identifier>
      <base:localId>
        {fme:get-attribute("localId")}
      </base:localId>
      <base:namespace>
        {fme:get-attribute("namespace")}
      </base:namespace>
      <base:versionId>
        {fme:get-attribute("beginLifespanVersion")}
      </base:versionId>
    </base:Identifier>
  </cp:inspireId>

Result:
  <base:Identifier>
    <base:localId>595944</base:localId>
    <base:namespace>dk.kms.tn.roadnode</base:namespace>
    <base:versionId>2012-03-13T18:09:26</base:versionId>
  </base:Identifier>

Example - geometry
FME feature geometry:
  Coordinate System: `EPSG:25832'
  Geometry Type: IFMEPoint
  Number of Geometry Traits: 1
  GeometryTrait(string): `gml_id' has value `dk.kms.tn.roadnode.594897.20120803145439'
  Coordinate Dimension: 3
  (725261.96,6187842.58,2.5)

Conversion to a GML attribute:
  @GMLGeometry(TO_ATTRIBUTE, GML_3.2.1, gml_geom)

Template:
  <net:geometry>
    {fme:get-xml-attribute("gml_geom")}
  </net:geometry>

Result:
  <net:geometry>
    <gml:Point gml:id="dk.kms.tn.roadnode.594897.20120803145439" srsName="EPSG:25832" srsDimension="3">
      <gml:pos>725261.96 6187842.58 2.5</gml:pos>
    </gml:Point>
  </net:geometry>

gml:id
• gml:id is mandatory
• Unique within the document
• Must start with a letter
• In FME the default is a UUID
• Built the same way as inspireId
  – <namespace>.<id>.<timestamp>

gml:id
NAVNE (ID, FraDato, Navn) X MONTAGE (ID, FraDato, Geometri)

  <gn:geometry>
    <gml:MultiCurve gml:id="dk.kms.gn.114294-0" srsName="EPSG:25832" srsDimension="2">
      <gml:curveMember>
        <gml:LineString gml:id="dk.kms.gn.geom.24795424.20080613T085805">
          <gml:posList>723071.8 6194337.45 ....
        </gml:LineString>
      </gml:curveMember>
      <gml:curveMember>
        <gml:LineString gml:id="dk.kms.gn.geom.24793173.20080613T085805">
          <gml:posList>723273.34 6194566.81 .....
        </gml:LineString>
      </gml:curveMember>
      .......
  </gn:geometry>

FME script design

Design
When designing an FME script we should reflect on
• Schema mapping: FME or database
• Generic or specific
• Design of dataflow: design patterns
• Pre- and post-processing
• Existing system architecture

Dataflow design
(Three slides of diagrams, not reproduced in this transcript.)

Conclusion design
• Testing shows that design #3 is best
  – Design #1 will not work with big datasets because of the list
  – Design #3 uses 10 to 15 percent less memory than #2
  – The timing is identical to #2
  – We can validate individual features

Conclusion
• After the first tests it has been easy to work with templates
• Schema mapping in Oracle with SQL
• A generic solution based on design #3
• Templates as individual files read into an attribute
• The only problem is that FME can't yet handle big datasets (more than 60,000 features)

Questions?
Peter Laulund
Rentemestervej 8
DK-2400 Copenhagen NV
Denmark
Phone: +45 72 54 51 73
E-mail: pelau@kms.dk
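
Note on the gml:id slide: the rules listed there (unique within the document, starts with a letter, built as <namespace>.<id>.<timestamp>) can be sketched outside FME in a few lines of Python. This is only an illustration under stated assumptions, not the KMS workflow: the function names make_gml_id and find_duplicates are hypothetical, and the compact timestamp layout (YYYYMMDDHHMMSS) is inferred from the sample id dk.kms.tn.roadnode.594897.20120803145439 shown in the geometry example.

```python
from collections import Counter
from datetime import datetime


def make_gml_id(namespace: str, local_id: int, version: datetime) -> str:
    """Build an id of the form <namespace>.<id>.<timestamp>.

    Hypothetical helper; the timestamp layout is an assumption taken
    from the sample id in the slides.
    """
    gml_id = f"{namespace}.{local_id}.{version:%Y%m%d%H%M%S}"
    # The slides require a gml:id to start with a letter.
    if not gml_id[0].isalpha():
        raise ValueError(f"gml:id must start with a letter: {gml_id!r}")
    return gml_id


def find_duplicates(gml_ids: list[str]) -> list[str]:
    """Return the ids that occur more than once in one document."""
    return [gml_id for gml_id, count in Counter(gml_ids).items() if count > 1]


if __name__ == "__main__":
    node_id = make_gml_id(
        "dk.kms.tn.roadnode", 594897, datetime(2012, 8, 3, 14, 54, 39)
    )
    print(node_id)
    print(find_duplicates([node_id, node_id]))
```

Deriving the id from namespace, database id and timestamp, rather than using a random UUID, keeps it stable between exports as long as the feature version does not change.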