Introduction to Scala: XML Processing and Beyond

Scala is a programming language used by companies such as LinkedIn, Apple, and Twitter. This presentation, given at XML London 2014 by William Narmontas and Dino Fancellu, introduces the language and shows how it processes XML: inline XML literals, lookups, collection operations, for-comprehensions compared with XQuery, and interoperation with XQuery, XPath, XSLT, and StAX through the Java APIs. It also covers who uses Scala, the benefits the authors see for projects (less code, easier testing, faster time to market), and where to start.

  • Scala
  • XML processing
  • Functional programming
  • Modular
  • Type-safe

Uploaded on Apr 12, 2025 | 0 Views


Presentation Transcript


  1. XML Processing in Scala. William Narmontas, Dino Fancellu. www.scala.contractors. XML London 2014

  2. Dino Fancellu: 35 years in IT; Scala, Java, XML. William Narmontas: 10 years in IT; Scala, XML, Web.

  3. What is Scala?

  4. Scala processes XML fast

  5. It is powerful

  6. Modular, Concise, Functional, Type-safe, Performant, Object-oriented, Strongly-typed, Statically-typed, Unopinionated, Composable, Java-interoperable, First-class XML

  7. Who uses Scala? LinkedIn, The Guardian, Apple, eBay, Morgan Stanley, TomTom, Bank of America, eHarmony, Netflix, Trafigura, Barclays, EDF, Novell, Tumblr, BBC, FourSquare, Rackspace, Twitter, BSkyB, Gawker, Sky, UBS, Cisco, HSBC, Sony, VMware, Citigroup, ITV, Springer, Xerox, Credit Suisse, Klout

  8. Projects in Scala - Less code to write = less to maintain - Communication clearer - Testing easier - Software robust - Time to market: fast - Happier developers

  9. Scala language: Intro

  10. Values
      Scala:
      val conferenceName = "XML London 2014"
      XQuery:
      let $conferenceName := "XML London 2014"
      Scala (mutable):
      var conferenceName = "XML London 2014"
      conferenceName = "XML London 2015"

  11. Strings
      val language = "Scala"
      s"XML Processing in $language"
      | XML Processing in Scala
      s"""An introduction to:
         |The "$language" programming language""".stripMargin
      | An introduction to:
      | The "Scala" programming language
      s"$language has ${language.length} chars in its name"
      | Scala has 5 chars in its name

  12. Functions
      Scala:
      def fun(x: Int, y: Double) = s"$x: $y"
      XQuery:
      declare function local:fun(
        $x as xs:integer,
        $y as xs:double
      ) as xs:string {
        concat($x, ": ", $y)
      };

  13. Everything is an expression
      val trainSpeed =
        if (train.speed.mph >= 60) "Fast"
        else "Slow"
      def divide(numerator: Int, denominator: Int) =
        try { s"${numerator/denominator}" }
        catch {
          case _: java.lang.ArithmeticException =>
            s"Cannot divide $numerator by $denominator"
        }

  14. Types: Explicit
      def withTitle(name: String, title: String): String =
        s"$title. $name"
      val x: Int = {
        val y = 1000
        100 + y
      }
      | x: Int = 1100

  15. Functions: named parameters
      Further clarity in method calls:
      def makeLink(url: String, text: String) =
        s"""<a href="$url">$text</a>"""
      makeLink(text = "XML London 2014", url = "http://www.xmllondon.com")
      | <a href="http://www.xmllondon.com">XML London 2014</a>

  16. Functions: default parameters
      Reduce repetition in method calls:
      def withTitle(name: String, title: String = "Mr") =
        s"$title. $name"
      withTitle("John Smith")
      | Mr. John Smith
      withTitle("Mary Smith", "Miss")
      | Miss. Mary Smith

  17. Functional
      def incrementedByOne(x: Int) = x + 1
      (1 to 5).map(incrementedByOne)
      | Vector(2, 3, 4, 5, 6)

  18. Lambdas
      (1 to 5).map(x => x + 1)
      | Vector(2, 3, 4, 5, 6)
      (1 to 5).map(_ + 1)
      | Vector(2, 3, 4, 5, 6)

  19. For comprehensions
      for { x <- (1 to 5) } yield x + 1
      | Vector(2, 3, 4, 5, 6)

  20. Implicit classes: Enrich types
      implicit class stringWrapper(str: String) {
        def wrapWithParens = s"($str)"
      }
      "Text".wrapWithParens
      | (Text)

  21. Powerful features for scalability - Case classes - Traits - Partial functions - Pattern matching - Implicits - Flexible Syntax - Generics - User defined operators - Call-by-name - Macros
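A few of the features listed on this slide can be sketched in a handful of lines. This is only an illustration, not code from the talk: the names `Shape`, `Circle`, `Rect`, `area`, `describeSquare`, and `when` are hypothetical and chosen for the example.

```scala
// Hypothetical names throughout; none of these come from the presentation.
object FeaturesSketch {
  // Trait + case classes: a small closed hierarchy of immutable values
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rect(width: Double, height: Double) extends Shape

  // Pattern matching destructures the case classes
  def area(shape: Shape): Double = shape match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }

  // Partial function: defined only for rectangles with equal sides
  val describeSquare: PartialFunction[Shape, String] = {
    case Rect(w, h) if w == h => s"square of side $w"
  }

  // Call-by-name: `body` is evaluated only when `condition` is true
  def when[A](condition: Boolean)(body: => A): Option[A] =
    if (condition) Some(body) else None
}
```

Because `Shape` is sealed, the compiler can warn when a `match` misses a case, which is part of what makes these features useful as a code base grows.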

  22. Scala & XML

  23. Values: Inline XML
      val url = "http://www.xmllondon.com"
      val title = "XML London 2014"
      val xmlTree = <div>
        <p>Welcome to <a href={url}>{title}</a>!</p>
      </div>
      | xmlTree: scala.xml.Elem =
      | <div>
      |   <p>Welcome to <a href="http://www.xmllondon.com">XML London 2014</a>!</p>
      | </div>

  24. XML Lookups
      val listOfPeople = <people>
        <person>Fred</person>
        <person>Ron</person>
        <person>Nigel</person>
      </people>
      listOfPeople \ "person"
      | NodeSeq(<person>Fred</person>, <person>Ron</person>, <person>Nigel</person>)
      listOfPeople \ "_"
      | NodeSeq(<person>Fred</person>, <person>Ron</person>, <person>Nigel</person>)

  25. XML Lookups
      val fact = <fact type="universal">
        <variable>A</variable> = <variable>A</variable>
      </fact>
      fact \\ "variable"
      | NodeSeq(<variable>A</variable>, <variable>A</variable>)
      fact \ "@type"
      | : scala.xml.NodeSeq = universal
      fact \@ "type"
      | : String = universal

  26. XML Loading
      val pun = """<pun rating="extreme">
        | <question>Why do CompSci students need glasses?</question>
        | <answer>To C#<!-- C# is Microsoft's programming language -->.</answer>
        |</pun>""".stripMargin
      scala.xml.XML.loadString(pun)
      | <pun rating="extreme">
      |  <question>Why do CompSci students need glasses?</question>
      |  <answer>To C#.</answer>
      | </pun>

  27. Collections: expressive
      val root = <numbers>
        {for {i <- 1 to 10} yield <number>{i}</number>}
      </numbers>
      val numbers = root \ "number"
      numbers(0)
      | <number>1</number>
      numbers.head
      | <number>1</number>
      numbers.last
      | <number>10</number>
      numbers take 3
      | NodeSeq(<number>1</number>, <number>2</number>, <number>3</number>)

  28. Collections: expressive
      numbers filter (_.text.toInt > 6)
      | NodeSeq(<number>7</number>, <number>8</number>, <number>9</number>, <number>10</number>)
      numbers(_.text.toInt > 6)
      | NodeSeq(<number>7</number>, <number>8</number>, <number>9</number>, <number>10</number>)
      numbers maxBy (_.text)
      | <number>9</number>
      numbers maxBy (_.text.toInt)
      | <number>10</number>
      numbers.reverse
      | NodeSeq(<number>10</number>, <number>9</number>, <number>8</number>, <number>7</number>, <number>6</number>,
      |   <number>5</number>, <number>4</number>, <number>3</number>, <number>2</number>, <number>1</number>)
      numbers.groupBy(_.text.toInt % 3)
      | Map(
      |   2 -> NodeSeq(<number>2</number>, <number>5</number>, <number>8</number>),
      |   1 -> NodeSeq(<number>1</number>, <number>4</number>, <number>7</number>, <number>10</number>),
      |   0 -> NodeSeq(<number>3</number>, <number>6</number>, <number>9</number>))

  29. XML Methods: a rich API ++ :\ andThen buildString companion copyToBuffer distinct endsWith flatten genericBuilder headOption inits isTraversableAgain lastIndexWhere max nameToString par product reduceRightOption sameElements seq sorted stringPrefix takeWhile toIndexedSeq toSet union xmlType zipWithIndex ++: \ apply canEqual compose corresponds doCollectNamespaces exists fold getNamespace indexOf intersect iterator lastOption maxBy namespace partition reduce repr scan size span sum text toIterable toStream unzip xml_!= +: \@ applyOrElse child contains count doTransform filter foldLeft groupBy indexOfSlice isAtom label length min nonEmpty patch reduceLeft reverse scanLeft slice splitAt tail theSeq toIterator toString unzip3 xml_== /: \\ asInstanceOf collect containsSlice descendant drop filterNot foldRight grouped indexWhere isDefinedAt last lengthCompare minBy nonEmptyChildren permutations reduceLeftOption reverseIterator scanRight sliding startsWith tails to toList toTraversable updated xml_sameElements /:\ addString attribute collectFirst copy descendant_or_self dropRight find forall hasDefiniteSize indices isEmpty lastIndexOf lift minimizeEmpty orElse prefix reduceOption reverseMap scope sortBy strict_!= take toArray toMap toVector view zip % :+ aggregate attributes combinations copyToArray diff dropWhile flatMap foreach head init isInstanceOf lastIndexOfSlice map mkString padTo prefixLength reduceRight runWith segmentLength sortWith strict_== takeRight toBuffer toSeq transpose withFilter zipAll

  30. For-comprehensions: similar to XQuery
      XQuery:
      <bib>{
        for $b in $xml/book
        let $year := $b/@year
        where $b/publisher = "Addison-Wesley" and $year > 1991
        return <book year="{ $year }">
          { $b/title }
        </book>
      }</bib>
      Scala:
      <bib>{
        for {
          b <- xml \ "book"
          year = b \@ "year"
          if b \ "publisher" === "Addison-Wesley" && year > 1991
        } yield <book year={ year }>
          { b \ "title" }
        </book>
      }</bib>

  35. For-comprehensions: similar to XQuery. Nice! ... yet Scala is general purpose
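The correspondence between XQuery's for / let / where / return and Scala's generator / value definition / guard / yield does not depend on XML. The sketch below shows the same shape over an ordinary list, so it runs without any XML library. The `Book` case class and the sample data are hypothetical stand-ins for the `<book>` elements on the slide, not data from the talk.

```scala
object FlworSketch {
  // Hypothetical stand-in for the <book> elements queried on the slide
  case class Book(title: String, publisher: String, year: Int)

  // Made-up sample data, for illustration only
  val books = List(
    Book("Book A", "Addison-Wesley", 1994),
    Book("Book B", "Addison-Wesley", 1990),
    Book("Book C", "Other Press", 2000)
  )

  val recentTitles: List[String] =
    for {
      b <- books                                         // for $b in ...
      year = b.year                                      // let $year := ...
      if b.publisher == "Addison-Wesley" && year > 1991  // where ...
    } yield s"${b.title} ($year)"                        // return ...
}
```

Here `FlworSketch.recentTitles` evaluates to `List("Book A (1994)")`. The same comprehension syntax works over any type that provides `map`, `flatMap`, and `withFilter`, which is why it applies to XML node sequences as well as to lists.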

  36. Hybrid XML - XQuery for Scala - javax.xml.* for free - Look up: XPath - Transform: XSLT - Stream: StAX

  37. XQuery for Scala (XQS) - Wraps XQuery API for Java (javax.xml.xquery) - Scala access to XQuery in: MarkLogic, BaseX, Saxon, Sedna, eXist - Converts DOM to Scala XML & vice versa - http://github.com/fancellu/xqs

  38. XQuery via XQS
      val widgets = <widgets>
        <widget>Menu</widget>
        <widget>Status bar</widget>
        <widget id="panel-1">Panel</widget>
        <widget id="panel-2">Panel</widget>
      </widgets>
      import com.felstar.xqs.XQS._
      val conn = new net.xqj.basex.local.BaseXXQDataSource().getConnection
      val nodes: NodeSeq = conn("for $w in /widgets/widget order by $w return $w", widgets)
      | NodeSeq(<widget>Menu</widget>, <widget id="panel-1">Panel</widget>,
      |   <widget id="panel-2">Panel</widget>, <widget>Status bar</widget>)

  39. XPath
      import com.felstar.xqs.XQS._
      import javax.xml.xpath.{XPathConstants, XPathFactory}
      import org.w3c.dom.NodeList
      val widgets = <widgets>
        <widget>Menu</widget>
        <widget>Status bar</widget>
        <widget id="panel-1">Panel</widget>
        <widget id="panel-2">Panel</widget>
      </widgets>
      val xpath = XPathFactory.newInstance().newXPath()
      val nodes = xpath.evaluate("/widgets/widget[not(@id)]", toDom(widgets), XPathConstants.NODESET).asInstanceOf[NodeList]
      (nodes: NodeSeq)
      | NodeSeq(<widget>Menu</widget>, <widget>Status bar</widget>)
      Natively in Scala:
      (widgets \ "widget")(widget => (widget \ "@id").isEmpty)
      | NodeSeq(<widget>Menu</widget>, <widget>Status bar</widget>)
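The XPath lookup on this slide goes through XQS for the DOM conversion. As a rough sketch of the same query using only the JDK (no XQS, no scala-xml), the document can be parsed from a string with `DocumentBuilderFactory`. The object name `XPathSketch` and the `names` value are hypothetical; the XML is the widgets document from the slide.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.{XPathConstants, XPathFactory}
import org.w3c.dom.NodeList
import org.xml.sax.InputSource

object XPathSketch {
  val widgetsXml =
    """<widgets>
      |  <widget>Menu</widget>
      |  <widget>Status bar</widget>
      |  <widget id="panel-1">Panel</widget>
      |  <widget id="panel-2">Panel</widget>
      |</widgets>""".stripMargin

  // Parse the string into a W3C DOM document
  val document = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder()
    .parse(new InputSource(new StringReader(widgetsXml)))

  // Select the widgets that have no id attribute
  val xpath = XPathFactory.newInstance().newXPath()
  val nodes = xpath
    .evaluate("/widgets/widget[not(@id)]", document, XPathConstants.NODESET)
    .asInstanceOf[NodeList]

  // Copy the Java NodeList into a Scala list of text values
  val names: List[String] =
    (0 until nodes.getLength).map(i => nodes.item(i).getTextContent).toList
}
```

`XPathSketch.names` evaluates to `List("Menu", "Status bar")`, matching the two nodes shown on the slide.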

  40. XSLT
      import com.felstar.xqs.XQS._
      import javax.xml.transform.TransformerFactory
      import javax.xml.transform.stream.StreamResult
      val peopleXml = <people>
        <john>Hello, John.</john>
        <smith>Smith is here.</smith>
        <another>Hello.</another>
      </people>
      val stylesheet = <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
        <xsl:template match="john">
          <xsl:copy>Hello, John.</xsl:copy>
        </xsl:template>
        <xsl:template match="node()|@*">
          <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
          </xsl:copy>
        </xsl:template>
      </xsl:stylesheet>
      val xmlResultResource = new java.io.StringWriter()
      val xmlTransformer = TransformerFactory.newInstance().newTransformer(stylesheet)
      xmlTransformer.transform(peopleXml, new StreamResult(xmlResultResource))
      xmlResultResource.getBuffer
      | <?xml version="1.0" encoding="UTF-8"?><people>
      |   <john>Hello, John.</john>
      |   <smith>Smith is here.</smith>
      |   <another>Hello.</another>
      | </people>
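The slide passes Scala XML values straight to the transformer, which appears to rely on conversions imported from XQS. A minimal JDK-only sketch of the same idea, assuming the stylesheet and input are held as strings, is shown below. The names `XsltSketch`, `transform`, and `result` are hypothetical, the input is shortened, and the stylesheet is declared as XSLT 1.0 because the JDK's bundled processor implements XSLT 1.0.

```scala
import java.io.{StringReader, StringWriter}
import javax.xml.transform.{OutputKeys, TransformerFactory}
import javax.xml.transform.stream.{StreamResult, StreamSource}

object XsltSketch {
  // Identity transform plus one override for <john>, as on the slide
  val stylesheet =
    """<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      |  <xsl:template match="john">
      |    <xsl:copy>Hello, John.</xsl:copy>
      |  </xsl:template>
      |  <xsl:template match="node()|@*">
      |    <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
      |  </xsl:template>
      |</xsl:stylesheet>""".stripMargin

  // Shortened input; <john/> starts empty so the template's effect is visible
  val peopleXml = "<people><john/><smith>Smith is here.</smith></people>"

  def transform(xml: String): String = {
    val transformer = TransformerFactory.newInstance()
      .newTransformer(new StreamSource(new StringReader(stylesheet)))
    // Leave out the <?xml ...?> declaration to keep the result short
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes")
    val out = new StringWriter()
    transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out))
    out.toString
  }

  val result = transform(peopleXml)
}
```

In `XsltSketch.result` the `<john>` element carries the text "Hello, John." while `<smith>` is copied through unchanged by the identity template.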

  41. XML Stream Processing
      import scala.io.Source
      import javax.xml.stream.{XMLEventReader, XMLInputFactory}
      import javax.xml.stream.events.XMLEvent
      // 4GB file, comes back in a second
      val src = Source.fromURL("http://dumps.wikimedia.org/enwiki/20140402/enwiki-20140402-abstract.xml")
      val er = XMLInputFactory.newInstance().createXMLEventReader(src.reader)
      // Lets the Java event reader be used as a Scala Iterator
      implicit class XMLEventIterator(ev: XMLEventReader) extends scala.collection.Iterator[XMLEvent] {
        def hasNext = ev.hasNext
        def next() = ev.nextEvent()
      }
      er.dropWhile(!_.isStartElement).take(10).zipWithIndex.foreach {
        case (ev, idx) => println(s"${idx+1}:\t$ev")
      }
      src.close()
      | 1: <feed>
      | 2:
      |
      | 3: <doc>
      | 4:
      |
      | 5: <title>
      | 6: Wikipedia: Anarchism
      | 7: </title>
      | 8:
      |
      | 9: <url>
      | 10: http://en.wikipedia.org/wiki/Anarchism
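The streaming approach can be tried without downloading a multi-gigabyte dump. The sketch below applies the same iterator adapter to an in-memory string; `StaxSketch` and `firstElementNames` are hypothetical names. Because a Scala `Iterator` is lazy, only as many events are pulled from the reader as `take(n)` requires, which is consistent with the slide's point that a very large file "comes back in a second".

```scala
import java.io.StringReader
import javax.xml.stream.{XMLEventReader, XMLInputFactory}
import javax.xml.stream.events.XMLEvent

object StaxSketch {
  // Adapter: expose the Java StAX event reader as a Scala Iterator
  class XMLEventIterator(reader: XMLEventReader) extends Iterator[XMLEvent] {
    def hasNext: Boolean = reader.hasNext
    def next(): XMLEvent = reader.nextEvent()
  }

  // Names of the first `n` start elements, reading no further than needed
  def firstElementNames(xml: String, n: Int): List[String] = {
    val reader = XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml))
    try {
      new XMLEventIterator(reader)
        .filter(_.isStartElement)
        .take(n)
        .map(_.asStartElement.getName.getLocalPart)
        .toList // forces evaluation before the reader is closed
    } finally {
      reader.close()
    }
  }
}
```

For example, `StaxSketch.firstElementNames("<feed><doc><title>t</title></doc></feed>", 2)` returns `List("feed", "doc")`.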

  42. Use Cases - Data extraction - Serving XML via REST - Dynamically generated XSLT - Interfacing with XML databases - Flexibility to choose the best tool for the job

  43. Excellent Ecosystem: SBT, Akka, Spark, Spray, Specs, scalaz, scala-xml, shapeless, Scaladin, ScalaTest, macro-paradise, scala-maven-plugin, JVM

  44. Conclusion - Practical - Practical for XML processing

  45. Where do I start? - atomicscala.com - typesafe.com/activator - scala-lang.org - scala-ide.org - IntelliJ

  46. Matt Stephens, Charles Foster

  47. Open to consulting www.scala.contractors Follow us on Twitter: @DinoFancellu @ScalaWilliam @MaffStephens
