
Unlocking the Power of VTL Engine for Data Processing
Explore the capabilities of the VTL Engine as presented at the SDMX Experts Workshop in Amsterdam, focusing on syntax validation, semantic analysis, and data structure compatibility, alongside related components such as vtlPlayground and PySDMX. Learn how the engine fits into automated pipelines, a user-friendly UI, and REST-compatible services.
Presentation Transcript
Making your data worthwhile.
VTL Engine
Amsterdam, 9th October 2024
SDMX Experts Workshop

01 The context: The VTL Suite
VTL as a product
  Business users: friendly UI, syntax highlight, VTL validation, ...
  Automated pipelines: compatibility (e.g. REST), performance, stability, business continuity
Our value added components: vtlPlayground, vtlManager
Our contribution to the community: PySDMX, vtlEngine

02 VTL Engine main features
  Full implementation of VTL 2.0
  Exhaustive semantic analysis
  2800+ unit tests (~90% code coverage)
  Single thread execution
  Used in production
  Based on Pandas and DuckDB

03 What does the engine do?
  C := round(D, 0);
  B := C [keep var1];
  A := B [filter var1 > 10];
1. Syntax validation: No errors!
2. Transformation graph generation: D -> C -> B -> A

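The transformation graph for the three-line script can be illustrated with a small standard-library sketch. This is not the engine's parser: it assumes each statement is a simple `name := expression;` assignment, and the `NOT_DATASETS` set and the `build_graph` helper are hypothetical names introduced only for this example.

```python
import re
from graphlib import TopologicalSorter

SCRIPT = """
C := round(D, 0);
B := C [keep var1];
A := B [filter var1 > 10];
"""

# Assumption: names that are not datasets in this toy script (operators,
# clause keywords and the component var1). A real parser would resolve
# these from the grammar and the data structures instead.
NOT_DATASETS = {"round", "keep", "filter", "var1"}


def build_graph(script):
    """Map each assigned dataset to the set of datasets it reads."""
    graph = {}
    for statement in script.split(";"):
        if ":=" not in statement:
            continue
        target, expression = statement.split(":=", 1)
        names = set(re.findall(r"[A-Za-z_]\w*", expression))
        graph[target.strip()] = names - NOT_DATASETS
    return graph


graph = build_graph(SCRIPT)

# Execution order: every dataset appears after the datasets it depends on.
order = list(TopologicalSorter(graph).static_order())

# Datasets that are read but never assigned must be supplied as inputs.
inputs = {dep for deps in graph.values() for dep in deps} - set(graph)

print(order)
print(inputs)
```

The resulting order matches the chain on the slide, with D as the only external input. `TopologicalSorter` also raises `CycleError` on circular definitions, which is one kind of problem a graph validation step would need to catch.
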
04
  C := round(D, 0);
  B := C [keep var1];
  A := B [filter var1 > 10];
3. Semantic validation
  Graph validation: OK
  Data types validation: OK
  Data structures compatibility: OK
  Calculation of derived Data structures
DataStructure dataset D
  Code   role         dataType
  Id1    Identifier   Integer
  var1   Measure      Number
  var2   Measure      Number

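How derived data structures can be calculated without any datapoints is sketched below for the three operators in the script. The rules are simplified assumptions made for illustration, and the helper names `derive_round`, `derive_keep` and `derive_filter` are hypothetical, not part of the engine.

```python
# Structure of the input dataset D, as shown on the slide.
STRUCTURE_D = {
    "Id1": {"role": "Identifier", "dataType": "Integer"},
    "var1": {"role": "Measure", "dataType": "Number"},
    "var2": {"role": "Measure", "dataType": "Number"},
}

NUMERIC_TYPES = {"Integer", "Number"}


def derive_round(structure):
    """Assumed rule: round needs numeric measures and keeps the structure."""
    for name, comp in structure.items():
        if comp["role"] == "Measure" and comp["dataType"] not in NUMERIC_TYPES:
            raise TypeError(f"round: measure {name} is not numeric")
    return dict(structure)


def derive_keep(structure, kept):
    """Assumed rule: keep retains all identifiers plus the listed components."""
    missing = [name for name in kept if name not in structure]
    if missing:
        raise KeyError(f"keep: unknown components {missing}")
    return {
        name: comp
        for name, comp in structure.items()
        if comp["role"] == "Identifier" or name in kept
    }


def derive_filter(structure, used):
    """Assumed rule: filter checks its operands exist and keeps the structure."""
    missing = [name for name in used if name not in structure]
    if missing:
        raise KeyError(f"filter: unknown components {missing}")
    return dict(structure)


structure_c = derive_round(STRUCTURE_D)            # C := round(D, 0)
structure_b = derive_keep(structure_c, ["var1"])   # B := C [keep var1]
structure_a = derive_filter(structure_b, ["var1"]) # A := B [filter var1 > 10]

print(list(structure_b))
```

Because only structures are passed around, a script that refers to a component dropped by an earlier keep fails here, before any data is loaded.
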
05
  C := round(D, 0);
  B := C [keep var1];
  A := B [filter var1 > 10];
4. Interpreter

DataSet D
  id1   var1   var2
  1     15,2   100,15
  2     9,1    150,25
  3     17,3   200,12
  4     8,2    250,19
  5     1,5    300,43

C := round(D, 0)
DataStructure dataset C
  Code   role         dataType
  Id1    Identifier   Integer
  var1   Measure      Number
  var2   Measure      Number
DataSet C
  id1   var1   var2
  1     15     100
  2     9      150
  3     17     200
  4     8      250
  5     1      300

B := C[keep var1]
DataStructure dataset B
  Code   role         dataType
  Id1    Identifier   Integer
  var1   Measure      Number
DataSet B
  id1   var1
  1     15
  2     9
  3     17
  4     8
  5     1

A := B[filter var1>10]
DataStructure dataset A
  Code   role         dataType
  id1    Identifier   Integer
  var1   Measure      Number
DataSet A
  id1   var1
  1     15
  3     17

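The interpreter step can be mimicked on the slide's data with plain Python lists of rows. This is only a sketch: the engine itself is described as based on Pandas and DuckDB, the helper names `vtl_round`, `keep` and `filter_rows` are hypothetical, and Python's built-in `round` is assumed as a stand-in whose handling of tie values such as 1.5 may differ from the engine's.

```python
# Dataset D from the slide; the slide's decimal commas are entered as floats.
DATASET_D = [
    {"id1": 1, "var1": 15.2, "var2": 100.15},
    {"id1": 2, "var1": 9.1, "var2": 150.25},
    {"id1": 3, "var1": 17.3, "var2": 200.12},
    {"id1": 4, "var1": 8.2, "var2": 250.19},
    {"id1": 5, "var1": 1.5, "var2": 300.43},
]

IDENTIFIERS = ("id1",)
MEASURES = ("var1", "var2")


def vtl_round(rows, decimals):
    """Round every measure; identifiers are left untouched."""
    return [
        {k: round(v, decimals) if k in MEASURES else v for k, v in row.items()}
        for row in rows
    ]


def keep(rows, components):
    """Keep the identifiers plus the listed components."""
    wanted = IDENTIFIERS + tuple(components)
    return [{k: row[k] for k in wanted} for row in rows]


def filter_rows(rows, predicate):
    """Keep only the rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


C = vtl_round(DATASET_D, 0)                       # C := round(D, 0)
B = keep(C, ["var1"])                             # B := C [keep var1]
A = filter_rows(B, lambda row: row["var1"] > 10)  # A := B [filter var1 > 10]

for row in A:
    print(row)
```

Only the rows with id1 equal to 1 and 3 pass the filter, as in dataset A on the slide.
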
06 Simple API to validate VTL scripts
  The semantic analysis takes all the elements needed to validate the VTL script.
  The result is the set of computed datasets, without datapoints.

07 Simple API to run VTL scripts
  Like semantic_analysis, the run function takes the VTL script, but executes it using datapoints.
  S3 URIs are supported.
  Memory efficient when the user specifies paths for loading and saving datapoints.
  Supports any Pandas-compatible format and SDMX data messages.
  semantic_analysis and run can be executed independently.

08 Performance (Semantic Analysis)
  The Semantic Analysis can be included in any Web Application.
  Even on the largest script we have (183 transformations), it takes less than 2 seconds.

09 Performance (Run)
  Designed to be fast on small and medium sized files
    Majority of use cases
    Better maintainability and resource allocation forecast
    Support for large executions (if they fit in memory)
  Single thread execution
    Removing overhead on internal process synchronization
    Easier to predict behavior on large executions
    Cost-effective data pipelines

10 Next steps
  Support for VTL version 2.1 soon!
  Parallel execution / Spark (R&D 2025)
  Support for SDMX-ML VTL artifacts

Making your data worthwhile.
meaningfuldata.eu
Antonio Olleros, Founder & CEO
+34 645 89 16 57
antonio.olleros@meaningfuldata.eu