
Introduction to Data Engineering Course Overview
An overview of the Data Management course, which covers data processing operations such as retrieval, cleaning, transformation, validation, and catalogization. Topics include data formats, models, schemas, encryption, compression, and indexing. The slides also cover the Bachelor specialization in Databases and Web, along with the seminars and lectures on related topics.
Presentation Transcript
Organization
NDBI046 - Introduction to Data Engineering, 2024/2025
Petr Škoda
https://github.com/skodapetr
https://www.ksi.mff.cuni.cz
NDBI046 Annotation
The goal of the Data Management course is to give an overview of the operations and techniques commonly used in a typical data processing workflow. This includes data retrieval, cleaning, transformation, validation, catalogization, versioning, documentation, publication via API, integration, search, compression, encryption, and working with large and distributed data.
Bachelor specialization: Databases and Web
2. Data Management
Data formats. Data models for structured data, use-cases. Graph, hierarchical, tabular, and geodata data formats. Data schemas and data transformation languages. Basics of graphics, multimedia and print formats. Data vocabulary, data semantics. Data transformation, catalogization and metadata. Basics of data encryption and compression. Basics of indexing. File organization techniques, direct/indirect indexing, primary/secondary index. Hashing in external memory. Hierarchical indexing, indexing for spatial databases, spatial join, spatial query.
Bachelor State Final Exam Topics: Data transformation; Data catalogization and metadata; Data semantics, data vocabularies; Basics of data encryption and compression.
Seminars
Pavel Koupil / Petr Škoda
Prerequisites: Data Formats (NPRG036), SQL, Linux, Python.
Hybrid format: video tutorials, consultations / demonstrations.
Invited lecture! See website for more information!
Lectures
Lectures end one week sooner! See https://www.ksi.mff.cuni.cz/teaching/ndbi046-web/ for more details.
Business perspective
Data Warehouse, Data Lake, Data Management
Data Catalogs
Martin Nečaský
Vocabularies and Ontologies
Cryptography and Certificates
Information Theory
Text Search
Final test
Written exam. You can gain an advantage by completing the seminar assignments.