Data Ingestion Overview
MIRROR supports the ingestion of the diverse data types needed to build geospatial digital twins. Whether your data is spatial, tabular, or semantic, the platform provides CLI tools to ingest it into the system's database.
Supported Data Types
We currently support ingestion of:
| Data Type | Refined Description |
|---|---|
| Graph Data (AGE) | Node-edge models representing relationships, ingested into Apache AGE (PostgreSQL-based graph DB) |
| IFC CSVs | Building and infrastructure models exported from BIM tools into structured CSVs |
| GIS Vector Formats | Geospatial vector data formats: Shapefile (.shp), GeoJSON, GPKG, etc. |
| Raster Data | Georeferenced imagery formats such as TIFF/GeoTIFF, used for elevation, satellite, heatmaps |
| Tabular CSVs | Generic tabular data (e.g., projects, tasks, budgets) with or without date fields |
| Glossary (CSV) | Searchable glossary of domain-specific terms and acronyms |
Folder Structure Convention
All data must be placed under the root-level folder: ./ingestion_data/.
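The folder convention can be prepared ahead of time with a short script. The following is a minimal sketch, not part of the platform's CLI: `scaffold_site` is a hypothetical helper, and the subfolder names are copied from the example folder structures in this guide.

```python
from pathlib import Path

# Subfolders taken from the example folder structures in this guide.
SUBFOLDERS = [
    "rasters",
    "spatial",
    "graphs",
    "tables/IFC",
    "tables/Glossary",
    "tables/Non_Spatial_CSV",
]


def scaffold_site(root: str, site_name: str) -> list[Path]:
    """Create <root>/<site_name>/ingestion/<subfolder> for each known subfolder.

    Hypothetical helper for illustration; existing folders are left untouched.
    """
    base = Path(root) / site_name / "ingestion"
    created = []
    for sub in SUBFOLDERS:
        target = base / sub
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

For example, `scaffold_site("./ingestion_data", "citywest_data")` would create the empty layout for a site named citywest_data.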
Ingestion Commands
Each section below provides copy-paste-ready CLI commands for ingesting data into the platform.
Ingesting Raster Data (GeoTIFF, TIFF)
Raster datasets are pixel-based images in which each pixel holds a value such as elevation, temperature, or land classification. Twinspace Mirror supports raster ingestion for overlaying high-resolution datasets on the map (e.g., heatmaps, satellite imagery, terrain).
Folder & ZIP Structure
Zip all raster files into one archive before ingesting:
Example Folder Structure
./ingestion_data/
└── {SITE-NAME}/
    └── ingestion/
        └── rasters/
            └── rasters.zip/
                ├── elevation.tif
                ├── flood_zones.tif
                └── satellite_view.tif
Replace {SITE-NAME} with your own site name (e.g., citywest_data, navy_data) as needed.
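The zipping step can be scripted. This is a minimal sketch using Python's standard `zipfile` module; `zip_rasters` is a hypothetical helper name, and the default archive name `rasters.zip` simply mirrors the example layout above.

```python
import zipfile
from pathlib import Path


def zip_rasters(raster_dir: str, archive_name: str = "rasters.zip") -> Path:
    """Bundle every .tif/.tiff file in raster_dir into one ZIP archive."""
    folder = Path(raster_dir)
    rasters = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}
    )
    if not rasters:
        raise FileNotFoundError(f"no .tif/.tiff files found in {folder}")

    archive = folder / archive_name
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for raster in rasters:
            # Store each file at the archive root, as in the example layout.
            zf.write(raster, arcname=raster.name)
    return archive
```

Files are written at the root of the archive rather than under a nested folder; adjust `arcname` if your ingestion setup expects a different internal layout.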
Ingesting GIS Vector Format Data (Shapefile, GeoJSON, GPKG, ZIP)
Mirror supports ingestion of standard spatial vector formats via CLI. These include:
- Shapefile (.shp)
- GeoJSON (.geojson, .json)
- GPKG (.gpkg)
- ZIP archives containing one or more supported files
Shapefiles are a classic GIS vector format consisting of .shp (geometry), .shx (index), .dbf (attributes), and optionally .prj (projection). All components must be zipped before ingestion.
./ingestion_data/
└── {SITE-NAME}/
    └── ingestion/
        └── spatial/
            └── buildings_data.zip
                ├── buildings.shp
                ├── buildings.shx
                ├── buildings.dbf
                └── buildings.prj
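Because a Shapefile is only usable when its companion files travel with it, it can help to check for them while building the archive. The sketch below is a hypothetical helper (`zip_shapefile` is not part of the platform's CLI) that refuses to create the ZIP when a required component is missing and includes the optional .prj file when it exists.

```python
import zipfile
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")  # geometry, index, attributes
OPTIONAL = (".prj",)                 # projection


def zip_shapefile(shp_path: str, archive_path: str) -> list[str]:
    """Zip a Shapefile with its sibling components; return the member names."""
    shp = Path(shp_path)
    missing = [ext for ext in REQUIRED if not shp.with_suffix(ext).exists()]
    if missing:
        raise FileNotFoundError(
            f"{shp.stem} is missing required components: {', '.join(missing)}"
        )

    members = [
        shp.with_suffix(ext)
        for ext in REQUIRED + OPTIONAL
        if shp.with_suffix(ext).exists()
    ]
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for member in members:
            zf.write(member, arcname=member.name)
    return [member.name for member in members]
```

The extension check is case-sensitive, so files named with upper-case extensions (for example `.SHP`) would need to be renamed or the check adapted.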
Docker command to Ingest Vector Data ZIP.
Ingesting GPKG (GeoPackage) Files
GeoPackage (GPKG) is a modern, open standard format that stores both vector and raster spatial data. It's more efficient and portable than Shapefiles, and it can hold multiple layers (tables) in a single file.
Twinspace Mirror supports ingestion of GPKG files using the same CLI pattern. Each layer inside the GPKG will be handled individually and converted into corresponding spatial tables.
Example Folder Structure
./ingestion_data/
└── {SITE-NAME}/
    └── ingestion/
        └── spatial/
            ├── buildings_data.zip
            └── city_buildings.gpkg
Docker command to Ingest GPKG.
Ingesting Knowledge Graphs (Nodes + Edges)
Twinspace Mirror's infrastructure natively supports graph databases through Apache AGE and provides CLI tooling to ingest nodes and edges from CSV files, building a knowledge graph representation of your environment (e.g., buildings, assets, relationships, flows).
To support multi-site deployments, you can organize data by site name.
Example Folder Structure
./ingestion_data/
└── {SITE-NAME}/
    └── ingestion/
        └── graphs/
            ├── nodes.csv
            └── edges.csv
Replace {SITE-NAME} with your own site name (e.g., citywest_data, navy_data) as needed.
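Before ingesting a graph, it can be worth confirming that every edge points at nodes that actually exist. The sketch below is only an illustration: this guide does not specify the CSV schema, so the column names `id`, `start_id`, and `end_id` are assumptions, and `find_dangling_edges` is a hypothetical helper. Pass your real column names through the keyword arguments.

```python
import csv


def find_dangling_edges(
    nodes_csv: str,
    edges_csv: str,
    node_id_col: str = "id",        # assumed column name
    start_col: str = "start_id",    # assumed column name
    end_col: str = "end_id",        # assumed column name
) -> list[tuple[int, str, str]]:
    """Return (data_row_number, start, end) for edges that reference unknown nodes."""
    with open(nodes_csv, newline="", encoding="utf-8") as f:
        node_ids = {row[node_id_col] for row in csv.DictReader(f)}

    dangling = []
    with open(edges_csv, newline="", encoding="utf-8") as f:
        # Data rows are numbered from 1; the header row is not counted.
        for row_no, row in enumerate(csv.DictReader(f), start=1):
            start, end = row[start_col], row[end_col]
            if start not in node_ids or end not in node_ids:
                dangling.append((row_no, start, end))
    return dangling
```

An empty result means every edge endpoint was found in the nodes file.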
Docker command to Ingest Raster Data.
| Flag | Description |
|---|---|
| -i | Path to the zipped raster files |
| -nln | New Layer Name prefix (ata_); each file gets its own table/layer |
| --overwrite | (Optional) Use if replacing existing layers |
GeoTIFF Requirement
Make sure your raster files are in GeoTIFF format (with embedded geospatial reference) for proper rendering and querying.
Graph Ingestion Command
Once your nodes.csv and edges.csv files are placed inside the appropriate site's graphs/ folder, run the following command:
Docker command to ingest graph data.
| Flag | Description |
|---|---|
| -i | Path to the folder containing your node and edge CSVs |
| -nln | Prefix used when creating new tables (e.g., zip_alb_nodes, zip_alb_edges) |
| --overwrite | Overwrite existing tables if they already exist |
| -kgn | Graph name used internally in the AGE database (e.g., ntrp) |
Ingesting IFC Data (Building Models in CSV Format)
IFC (Industry Foundation Classes) is an open standard for representing building and infrastructure data. It is widely used in Building Information Modeling (BIM) workflows to capture spatial, structural, and semantic aspects of built environments.
While IFC data is typically stored in .ifc or .ifcxml formats, Twinspace Mirror supports direct ingestion of IFC-exported CSV files, enabling lightweight and flexible integration into our spatial database backend.
Folder Structure Convention
Replace {SITE-NAME} with your site name (e.g., yap_data, navy_data, citywest_data).
The folder name House_11 is only an example and can be anything. You may create a separate folder for each building or model (e.g., Building_A, Garage_01). Inside each folder, place the corresponding IFC component CSVs (e.g., Wall.csv, Slab.csv, Door.csv).
Example Folder Structure
./ingestion_data/
└── {SITE-NAME}/
    └── ingestion/
        └── tables/
            └── IFC/
                └── House_11/
                    ├── Header.csv
                    ├── Wall.csv
                    ├── Slab.csv
                    ├── Door.csv
                    └── Window.csv
IFC CSV Ingestion Command
Docker command to Ingest Single IFC CSV.
Ingesting Glossary CSV (Searchable Terms & Acronyms)
The glossary is a special type of CSV that powers a dedicated frontend UI for exploring domain-specific terms and acronyms.
Once ingested, users can:
- Search by term or acronym
- View full definitions
- Link glossary terms with spatial or tabular data
Example Folder Structure
./ingestion_data/
└── {SITE-NAME}/
    └── ingestion/
        └── tables/
            └── Glossary/
                └── Glossary_NTRP_Acronyms_Planning.csv
Glossary CSV Ingestion Command
Docker command to Ingest Glossary Data.
Ingesting Vanilla CSVs (Tabular Data)
Twinspace Mirror supports ingestion of general-purpose tabular CSV files. These may include datasets such as project plans, task assignments, schedules, or any structured flat file.
Example Folder Structure
./ingestion_data/
└── {SITE-NAME}/
    └── ingestion/
        └── tables/
            └── Non_Spatial_CSV/
                ├── Project.csv
                └── Task.csv
Folder Structure Convention
Replace {SITE-NAME} with your actual site name (e.g., yap_data, navy_data, citywest_data).
You may organize multiple CSVs under Non_Spatial_CSV/, such as Project.csv, Task.csv, and Budget.csv.
Each file will be mapped to its own table in the database.
Vanilla CSV Ingestion Command
If you have multiple CSV files inside a folder (e.g., Project.csv, Task.csv, Budget.csv, etc.), you can ingest them all at once using the following command:
Docker command to Ingest CSV folder.
Table Name Argument
The second argument (projects) is the database table name that will be created or updated.
Use --overwrite
This will replace any existing table with the same name.
CSV Format
Ensure your CSV has a header row and a valid delimiter (usually a comma). Invalid files may cause ingestion to fail.
Important Notes for CLI Ingestion
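A quick pre-flight check can catch the most common formatting problems before ingestion. The following is a minimal sketch using Python's standard `csv` module; `check_csv` is a hypothetical helper, and the specific checks (non-empty, unique header names and a consistent field count per row) are assumptions about what makes a file well formed, not the CLI's own validation rules.

```python
import csv


def check_csv(path: str, delimiter: str = ",") -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return ["file is empty"]

        names = [name.strip() for name in header]
        if any(not name for name in names):
            problems.append("header row contains an empty column name")
        if len(set(names)) != len(names):
            problems.append("header row contains duplicate column names")

        # Every data row should have as many fields as the header.
        for row_no, row in enumerate(reader, start=1):
            if len(row) != len(header):
                problems.append(
                    f"data row {row_no} has {len(row)} fields, expected {len(header)}"
                )
    return problems
```

A file parsed with the wrong delimiter typically shows up here as a single-column header or as mismatched field counts.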
Overwrite vs Append
If the target table already exists, use the --overwrite flag to drop and recreate it.
If you omit --overwrite, the new data will be appended to the existing table.
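The difference between the two modes can be illustrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` module purely as a stand-in for the platform's database, and `load_rows` is a hypothetical helper, not the CLI's implementation.

```python
import sqlite3


def load_rows(conn: sqlite3.Connection, table: str, rows: list[str],
              overwrite: bool = False) -> int:
    """Load rows into a one-column table and return the resulting row count.

    Illustration only: the table name is interpolated directly, so it must
    come from a trusted source.
    """
    if overwrite:
        # Overwrite mode: drop the existing table so it is recreated from scratch.
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    # Append mode keeps the existing table (and its rows) if it is already there.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (name TEXT)')
    conn.executemany(f'INSERT INTO "{table}" (name) VALUES (?)',
                     [(row,) for row in rows])
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
```

Loading two rows, then one more without `overwrite`, leaves three rows in the table; repeating the load with `overwrite=True` leaves only the newly loaded rows.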
Column Data Types
During CSV ingestion, the CLI will prompt you to choose a data type for each column.
- You will see a numbered list (e.g., 1: text, 2: bigint, etc.)
- If your desired data type is not listed, enter 0 to manually type it in (e.g., numeric, uuid, etc.)
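To answer the type prompts quickly, it can help to scan the file beforehand. The sketch below is a hypothetical helper, not part of the CLI: `suggest_types` proposes `bigint`, `numeric`, or `text` for each column based on simple pattern checks. It does not verify that integers fit within the bigint range, and it does not detect other types such as dates or UUIDs.

```python
import csv
import re

INT_RE = re.compile(r"[+-]?\d+")
NUM_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)")


def suggest_types(path: str) -> dict[str, str]:
    """Suggest a column type per header name from the non-empty values."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = {name: [] for name in (reader.fieldnames or [])}
        for row in reader:
            for name in columns:
                value = (row.get(name) or "").strip()
                if value:  # ignore empty cells when inferring
                    columns[name].append(value)

    suggestions = {}
    for name, values in columns.items():
        if values and all(INT_RE.fullmatch(v) for v in values):
            suggestions[name] = "bigint"
        elif values and all(NUM_RE.fullmatch(v) for v in values):
            suggestions[name] = "numeric"
        else:
            # Mixed, non-numeric, or entirely empty columns fall back to text.
            suggestions[name] = "text"
    return suggestions
```

Treat the result as a starting point; identifier-like columns with leading zeros, for instance, are usually better stored as text even though they look numeric.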
EPSG Code for Geometry
You will be prompted to enter an EPSG code for geometry columns.
- Default is 4326 (WGS 84)
- Enter another code only if your spatial data uses a different CRS
Geometry Type Selection
You will be asked to choose a geometry type:
- 1: Point
- 2: Line
- 3: Polygon
Choose the one that matches your spatial data's structure.
Get Help
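If you are unsure which option applies, you can inspect the data first. The sketch below handles the GeoJSON case with the standard `json` module. It is a hypothetical helper, and mapping the Multi* geometry types onto the same three prompt options is an assumption rather than documented CLI behavior.

```python
import json
from collections import Counter

# Assumed mapping from GeoJSON geometry types to the prompt options above.
PROMPT_CHOICE = {
    "Point": 1, "MultiPoint": 1,
    "LineString": 2, "MultiLineString": 2,
    "Polygon": 3, "MultiPolygon": 3,
}


def geometry_types(path: str) -> Counter:
    """Count the geometry types found in a GeoJSON file."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    if data.get("type") == "FeatureCollection":
        features = data.get("features", [])
    elif data.get("type") == "Feature":
        features = [data]
    else:
        features = [{"geometry": data}]  # a bare geometry object

    counts = Counter()
    for feature in features:
        geometry = feature.get("geometry")
        if geometry:  # features may carry a null geometry
            counts[geometry.get("type")] += 1
    return counts


def suggested_choice(path: str):
    """Return 1, 2, or 3 when all geometries agree, otherwise None."""
    choices = {PROMPT_CHOICE.get(kind) for kind in geometry_types(path)}
    if len(choices) == 1 and None not in choices:
        return choices.pop()
    return None
```

A `None` result means the file is empty, mixes geometry families, or contains a type outside the mapping, in which case the data may need to be split before ingestion.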
To see available CLI options for any script, you can run:
Filebrowser (Web-Based File Manager)
Filebrowser is a lightweight web-based file management interface that allows you to browse, upload, delete, and manage files inside the project folders, all from a clean browser UI.
Purpose
To easily navigate and manage project assets, including:
- Geospatial files (e.g., Shapefiles, GeoTIFFs) stored in ingestion_data
- X3D assets, PDFs, and screenshots stored in webapp/media
This is especially helpful for debugging, manual uploads, or reviewing asset structure.
Setup & Configuration
- Docker Service Name: citymap-filebrowser
- Host: localhost
- Port: 8083
- Base URL: /files
The service is reachable on the host and port listed above, under the /files base URL.
Connected to the shared citymap-network.
Default Credentials
- Username: admin
- Password: admin
It is strongly recommended to change the default credentials in a production environment.
Mounted Directories
| Host Path | Container Path | Description |
|---|---|---|
| ./../ingestion_data | /srv/ingestion_data | Stores uploaded geospatial and asset files |
| ./../webapp/media | /srv/media | Stores frontend X3D assets, screenshots, PDFs |
Notes on Security
- Filebrowser is not password protected beyond the default credentials. For production, configure secure credentials or disable the service entirely in public-facing deployments.
- You can manage users and roles from the Filebrowser UI or a .filebrowser.json config.
Common Use Cases
- Uploading new data assets
- Browsing ingested files without needing shell access
- Debugging issues with missing or malformed files
- Quick visual review of generated files (e.g., PDF reports, 3D models)