This document describes the prerequisites, best practices, and common errors when working with datasets.
Prerequisites
When creating a dataset:
- Display names must be unique within your Google Cloud project.
- Display names must be less than 64 bytes. Because display names are encoded in UTF-8, a single character can take multiple bytes in some languages.
- Descriptions must be less than 1000 bytes.
When uploading data:
- The supported file types are CSV, GeoJSON, and KML.
- The maximum supported file size is 500 MB.
- Attribute column names cannot begin with the string "?_".
- Three-dimensional geometries are not supported. This includes the "Z" suffix in the WKT format, and the altitude coordinate in the GeoJSON format.
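The limits above can be checked locally before an upload. The following is a minimal sketch, not part of the Maps Datasets API: the function name check_prerequisites is hypothetical, mapping file types to file extensions is an assumption, and the sketch assumes 1 MB means 1024 * 1024 bytes, which the limits above do not specify.

```python
import os

MAX_DISPLAY_NAME_BYTES = 64          # display names must be less than 64 bytes
MAX_DESCRIPTION_BYTES = 1000         # descriptions must be less than 1000 bytes
MAX_FILE_BYTES = 500 * 1024 * 1024   # assumption: 1 MB = 1024 * 1024 bytes
SUPPORTED_EXTENSIONS = {".csv", ".geojson", ".kml"}  # assumption: type inferred from extension


def check_prerequisites(display_name, description, file_name, file_size_bytes,
                        attribute_names=()):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    # Limits are in bytes, so measure the UTF-8 encoding, not the character count.
    if len(display_name.encode("utf-8")) >= MAX_DISPLAY_NAME_BYTES:
        problems.append("display name must be less than 64 bytes in UTF-8")
    if len(description.encode("utf-8")) >= MAX_DESCRIPTION_BYTES:
        problems.append("description must be less than 1000 bytes in UTF-8")
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported file type: " + (extension or "(none)"))
    if file_size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 500 MB")
    for name in attribute_names:
        if name.startswith("?_"):
            problems.append('attribute column name begins with "?_": ' + name)
    return problems


# 22 CJK characters are 22 characters but 66 bytes in UTF-8, so the name is too long.
print(check_prerequisites("公" * 22, "City parks", "parks.csv", 1024))
```

Uniqueness of the display name within a project cannot be checked locally, so it is left out of this sketch.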
Data preparation best practices
If your source data is complex or large, such as dense points, long linestrings, or polygons (source files larger than 50 MB often fall into this category), consider simplifying your data before uploading to achieve the best performance in a visual map.
Here are some best practices for preparing your data:
- Minimize feature properties. Only keep feature properties needed to style your map, for example "id" and "category". You can join additional properties to a feature in a client application using data-driven styles on a unique identifier key. For an example, see "See your data in real time with Data-driven styling".
- Use simple data types for property objects where possible, such as integers, to minimize tile size and improve map performance.
- Simplify complex geometries prior to uploading a file. You can do this in a geospatial tool of your choice, such as the open source Mapshaper.org utility, or in BigQuery using ST_Simplify on complex polygon geometries.
- Cluster very dense points prior to uploading a file. You can do this in a geospatial tool of your choice, such as the open source turf.js cluster functions, or in BigQuery using ST_CLUSTERDBSCAN on dense point geometries.
See additional guidance about datasets best practices in Visualize your data with Datasets and BigQuery.
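As an illustration of the first practice, minimizing feature properties, the sketch below trims a GeoJSON feature collection down to the properties needed for styling. It uses only the Python standard library; the function name keep_properties and the sample property names are hypothetical.

```python
import json


def keep_properties(feature_collection, keep):
    """Return a copy of a FeatureCollection whose features keep only the named properties."""
    slimmed = []
    for feature in feature_collection.get("features", []):
        properties = feature.get("properties") or {}
        slim = dict(feature)  # shallow copy, so the input feature is not modified
        slim["properties"] = {k: v for k, v in properties.items() if k in keep}
        slimmed.append(slim)
    result = dict(feature_collection)
    result["features"] = slimmed
    return result


source = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [-122.08, 37.42]},
        "properties": {"id": 1, "category": "park", "notes": "long free text"},
    }],
}

slim = keep_properties(source, keep={"id", "category"})
print(json.dumps(slim["features"][0]["properties"]))
```

The dropped properties can stay in your own data store and be joined back on "id" in the client application.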
GeoJSON requirements
Maps Datasets API supports the current GeoJSON specification. Maps Datasets API also supports GeoJSON files that contain any of the following object types:
- Geometry objects. A geometry object is a spatial shape, described as a union of points, lines, and polygons with optional holes.
- Feature objects. A feature object contains a geometry plus additional name/value pairs, whose meaning is application-specific.
- Feature collections. A feature collection is a set of feature objects.
Maps Datasets API does not support GeoJSON files that have data in a coordinate reference system (CRS) other than WGS84.
For more information on GeoJSON, see RFC 7946.
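Two of these constraints can be screened for before uploading: altitude coordinates, which are not supported, and a coordinate reference system other than WGS84. The sketch below is a heuristic under stated assumptions, not the check the API performs. It treats any position with more than two numbers as carrying altitude, and it treats the presence of a "crs" member (used by the older 2008 GeoJSON format, removed in RFC 7946) as a hint that the CRS needs confirming. All function names are hypothetical.

```python
def iter_geometries(obj):
    """Yield every geometry object nested in a GeoJSON object."""
    kind = obj.get("type")
    if kind == "FeatureCollection":
        for feature in obj.get("features", []):
            yield from iter_geometries(feature)
    elif kind == "Feature":
        if isinstance(obj.get("geometry"), dict):
            yield from iter_geometries(obj["geometry"])
    elif kind == "GeometryCollection":
        for geometry in obj.get("geometries", []):
            yield from iter_geometries(geometry)
    else:
        yield obj


def has_altitude(coords):
    """Return True if any position in a coordinates array has more than two numbers."""
    if not isinstance(coords, list) or not coords:
        return False
    if isinstance(coords[0], (int, float)):
        return len(coords) > 2
    return any(has_altitude(item) for item in coords)


def geojson_warnings(doc):
    """Return warnings for a parsed GeoJSON document (a dict)."""
    warnings = []
    if "crs" in doc:
        warnings.append('file has a "crs" member; confirm the data is WGS84')
    if any(has_altitude(g.get("coordinates")) for g in iter_geometries(doc)):
        warnings.append("at least one position has an altitude coordinate")
    return warnings


point_3d = {"type": "Feature", "properties": {},
            "geometry": {"type": "Point", "coordinates": [-122.08, 37.42, 10.0]}}
print(geojson_warnings(point_3d))
```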
KML requirements
Maps Datasets API has the following requirements:
- All URLs must be local (or relative) to the file itself.
- Point, line, and polygon geometries are supported.
- All data attributes are considered strings.
The following KML features are not supported:
- Icons or <styleUrl> defined outside of the file.
- Network links, such as <NetworkLink>.
- Ground overlays, such as <GroundOverlay>.
- 3D geometries or any altitude-related tags, such as <altitudeMode>.
- Camera specifications, such as <LookAt>.
- Styles defined inside the KML file.
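A KML file can be scanned for the tags named in this section before uploading. The sketch below uses Python's standard xml.etree.ElementTree module and only looks for the four tags named above, so it is not an exhaustive validator. The function name find_unsupported_tags is hypothetical.

```python
import xml.etree.ElementTree as ET

# Only the tags named explicitly in this section; not an exhaustive list.
UNSUPPORTED_TAGS = {"NetworkLink", "GroundOverlay", "altitudeMode", "LookAt"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def find_unsupported_tags(kml_text):
    """Return the sorted names of the listed tags that occur in the KML text."""
    root = ET.fromstring(kml_text)
    found = {local_name(el.tag) for el in root.iter()}
    return sorted(found & UNSUPPORTED_TAGS)


sample = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <LookAt><longitude>-122.08</longitude><latitude>37.42</latitude></LookAt>
    <Placemark>
      <name>Example</name>
      <Point>
        <altitudeMode>absolute</altitudeMode>
        <coordinates>-122.08,37.42,30</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>"""

print(find_unsupported_tags(sample))  # ['LookAt', 'altitudeMode']
```

Externally defined icons or styles and altitude values inside <coordinates> are not detected by this sketch.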
CSV requirements
For CSV files, the supported column names are listed below in order of priority:
- latitude, longitude
- lat, long
- x, y
- wkt (Well-Known Text)
- address, city, state, zip
- address: a single column containing all address information, such as 1600 Amphitheatre Parkway Mountain View, CA 94043
For example, your file contains columns named x, y, and wkt. Because x and y have a higher priority, as determined by the order of supported column names in the list above, the values in the x and y columns are used and the wkt column is ignored.
In addition:
- Each column name must belong to a single column. That is, you cannot have a column named xy that contains both x and y coordinate data. The x and y coordinates must be in separate columns.
- Column names are case-insensitive.
- The order of the column names does not matter. For example, if your CSV file contains lat and long columns, they can occur in any order.
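The priority rule can be sketched as a lookup over the CSV header. This illustrates the documented behavior only and is not the API's implementation. The function name pick_geometry_columns is hypothetical, and the sketch assumes that a column set is selected only when all of its columns are present.

```python
# Column sets in the documented order of priority, highest first.
GEOMETRY_COLUMN_SETS = [
    ("latitude", "longitude"),
    ("lat", "long"),
    ("x", "y"),
    ("wkt",),
    ("address", "city", "state", "zip"),
    ("address",),
]


def pick_geometry_columns(header):
    """Return the highest-priority column set fully present in the header, or None."""
    # Column names are case-insensitive and their order does not matter.
    present = {name.strip().lower() for name in header}
    for column_set in GEOMETRY_COLUMN_SETS:
        if all(name in present for name in column_set):
            return column_set
    return None


# x and y outrank wkt, so the wkt column would be ignored.
print(pick_geometry_columns(["WKT", "Y", "name", "X"]))  # ('x', 'y')
```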
Handle data upload errors
When uploading data to a dataset, you might experience one of the common errors described in this section.
GeoJSON errors
Common GeoJSON errors include:
- Missing type field, or the type is not a string. The uploaded GeoJSON data file must contain a string field named type as part of each Feature object and Geometry object definition.
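To locate the objects that trigger this error, you can walk the parsed file and report every Feature or Geometry object whose type is missing or is not a string. The sketch below is a local pre-check under that reading of the rule; the function name find_missing_types and the path notation are hypothetical.

```python
def find_missing_types(obj, path="$"):
    """Return paths of GeoJSON objects whose "type" is missing or is not a string."""
    if not isinstance(obj, dict):
        return [path]  # a non-object where a GeoJSON object is expected
    problems = []
    if not isinstance(obj.get("type"), str):
        problems.append(path)
    # Recurse into the members that can hold nested GeoJSON objects.
    for i, feature in enumerate(obj.get("features") or []):
        problems += find_missing_types(feature, f"{path}.features[{i}]")
    if isinstance(obj.get("geometry"), dict):
        problems += find_missing_types(obj["geometry"], f"{path}.geometry")
    for i, geometry in enumerate(obj.get("geometries") or []):
        problems += find_missing_types(geometry, f"{path}.geometries[{i}]")
    return problems


doc = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"coordinates": [0, 0]}},            # geometry has no type
        {"type": 7, "properties": {},
         "geometry": {"type": "Point", "coordinates": [1, 1]}},  # type is not a string
    ],
}

print(find_missing_types(doc))  # ['$.features[0].geometry', '$.features[1]']
```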
KML errors
Common KML errors include:
- The data file contains unsupported KML features. The data file must not contain any of the unsupported KML features listed above; otherwise, the data import might fail.
CSV errors
Common CSV errors include:
- Some rows are missing values for a geometry column. All rows in a CSV file must contain non-empty values for the geometry columns. The geometry columns include:
  - latitude, longitude
  - lat, long
  - x, y
  - wkt
  - address, city, state, zip
  - address: a single column containing all address information, such as 1600 Amphitheatre Parkway Mountain View, CA 94043
- If x and y are your geometry columns, ensure that the units are longitude and latitude. Some public datasets use different coordinate systems under the headers x and y. If the wrong units are used, the dataset might import successfully, but the rendered data can show the dataset points in unexpected locations.
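Both CSV errors can be caught with a local pass over the file. The sketch below uses Python's standard csv module. It flags rows with empty geometry values and, when the geometry columns are x and y, rows whose values fall outside the valid longitude and latitude ranges. The range test is a heuristic: projected coordinates that happen to fall inside those ranges are not detected. The function name check_rows is hypothetical, and the row numbers assume a header on row 1 and no multi-line fields.

```python
import csv
import io


def check_rows(csv_text, geometry_columns):
    """Return (row_number, message) pairs for rows with geometry problems."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Column names are case-insensitive: map lowercase name -> name as written.
    lookup = {name.strip().lower(): name for name in reader.fieldnames or []}
    problems = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        values = {col: (row.get(lookup.get(col, "")) or "").strip()
                  for col in geometry_columns}
        empty = [col for col, value in values.items() if not value]
        if empty:
            problems.append((row_number, "empty: " + ", ".join(empty)))
            continue
        if set(geometry_columns) == {"x", "y"}:
            try:
                x, y = float(values["x"]), float(values["y"])
            except ValueError:
                problems.append((row_number, "x or y is not a number"))
                continue
            # x is expected to be longitude and y latitude, both in degrees.
            if not (-180 <= x <= 180 and -90 <= y <= 90):
                problems.append((row_number, "x, y outside longitude, latitude range"))
    return problems


sample = "name,X,Y\nA,-122.08,37.42\nB,,37.0\nC,551000,4180000\n"
print(check_rows(sample, ("x", "y")))
```

In the sample, row 3 has an empty x value and row 4 holds values that look like projected coordinates in meters rather than degrees.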